BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
braverju
Obsidian | Level 7
Thank everybody for the previous help. 
I have one more question. 
 

I am trying to merge two datasets: 

 
data dataSet1;
 type1 something1
 type1 something2
 type1 something3 
 type2 something1
 type2 something2
 type2 something3 
 type3 something1
 type3 something2
 type3 something3 
 
data types;
  input myText textType;
  datalines;
   a  type1 
   b  type2 
   c type3
run;
 
I would like to merge these two to get
 
 type1 something1 a
 type1 something2 a
 type1 something3 a 
 type2 something1 b
 type2 something2 b
 type2 something3 b 
 type3 something1 c
 type3 something2 c
 type3 something3 c
 
After this I can get rid of Type variable completely. So, basically, I would like to replace type1 by the corresponding 'a', type2 by the corresponding 'b' etc.
 
The simple code: 
 
 data dataset1;
  merge dataset1 types;
  BY textType;
RUN;  
 
creates a strange result: 
 
 type1 something1 a
 type1 something2 
 type1 something3  
 type2 something1 b
 type2 something2 
 type2 something3  
 type3 something1 c
 type3 something2 
 type3 something3 
 
(so, textType appears only  first time and all the rest are missing values. What is wrong here? What is the most efficient way to do it?  
 
Thank you very much
 
Julia
1 ACCEPTED SOLUTION

Accepted Solutions
Astounding
PROC Star

It's a combination of two factors.  First, it's likely you tested and re-tested, trying to get this to work.  And second, you are reusing the name DATASET1 as both the input and the output.

 

All of that is legal.  However, it likely created the situation where DATASET1 already contains MYTEXT (possibly with all missing values at some point along the way).  Get rid of it.

 

data dataset1;
  merge dataset1 (drop=MYTEXT) types;
  BY textType;
RUN;  
 
That should fix the problem.

View solution in original post

6 REPLIES 6
ballardw
Super User

Order of sets on the MERGE statement makes a difference:

data combined;
   merge types dataset1;
   by texttype;
run;

But Many to many is not going to work as desired many types.

 

 

What you are doing is a JOIN. Merge is doing exactly what the documentation says it will.

Another way that is more flexible is something like this. Also the JOIN using SQL does not require sorting the data first.

proc sql;
   create table combined as 
   select dataset1.*, types.mytext
   from dataset1 left join types
      on dataset1.texttype = types.texttype;
quit;

BTW you dataset code for TYPES has an error as MYTEXT is considered numeric as you haven't defined it as character and will generate missing valures for MYTEXT and error messages of invalid data.

 

braverju
Obsidian | Level 7
Thank you for the explanation! Got it!
DanZ
Obsidian | Level 7
 data dataset1;
  merge dataset1 types;
  BY textType something;
RUN;  

You need to specify the key of the somethingX field.

mohamed_zaki
Barite | Level 11

Why!!??

mohamed_zaki
Barite | Level 11

Your code is fine and should work...

 

data dataSet1;
input textType $ 1-5  textSome $ 7-17 ;
cards;
type1 something1
type1 something2
type1 something3 
type2 something1
type2 something2
type2 something3 
type3 something1
type3 something2
type3 something3 
 ;run;
 
data types;
  input myText $ textType $3-7;
  datalines;
a type1 
b type2 
c type3
   ;
run;
proc sort data=dataset1; by textType; run;
proc sort data=types; by textType; run;

data dataset1;
  merge dataset1 types;
  BY textType;
  drop textType;
RUN;  

Try run  the code above, you did not show your full code.

Astounding
PROC Star

It's a combination of two factors.  First, it's likely you tested and re-tested, trying to get this to work.  And second, you are reusing the name DATASET1 as both the input and the output.

 

All of that is legal.  However, it likely created the situation where DATASET1 already contains MYTEXT (possibly with all missing values at some point along the way).  Get rid of it.

 

data dataset1;
  merge dataset1 (drop=MYTEXT) types;
  BY textType;
RUN;  
 
That should fix the problem.

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 6 replies
  • 2830 views
  • 1 like
  • 5 in conversation