BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
cdunlea
Fluorite | Level 6

Hi all, I have the following two datasets I would like to merge.

ID_1CODE
1005911
1005936
100777
1007743
1023329
1023331
1043732

 

ID_2CODES
1005938
1005941
1007721
1007723
1007725
1023329
1043732

 

When merging currently using the following code:

DATA Merged;
merge wave1 wave2 ; by id;
run;

It comes out looking like this:

 idcodecodes
1100591138
2100593641
310077721
4100774325
5100774323
6102332929

 

Whereas this is what I would like it to look like. if the two "code" columns do not match I would like them to be in separate rows. Similar to stacking the data. However if they are MATCHING, I would like them to be merged in the same row (highlighted in red).

 idcodecodes
11005911.
210059.38
31005936.
410059.41
5100777.
610077.21
71007743.
810077.25
91007743.
1010077.23
11102332929
121023331.
1310233.29
14104373232

 

Any ideas would be appreciated!

1 ACCEPTED SOLUTION

Accepted Solutions
Shmuel
Garnet | Level 18

Add next steps to previous code:

data  id2del;
   set want;  /*output of previous step */
        if code=codes;
        keep ID;
run;
proc sql; 
      create table want_new
      as select * from want
      where code ne codes and
        id in (select ID from id2del);
quit;

View solution in original post

10 REPLIES 10
Shmuel
Garnet | Level 18

Try next code:

data both / view=both;
  set table1 table2;
run;
proc sort data=both out=want nodupkey;
  by id code codes;
run;
cdunlea
Fluorite | Level 6

Thanks for the reply, but this code results in the matching observations still being in separate rows. 

The output looks good for those that do not match but it does not merge observations with the same value in each column (seen in red in original post). 

 

Thanks again!

Shmuel
Garnet | Level 18

Then you need add some steps to your code:

DATA Merged;
merge wave1 wave2 ; 
by id; run; data want; set merged; retain _code; drop _code; if code=codes then output; else do; _code = codes; _codes = .; output; codes = _code; code=.; output; end; run;
cdunlea
Fluorite | Level 6

Thanks! That almost works! 

The only thing is that I still want this "38" in the first row to be coded as missing and only found in row 2. (Or the "41 in row 3 to be coded as "." and only found in row 4.) How can I specify this? 

 idcodecodes
110059138
210059.38
3100591041
410059.41

 

Colleen

Shmuel
Garnet | Level 18

There was a typo in my code. Code tested should be:

DATA Merged;
merge wave1 wave2 ;   by id;
run;
data want;
 set merged;
     retain _code; drop _code;
     if code=codes then output; 
     else do;
           _code = codes;
           codes = .;
           output;
           codes = _code;
           code=.;
           output;
     end;
run;
cdunlea
Fluorite | Level 6

This one almost works. However, it seems to be only working about half the time. It kept row 12, which should have been deleted. But in row 14, it did it perfectly. Perhaps it is the length of the observations in the "code" columns?

 idcodecodes
1210233.242930
1310233242930242930
141043732343234
cdunlea
Fluorite | Level 6

If the two "code" columns do not match I would like them to be in separate rows. This works.

 

If they are MATCHING codes, I would like them to be merged in the same row and the other row to be deleted (highlighted in red) so that it appears like row 14 here, just once. 

 idcodecodes
1210233.242930
1310233242930242930
141043732343234
Shmuel
Garnet | Level 18

Add next steps to previous code:

data  id2del;
   set want;  /*output of previous step */
        if code=codes;
        keep ID;
run;
proc sql; 
      create table want_new
      as select * from want
      where code ne codes and
        id in (select ID from id2del);
quit;
Ksharp
Super User
data a;
infile cards expandtabs;
input ID	CODE;
cards;
10059	11
10059	36
10077	7
10077	43
10233	29
10233	31
10437	32
;
 
data b;
infile cards expandtabs;
input ID	CODES;
cards;
10059	38
10059	41
10077	21
10077	23
10077	25
10233	29
10437	32
;

data temp;
 merge a b;
 by id ;
run;
data want;
 set temp;
 if code ne codes then do;
  temp=codes;
  call missing(codes);
  output;
  codes=temp;
  call missing(code);
  output;
 end;
 else output;
 drop temp;
run;

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 10 replies
  • 911 views
  • 0 likes
  • 3 in conversation