I would like some help understanding how merge works/how to adequately apply it to my data. Please see my code below and the desired results I am trying to achieve. Thank you.
data set1;
input var1 var2 var3$;
datalines;
1 10 a
4 14 b
5 15 c
run;
data set2;
input var4 var5 var3$;
datalines;
2 15 a
3 45 a
4 52 a
3 25 b
1 10 b
3 17 b
4 20 c
5 30 c
4 11 c
run;
data set3;
merge set1 set2;
by var3;
run;
/* Actual results*/
1 10 a 2 15
1 10 a 3 45
1 10 a 4 52
4 14 b 3 25
4 14 b 1 10
4 14 b 3 17
5 15 c 4 20
5 15 c 5 30
5 15 c 4 11
/*desired results*/
1 10 a 2 15
. . a 3 45
. . a 4 52
4 14 b 3 25
. . b 1 10
. . b 3 17
5 15 c 4 20
. . c 5 30
. . c 4 11
You could use something like:
data set1;
input var0$ var1 var2 var3$;
datalines;
m 1 10 a
t 12 5 a
w 3 6 a
m 4 14 b
t 7 10 b
w 4 5 b
m 5 15 c
t 6 16 c
w 7 6 c
;
data set2;
input var4 var5 var3$;
datalines;
2 15 a
3 45 a
4 52 a
10 11 a
94 10 a
15 19 a
3 25 b
1 10 b
3 17 b
15 18 b
11 12 b
80 70 b
4 20 c
5 30 c
4 11 c
18 17 c
20 20 c
50 15 c
;
data set1;
set set1;
by var3;
if first.var3 then counter=1;
else counter+1;
run;
data set2;
set set2;
by var3;
if first.var3 then counter=1;
else counter+1;
run;
data want (drop=counter);
merge set1 set2;
by var3 counter;
run;
Please try the below code;
data set3;
merge set1 set2;
by var3;
if first.var3=0 then do;
var1=.;
var2=.;
end;
run;
Thanks,
Jag
The solution looks fine. A few notes ...
You may find this paper useful: http://www.ats.ucla.edu/stat/sas/library/nesug99/ad155.pdf
Also you may not need to blank out the variables for printing purposes. Even PROC PRINT can produce a report without repeating VAR1 and VAR2. For example, you can experiment with this:
proc print data=set3;
by var3 var1 var2;
id var3 var1 var2;
var var4 var5;
run;
Good luck.
my final code is using proc report, I was able to use the id on most of my variables but one of my variables keeps grouping as it doesn't have unique values.
Thank you! This worked except I am more concerned about the last variable continually repeating itself.so a better example would be.
data set1;
input var0$ var1 var2 var3$;
datalines;
m 1 10 a
t 12 5 a
w 3 6 a
m 4 14 b
t 7 10 b
w 4 5 b
m 5 15 c
t 6 16 c
w 7 6 c
run;
data set2;
input var4 var5 var3$;
datalines;
2 15 a
3 45 a
4 52 a
10 11 a
94 10 a
15 19 a
3 25 b
1 10 b
3 17 b
15 18 b
11 12 b
80 70 b
4 20 c
5 30 c
4 11 c
18 17 c
20 20 c
50 15 c
run;
with the desire results being...
m | 1 | 10 | a | 2 | 15 |
t | 12 | 5 | a | 3 | 45 |
w | 3 | 6 | a | 4 | 52 |
. | . | a | 10 | 11 | |
. | a | 94 | 10 | ||
. | . | a | 15 | 19 | |
m | 4 | 14 | b | 3 | 25 |
t | 7 | 10 | b | 1 | 10 |
w | 4 | 5 | b | 3 | 17 |
. | . | b | 15 | 18 | |
. | . | b | 11 | 12 | |
. | . | b | 80 | 70 | |
m | 5 | 15 | c | 4 | 20 |
t | 6 | 16 | c | 5 | 30 |
w | 7 | 6 | c | 4 | 11 |
. | . | c | 18 | 17 | |
. | . | c | 20 | 20 | |
. | . | c | 50 | 15 |
Thank you for your help
You could use something like:
data set1;
input var0$ var1 var2 var3$;
datalines;
m 1 10 a
t 12 5 a
w 3 6 a
m 4 14 b
t 7 10 b
w 4 5 b
m 5 15 c
t 6 16 c
w 7 6 c
;
data set2;
input var4 var5 var3$;
datalines;
2 15 a
3 45 a
4 52 a
10 11 a
94 10 a
15 19 a
3 25 b
1 10 b
3 17 b
15 18 b
11 12 b
80 70 b
4 20 c
5 30 c
4 11 c
18 17 c
20 20 c
50 15 c
;
data set1;
set set1;
by var3;
if first.var3 then counter=1;
else counter+1;
run;
data set2;
set set2;
by var3;
if first.var3 then counter=1;
else counter+1;
run;
data want (drop=counter);
merge set1 set2;
by var3 counter;
run;
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.