I would like some help understanding how merge works/how to adequately apply it to my data. Please see my code below and the desired results I am trying to achieve. Thank you.
data set1;
input var1 var2 var3$;
datalines;
1 10 a
4 14 b
5 15 c
run;
data set2;
input var4 var5 var3$;
datalines;
2 15 a
3 45 a
4 52 a
3 25 b
1 10 b
3 17 b
4 20 c
5 30 c
4 11 c
run;
data set3;
merge set1 set2;
by var3;
run;
/* Actual results*/
1 10 a 2 15
1 10 a 3 45
1 10 a 4 52
4 14 b 3 25
4 14 b 1 10
4 14 b 3 17
5 15 c 4 20
5 15 c 5 30
5 15 c 4 11
/*desired results*/
1 10 a 2 15
. . a 3 45
. . a 4 52
4 14 b 3 25
. . b 1 10
. . b 3 17
5 15 c 4 20
. . c 5 30
. . c 4 11
You could use something like:
data set1;
input var0$ var1 var2 var3$;
datalines;
m 1 10 a
t 12 5 a
w 3 6 a
m 4 14 b
t 7 10 b
w 4 5 b
m 5 15 c
t 6 16 c
w 7 6 c
;
data set2;
input var4 var5 var3$;
datalines;
2 15 a
3 45 a
4 52 a
10 11 a
94 10 a
15 19 a
3 25 b
1 10 b
3 17 b
15 18 b
11 12 b
80 70 b
4 20 c
5 30 c
4 11 c
18 17 c
20 20 c
50 15 c
;
data set1;
set set1;
by var3;
if first.var3 then counter=1;
else counter+1;
run;
data set2;
set set2;
by var3;
if first.var3 then counter=1;
else counter+1;
run;
data want (drop=counter);
merge set1 set2;
by var3 counter;
run;
Please try the below code;
data set3;
merge set1 set2;
by var3;
if first.var3=0 then do;
var1=.;
var2=.;
end;
run;
Thanks,
Jag
The solution looks fine. A few notes ...
You may find this paper useful: http://www.ats.ucla.edu/stat/sas/library/nesug99/ad155.pdf
Also you may not need to blank out the variables for printing purposes. Even PROC PRINT can produce a report without repeating VAR1 and VAR2. For example, you can experiment with this:
proc print data=set3;
by var3 var1 var2;
id var3 var1 var2;
var var4 var5;
run;
Good luck.
my final code is using proc report, I was able to use the id on most of my variables but one of my variables keeps grouping as it doesn't have unique values.
Thank you! This worked except I am more concerned about the last variable continually repeating itself.so a better example would be.
data set1;
input var0$ var1 var2 var3$;
datalines;
m 1 10 a
t 12 5 a
w 3 6 a
m 4 14 b
t 7 10 b
w 4 5 b
m 5 15 c
t 6 16 c
w 7 6 c
run;
data set2;
input var4 var5 var3$;
datalines;
2 15 a
3 45 a
4 52 a
10 11 a
94 10 a
15 19 a
3 25 b
1 10 b
3 17 b
15 18 b
11 12 b
80 70 b
4 20 c
5 30 c
4 11 c
18 17 c
20 20 c
50 15 c
run;
with the desire results being...
m | 1 | 10 | a | 2 | 15 |
t | 12 | 5 | a | 3 | 45 |
w | 3 | 6 | a | 4 | 52 |
. | . | a | 10 | 11 | |
. | a | 94 | 10 | ||
. | . | a | 15 | 19 | |
m | 4 | 14 | b | 3 | 25 |
t | 7 | 10 | b | 1 | 10 |
w | 4 | 5 | b | 3 | 17 |
. | . | b | 15 | 18 | |
. | . | b | 11 | 12 | |
. | . | b | 80 | 70 | |
m | 5 | 15 | c | 4 | 20 |
t | 6 | 16 | c | 5 | 30 |
w | 7 | 6 | c | 4 | 11 |
. | . | c | 18 | 17 | |
. | . | c | 20 | 20 | |
. | . | c | 50 | 15 |
Thank you for your help
You could use something like:
data set1;
input var0$ var1 var2 var3$;
datalines;
m 1 10 a
t 12 5 a
w 3 6 a
m 4 14 b
t 7 10 b
w 4 5 b
m 5 15 c
t 6 16 c
w 7 6 c
;
data set2;
input var4 var5 var3$;
datalines;
2 15 a
3 45 a
4 52 a
10 11 a
94 10 a
15 19 a
3 25 b
1 10 b
3 17 b
15 18 b
11 12 b
80 70 b
4 20 c
5 30 c
4 11 c
18 17 c
20 20 c
50 15 c
;
data set1;
set set1;
by var3;
if first.var3 then counter=1;
else counter+1;
run;
data set2;
set set2;
by var3;
if first.var3 then counter=1;
else counter+1;
run;
data want (drop=counter);
merge set1 set2;
by var3 counter;
run;
Available on demand!
Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Select SAS Training centers are offering in-person courses. View upcoming courses for: