- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Hi,
data a;
input ID x;
datalines;
1 a
1 b
3 c
run;
data b;
input ID y;
datalines;
1 a
1 b
1 c
run;
data c;
merge a b;
by ID;
run;
The result is:
obs ID x y
1 1 a a
2 1 b b
3 1 b c
4 3 c
My question is, is there a way to produce an output as follows:
obs ID x y
1 1 a a
2 1 b b
3 1 c
4 3 c
Thanks
Accepted Solutions
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
The reason a one to many merge works in the SAS data step is because variables coming from input datasets are RETAINED. To prevent that you just need to add your own code to clear them. That is very easy to do using CALL MISSING(). Just make sure to write the observation BEFORE you clear the values.
data c;
merge a b;
by ID;
output;
call missing(of _all_);
run;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
The reason a one to many merge works in the SAS data step is because variables coming from input datasets are RETAINED. To prevent that you just need to add your own code to clear them. That is very easy to do using CALL MISSING(). Just make sure to write the observation BEFORE you clear the values.
data c;
merge a b;
by ID;
output;
call missing(of _all_);
run;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
I normally prefer SQL for such cases. I feel it's "cleaner".
data a;
input ID x $;
datalines;
1 a
1 b
3 c
;
run;
data b;
input ID y $;
datalines;
1 a
1 b
1 c
;
run;
proc sql;
/* create table want as */
select
coalesce(a.id,b.id) as id,
a.x,
b.y
from a
full join b
on a.id=b.id and a.x=b.y
;
quit;
The relationship between your tables with the keys used is actually not many:many but 1:1 (well: 1 or zero : 1 or zero).
For many:many joins the result (number of rows) between a SQL join and a data step merge will differ (most of the time you're after the SQL result).