HI,
here is the sample dataset which I want to subset into dataset A and B. In dataset B, keep PINs if all obs for session is blank. If one of the Obs in "Session" for a PIN is not blank then keep all the obs for that PIN in dataset B. Please suggest.
dataset | ||
ID | Level | Session |
1 | Z | |
2 | X | |
2 | Y | |
2 | Z | |
3 | Z | |
3 | X | |
4 | X | A |
4 | X | |
4 | Y | B |
dataset B | ||
ID | Level | Session |
1 | Z | |
2 | X | |
2 | Y | |
2 | Z | |
3 | Z | |
3 | X |
dataset A | ||
ID | Level | Session |
4 | X | A |
4 | X | |
4 | Y | B |
Merge a subset of non-blanks with the entire dataset:
dm 'clear log';
data have;
infile datalines missover;
input id Level :$1. Session :$1.;
datalines;
1 Z
2 X
2 Y
2 Z
3 Z
3 X
4 X A
4 X
4 Y B
run;
data a b;
merge have (where=(session^=' ') in=anynonblanks) have;
by id;
if anynonblanks then output b;
else output a;
run;
Merge a subset of non-blanks with the entire dataset:
dm 'clear log';
data have;
infile datalines missover;
input id Level :$1. Session :$1.;
datalines;
1 Z
2 X
2 Y
2 Z
3 Z
3 X
4 X A
4 X
4 Y B
run;
data a b;
merge have (where=(session^=' ') in=anynonblanks) have;
by id;
if anynonblanks then output b;
else output a;
run;
Nice one @mkeintz how did 4 X "' " get to be true nonblanks. Where am i missing the point?
There are two attributes of MERGE being utilized here:
In fact, if you merge more than two datasets, then the principle is extended: for any common variable in multiple datasets, the rightmost value prevails, assuming the variable is of the same type (numeric or character) for all datasets. If a variable is of both types, the merge fails. In the case of other attributes (length, label, format), they are inherited from the leftmost dataset, since that's the first encounter the sas compiler has with the variable.
values supersede
What does MERGE X Y; BY ID; do when an ID group has fewer observations in (say) X than in Y. It propagates the last observation in X to be associated with the "excess" observations in Y.
When an ID group is being MERGEd, and there are no an equal number of observations in the "left" and "right" datasets,
@mkeintz Thank you so much for explaining at length. I really appreciate the privilege of receiving your time and knowledge.
Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.