I have a full data set (All), and 3 sub data sets (A, B, C).
How to select the cases that are in All, but not in any of A, or B, or C?
I can merge A, B, C into a new dataset and then compare All and the newly merged dataset, but I wonder if there is a simpler way to do it?
Thanks
You can include all four datasets in one merge step:
data want;
merge All
A(in=a)
B(in=b)
C(in=c);
by your by_variables;
if a+b+c=0;
run;
You can include all four datasets in one merge step:
data want;
merge All
A(in=a)
B(in=b)
C(in=c);
by your by_variables;
if a+b+c=0;
run;
The suggested code assumes that all four datasets have been sorted by the common BY variables (whose names would, of course, replace the text "your by_variables" in the code).
a, b, c are only temporary indicator variables which are 1 or 0, depending on whether a combination of BY variable values occurs in the respective dataset. You can choose other names for them (e.g. in_a, in_b, in_c). You should definitely choose other names if variables of the same names were present in any of your four datasets, so that a name conflict would occur otherwise.
For three numbers a, b, c which are either 1 or 0, the criterion "a+b+c=0" is equivalent to "a=0 & b=0 & c=0", so you could alternatively write the latter into the IF condition. Another equivalent condition would be: not (a | b | c).
In any case, the purpose of this subsetting IF statement is to restrict the output dataset WANT to observations which meet your selection criterion: observations (whose BY value combinations are) not found in A, nor in B, nor in C. The criterion that they are in dataset All would be redundant in this data step, because if an observation does not come from A, B or C, it must come from the only remaining dataset involved: All.
Don't miss out on SAS Innovate - Register now for the FREE Livestream!
Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.