Hi SAS Community,
I have two data files, cops and COL.
Data cops;
Input transit acctnum;
Cards;
1 111
1 222
1 333
;
Run;
Data COL;
Input transit acctnum;
Cards;
1 111
9 999
;
Run;
I need to identify what are the records in COL file that are not found in COPS file.
My approach:
I have done merging like below with the condition highlighted in green.
Proc sort data=cops out=t1;
by transit acctnum;
run;
proc sort data=col out=t2;
by transit acctnum;
run;
data t3 ;
merge t1 (in=a)
t2 (in=b)
;
by Transit acctnum;
if b and not a;
run;
This is the answer which is correct.
1 | 9 | 999 |
Question:
I have tweaked the merging condition as shown in yellow. This still gives the correct answer.
Could any merging expert confirm whether green and yellow conditions both essentially do the same.
I am asking this because the answer may be “yes” for this small data set but when I apply to my larger dataset, SAS may do
something else which I cannot detect.
data t4 ;
merge t1 (in=a)
t2 (in=b)
;
by Transit acctnum;
if not a ;
run;
Thank you for your help
Mirisage
It's the same as long as you have 2 datasets your merging. If you have 3 or more it won't be.
It's the same as long as you have 2 datasets your merging. If you have 3 or more it won't be.
If you just think about it as a logic problem then only values that would by in (B and NOT A) and not in (NOT A) are those with NOT B and NOT A. But that is impossible as the data has to come from somewhere.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.