hi, i have 2 datasets.
h1111: 11960 obs
h82: 13307 obs.
data h111; set h11; keep gvkey pfoyr cusip mainseg; run;
proc sql; create table h1111 as select distinct * from h111; quit; *11960;
data h81; set h8; keep pfoyr cusip; run; *13307;
proc sql; create table h82 as select distinct * from h81; quit;
proc sort data=h1111; by cusip pfoyr; run;
proc sort data=h82; by cusip pfoyr; run;
data j1; merge h82(in=fromh82x) h1111 (in=fromh1111x); by cusip pfoyr; fromh1111=fromh1111x; fromh82=fromh82x;
run;
I should be getting 13307 obs but i get 15839 obs.
could u please help????
thank you!!!!
Why you should get 13307 obs?
Did you make sure that all {cusip, pfoyr} pairs in h82 have at most one match in h111?
Are you sure that there is no {cusip, pfoyr} pair in h111 not listed in h82?
Unless you make sure from that, your code does not grantee to have 13307 obs.
But if all what you are doing is to look up the value of h82 in h111, and this what i think you are trying to do.
Then your code should be like:
Data j1;
merge h82(in=fromh82x) h1111 (in=fromh1111x);
by cusip pfoyr;
if fromh82x;
run;
You have not set a condition that keeps only matching keys in the output.
Think of this:
Dataset A has 3 records with key values A,B,C
Dataset B has 3 records with key values D,E,F
Merging those by key will give you 6 records in the resulting data set (A,B,C,D,E,F)
Or:
A has A,B,C,D,E,F,G,H
B has A,A,A,A,A,A,A
Merging those by key will result in 14 records (7*A,B,C,D,E,F,G,H)
Sorry, I cannot read that code. Please use indentations, new lines etc. If you expect 13307 records then your last step should presumably be:
proc sql;
create table J1 as
select A.*,
B. the additional variables here
from H82 A
left join H111 B
on A.CUSIP=B.CUSIP
and A.PFOYR=B.PFOYR;
quit;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.