merge help

Reply
Frequent Contributor
Posts: 102

merge help

hi, i have 2 datasets.

h1111: 11960 obs

h82: 13307 obs.

data h111; set h11; keep gvkey pfoyr cusip mainseg; run;

proc sql; create table h1111 as select distinct * from h111; quit; *11960;

data h81; set h8; keep pfoyr cusip; run; *13307;

proc sql; create table h82 as select distinct * from h81; quit;

proc sort data=h1111; by cusip pfoyr; run;

proc sort data=h82; by cusip pfoyr; run;

data j1; merge h82(in=fromh82x) h1111 (in=fromh1111x); by cusip pfoyr; fromh1111=fromh1111x; fromh82=fromh82x;

run;

I should be getting 13307 obs but i get 15839 obs.

could u please help????

thank you!!!!

Super Contributor
Posts: 490

Re: merge help

Why you should get 13307 obs?

Did you make sure that all {cusip, pfoyr} pairs in h82 have at most one match in h111?

Are you sure that there is no {cusip, pfoyr} pair in h111 not listed in h82?

Unless you make sure from that, your code does not grantee to have 13307 obs.


But if all what you are doing is to look up the value of h82 in h111, and this what i think you are trying to do.

Then your code should be like:


Data j1;

merge h82(in=fromh82x) h1111 (in=fromh1111x);

by cusip pfoyr;

if fromh82x;

run;



Esteemed Advisor
Posts: 6,685

Re: merge help

You have not set a condition that keeps only matching keys in the output.

Think of this:

Dataset A has 3 records with key values A,B,C

Dataset B has 3 records with key values D,E,F

Merging those by key will give you 6 records in the resulting data set (A,B,C,D,E,F)

Or:

A has A,B,C,D,E,F,G,H

B has A,A,A,A,A,A,A

Merging those by key will result in 14 records (7*A,B,C,D,E,F,G,H)

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
Esteemed Advisor
Esteemed Advisor
Posts: 7,232

Re: merge help

Sorry, I cannot read that code.  Please use indentations, new lines etc.  If you expect 13307 records then your last step should presumably be:

proc sql;

     create table J1 as

     select     A.*,

                   B. the additional variables here

     from       H82 A

     left join   H111 B

     on           A.CUSIP=B.CUSIP

     and        A.PFOYR=B.PFOYR;

quit;

Ask a Question
Discussion stats
  • 3 replies
  • 224 views
  • 0 likes
  • 4 in conversation