I have 2 datasets test firms and control firms. I need to match these two datasets based on same fiscal year and same sic code and the minimum difference in their size
Here is my SAS code:
PROC SQL;
create table match as
select a.* , b.* , abs(a.size - b.size) as diff
from control a, test b
where a. fyear = b. fyear and a. sic = b. sic
group by b.fyear ,b.sic, b.cik
having diff=min(diff);
PROC print data=match;
run;
The problem is that my SAS code works like matching with replacement. I need matching without replacement. So once SAS program finds a matched firm, it cannot use the same firm to be matched another test firm. (in the same fiscal year & same sic code)
Any comments or help would be appreciated. Thank you
Best,
Steve
You can't do it via SQL. use Hash Table .
data test;
set MERGEDATA_5(keep=companies fyr sic1 sizes);
run;
data control;
set MERGEDATA_6(keep=companies fyr sic1 sizes
rename=(companies=_companies fyr=fyr sic1=sic1 sizes =_sizes));
k+1;
run;
proc sort data=test; by fyr sic1 sizes;run;
data want;
if _n_ eq 1 then do;
if 0 then set control;
declare hash h(dataset:'control',multidata:'y');
h.definekey('fyr','sic1');
h.definedata('_companies','_sizes','k');
h.definedone();
end;
call missing(of _all_);
set test;
n=0;
min=9999999;
rc=h.find();
do while(rc=0);
abs=abs(sizes-_sizes);
if abs lt min then do;n=k;min=abs;end;
rc=h.find_next();
end;
rc=h.find();
do while(rc=0);
if n=k then do;h.removedup();leave;end;
rc=h.find_next();
end;
drop rc n min abs k;
run;
Xia Keshan
Message was edited by: xia keshan Fix a problem
You missed a keyword CALCULATED .
PROC SQL;
create table match as
select a.* , b.* , abs(a.size - b.size) as diff
from control a, test b
where a. fyear = b. fyear and a. sic = b. sic
group by b.fyear ,b.sic, b.cik
having calculated diff=min(calculated diff);
run;
and list all the variable you need in SELECT, not like ; a.*,b.* . You will got ERROR info if a and b have the same variable name.
Xia Keshan
I really appreciate your answer. Unfortunately, your suggestion did not work for all the cases. I think it's still working as 'matching with replacement' I mentioned above. Could you give me another way to solve this problem?
Best,
Steve
Matching with replacement / Matching without replacement
Test firm Matching firm
sic year
A 22 2009 Size=$42billion C 22 2009 Size=41billion / C 22 2009 Size=41billion
B 22 2009 Size=$40billion C 22 2009 Size=41billion / D 22 2009 Size=41.5 billion
To create a matched sample, I need to find a matched firm for 'A' and another for 'B'. If I'm matching on specific year (2009), I need to find with closest to 50billion for 'A' from the same sic code in 2009. Again the same for 'B'
I give an specific example of 'Matching with replacement' and 'Matching without replacement'. My desire output is 'Matching without replacement'
This means that once SAS program finds a matched firm, it cannot use the same firm to be matched another test firm. (in the same fiscal year & same sic code).
PROC SQL;
create table matching1 as
select a.*, b.*, abs(a.size - b.sizes) as diff
from mergedata_6 a, mergedata_5 b
where a. fyear = b. fyr and a. sic = b. sic1
group by b.fyr ,b.sic1, b.ciks
having calculated diff=min(calculated diff);
PROC print data=matching1;
run;
However, my SAS code just works as 'Matching with replacement'. Please give some help for solving this problem.
Thank you
Best,
Steve
You can't do it via SQL. use Hash Table .
data test;
set MERGEDATA_5(keep=companies fyr sic1 sizes);
run;
data control;
set MERGEDATA_6(keep=companies fyr sic1 sizes
rename=(companies=_companies fyr=fyr sic1=sic1 sizes =_sizes));
k+1;
run;
proc sort data=test; by fyr sic1 sizes;run;
data want;
if _n_ eq 1 then do;
if 0 then set control;
declare hash h(dataset:'control',multidata:'y');
h.definekey('fyr','sic1');
h.definedata('_companies','_sizes','k');
h.definedone();
end;
call missing(of _all_);
set test;
n=0;
min=9999999;
rc=h.find();
do while(rc=0);
abs=abs(sizes-_sizes);
if abs lt min then do;n=k;min=abs;end;
rc=h.find_next();
end;
rc=h.find();
do while(rc=0);
if n=k then do;h.removedup();leave;end;
rc=h.find_next();
end;
drop rc n min abs k;
run;
Xia Keshan
Message was edited by: xia keshan Fix a problem
Don't miss out on SAS Innovate - Register now for the FREE Livestream!
Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.