I have 2 datasets test firms and control firms. I need to match these two datasets based on same fiscal year and same sic code and the minimum difference in their size
Here is my SAS code:
PROC SQL;
create table match as
select a.* , b.* , abs(a.size - b.size) as diff
from control a, test b
where a. fyear = b. fyear and a. sic = b. sic
group by b.fyear ,b.sic, b.cik
having diff=min(diff);
PROC print data=match;
run;
The problem is that my SAS code works like matching with replacement. I need matching without replacement. So once SAS program finds a matched firm, it cannot use the same firm to be matched another test firm. (in the same fiscal year & same sic code)
Any comments or help would be appreciated. Thank you
Best,
Steve
You can't do it via SQL. use Hash Table .
data test;
set MERGEDATA_5(keep=companies fyr sic1 sizes);
run;
data control;
set MERGEDATA_6(keep=companies fyr sic1 sizes
rename=(companies=_companies fyr=fyr sic1=sic1 sizes =_sizes));
k+1;
run;
proc sort data=test; by fyr sic1 sizes;run;
data want;
if _n_ eq 1 then do;
if 0 then set control;
declare hash h(dataset:'control',multidata:'y');
h.definekey('fyr','sic1');
h.definedata('_companies','_sizes','k');
h.definedone();
end;
call missing(of _all_);
set test;
n=0;
min=9999999;
rc=h.find();
do while(rc=0);
abs=abs(sizes-_sizes);
if abs lt min then do;n=k;min=abs;end;
rc=h.find_next();
end;
rc=h.find();
do while(rc=0);
if n=k then do;h.removedup();leave;end;
rc=h.find_next();
end;
drop rc n min abs k;
run;
Xia Keshan
Message was edited by: xia keshan Fix a problem
You missed a keyword CALCULATED .
PROC SQL;
create table match as
select a.* , b.* , abs(a.size - b.size) as diff
from control a, test b
where a. fyear = b. fyear and a. sic = b. sic
group by b.fyear ,b.sic, b.cik
having calculated diff=min(calculated diff);
run;
and list all the variable you need in SELECT, not like ; a.*,b.* . You will got ERROR info if a and b have the same variable name.
Xia Keshan
I really appreciate your answer. Unfortunately, your suggestion did not work for all the cases. I think it's still working as 'matching with replacement' I mentioned above. Could you give me another way to solve this problem?
Best,
Steve
Matching with replacement / Matching without replacement
Test firm Matching firm
sic year
A 22 2009 Size=$42billion C 22 2009 Size=41billion / C 22 2009 Size=41billion
B 22 2009 Size=$40billion C 22 2009 Size=41billion / D 22 2009 Size=41.5 billion
To create a matched sample, I need to find a matched firm for 'A' and another for 'B'. If I'm matching on specific year (2009), I need to find with closest to 50billion for 'A' from the same sic code in 2009. Again the same for 'B'
I give an specific example of 'Matching with replacement' and 'Matching without replacement'. My desire output is 'Matching without replacement'
This means that once SAS program finds a matched firm, it cannot use the same firm to be matched another test firm. (in the same fiscal year & same sic code).
PROC SQL;
create table matching1 as
select a.*, b.*, abs(a.size - b.sizes) as diff
from mergedata_6 a, mergedata_5 b
where a. fyear = b. fyr and a. sic = b. sic1
group by b.fyr ,b.sic1, b.ciks
having calculated diff=min(calculated diff);
PROC print data=matching1;
run;
However, my SAS code just works as 'Matching with replacement'. Please give some help for solving this problem.
Thank you
Best,
Steve
You can't do it via SQL. use Hash Table .
data test;
set MERGEDATA_5(keep=companies fyr sic1 sizes);
run;
data control;
set MERGEDATA_6(keep=companies fyr sic1 sizes
rename=(companies=_companies fyr=fyr sic1=sic1 sizes =_sizes));
k+1;
run;
proc sort data=test; by fyr sic1 sizes;run;
data want;
if _n_ eq 1 then do;
if 0 then set control;
declare hash h(dataset:'control',multidata:'y');
h.definekey('fyr','sic1');
h.definedata('_companies','_sizes','k');
h.definedone();
end;
call missing(of _all_);
set test;
n=0;
min=9999999;
rc=h.find();
do while(rc=0);
abs=abs(sizes-_sizes);
if abs lt min then do;n=k;min=abs;end;
rc=h.find_next();
end;
rc=h.find();
do while(rc=0);
if n=k then do;h.removedup();leave;end;
rc=h.find_next();
end;
drop rc n min abs k;
run;
Xia Keshan
Message was edited by: xia keshan Fix a problem
Available on demand!
Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.