09-11-2014 03:17 AM
I have 2 datasets, original_599 and balance_9981. (The original_599 dataset has n=599 firm-years and the balance_9981 dataset has n=9981 firm-years.)
I need to find 599 firms from the 9981 firms to match with the original_599 dataset based on their industry (ffind) and size (size).
The two datasets differ in that original_599 has CEOturn=1 while balance_9981 has CEOturn=0 (CEOturn = CEO turnover).
I wish to have 2 options:
a) those firms matched are from the same financial year (fyear)
b) no need to match on the same financial year (as a suitable match may not exist within the same year).
09-11-2014 06:49 AM
First match on Fiscal Year.
Then in the next loop, match without Fiscal Year, but exclude the ones that already have a match...?
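A rough sketch of that two-pass idea, in case it helps. It assumes both datasets carry a firm identifier called gvkey (an assumption, it is not named in the question) alongside ffind, size and fyear. The table names pass1, match1, pass2, match2 and matches are made up for illustration. "Exclude the ones that already have a match" is read here as skipping both the turnover firm-years matched in pass 1 and the control firm-years already used in pass 1.

```sas
/* Pass 1: candidates in the same industry AND the same fiscal year. */
proc sql;
create table pass1 as
select t.gvkey as t_gvkey, t.fyear as t_fyear,
       c.gvkey as c_gvkey, c.fyear as c_fyear,
       abs(t.size - c.size) as sizeDiff
from original_599 as t inner join balance_9981 as c
on t.ffind = c.ffind and t.fyear = c.fyear
order by t_gvkey, t_fyear, sizeDiff;
quit;

/* Keep the closest-size candidate for each turnover firm-year. */
data match1;
set pass1;
by t_gvkey t_fyear;
if first.t_fyear;
run;

/* Pass 2: industry only, restricted to turnover firm-years still
   unmatched and to control firm-years not used in pass 1. */
proc sql;
create table pass2 as
select t.gvkey as t_gvkey, t.fyear as t_fyear,
       c.gvkey as c_gvkey, c.fyear as c_fyear,
       abs(t.size - c.size) as sizeDiff
from original_599 as t inner join balance_9981 as c
on t.ffind = c.ffind
where not exists (select * from match1 as m
                  where m.t_gvkey = t.gvkey and m.t_fyear = t.fyear)
  and not exists (select * from match1 as m
                  where m.c_gvkey = c.gvkey and m.c_fyear = c.fyear)
order by t_gvkey, t_fyear, sizeDiff;
quit;

data match2;
set pass2;
by t_gvkey t_fyear;
if first.t_fyear;
run;

/* Stack the matches from both passes. */
data matches;
set match1 match2;
run;
```

Note that within each pass the same control firm-year can still be picked for more than one turnover firm-year, so this is matching with replacement inside a pass.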
09-11-2014 07:38 AM
I have tried using this - for the same financial year. What is your view?
proc sort data=huang_599;
by gvkey fyear;
run;
proc sort data=huang_bal;
by gvkey fyear;
run;
/* Pair each CEO-turnover firm-year (A) with every candidate control (O) in the same industry and fiscal year */
proc sql;
create table control as
select O.*, A.gvkey as Anum, A.fyear as Afy, abs(O.size-A.size) as sizeDiff
from huang_bal as O inner join huang_599 as A
on O.fYear=A.fyear and O.ffind=A.ffind
where O.gvkey not in (select gvkey from huang_599)
order by gvkey;
quit;
proc sort data=control;
by anum afy;
run;
/* Find the 1 set of 599 closest firms: for each turnover firm-year, keep the control with the smallest size difference */
proc means data=control noprint;
by Anum afy;
output out=selected idgroup(min(sizeDiff) out (fyear gvkey)=);
run;
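/* Pull the full records of the selected control firm-years */
proc sql;
create table selected1 as
select a.*, b.*
from selected a left join huang_bal b
on a.gvkey=b.gvkey and a.fyear=b.fYear
order by a.gvkey, a.fyear;
quit;
/* Stack the turnover firms and their matched controls. The output name matched_sample is a placeholder; the post does not show one. */
data matched_sample;
set huang_599 selected1;
drop _freq_ _type_;
run;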
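One thing worth checking with this approach: nothing stops the same control firm-year from being the closest match for more than one turnover firm-year, so the control group may contain fewer than 599 distinct firm-years. A quick check, assuming the selected table from the proc means step above keeps gvkey and fyear under those names:

```sas
/* List control firm-years that were chosen more than once. */
proc sql;
select gvkey, fyear, count(*) as n_uses
from selected
group by gvkey, fyear
having count(*) > 1;
quit;
```

If this returns no rows, every matched control is unique. If it returns rows, you would need to decide whether matching with replacement is acceptable for your design.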