Hi,
I am trying to match each sample firm I have with a set of 5 control firms based on three main variables:
1. Date - The test firms must have data on the date that the the sample firms specify
2. Market Capitalisation - control firms must have a market cap between 70% and 130% of the market cap of the sample firm
3. BM ratio - of the firms that satisfy the above two conditions, I want to choose the 5 closest control firms based on the absolute value of the difference in the BM ratios between the sample firm and the control firms.
For example:
Sample firms:
firm date marketcap bmratio
A 31/01/2015 500 0.5
B 30/06/2018 400 0.7
Control firms:
firm date marketcap bmratio
d 31/01/2015 550 0.5
e 31/01/2015 425 0.1
f 31/01/2015 900 0.6
g 30/06/2018 300 0.25
h 31/07/2019 500 0.9
...
...
Desired Output:
firm date marketcap bmratio matchedfirm matchedfirm_rank
A 31/01/2015 500 0.5 d 1
A 31/01/2015 500 0.5 e 2
A 31/01/2015 500 0.5 ... 3
A 31/01/2015 500 0.5 ... 4
A 31/01/2015 500 0.5 ... 5
B 30/06/2018 400 0.7 g 1
B 30/06/2018 400 0.7 ... 2
B 30/06/2018 400 0.7 ... 3
B 30/06/2018 400 0.7 ... 4
B 30/06/2018 400 0.7 ... 5
I don't need the market cap or bm ratio of the control firms, I just need to know what the best 5 matches are for each sample firm.
Any help would be really appreciated (This is following the matching firms approach in Barber and Lyon (1997) so if anyone knows of another thread that trys to implement this, please point me in that direction)
Like this?
proc sql;
create table JOIN as
select s.*, c.FIRM as MATCHED, abs(s.BMRATIO-c.BMRATIO) as DIFF
from SAMPLE s left join CONTROL c
on s.DATE=c.DATE
and .70 <= c.MARKETCAP/s.MARKETCAP <= 1.3
order by s.FIRM, DIFF;
quit;
data WANT ;
set JOIN;
by FIRM;
if first.FIRM then RANK=0;
RANK+1;
if RANK>5 then return;
else if last.FIRM and RANK<5 then do;
output;
do I= RANK+1 to 5;
MATCHED=' ';
RANK=I;
output;
end;
return;
end;
output;
keep FIRM MATCHED DATE BMRATIO MARKETCAP RANK;
run;
proc print; run;
Obs | FIRM | DATE | MARKETCAP | BMRATIO | MATCHED | RANK |
---|---|---|---|---|---|---|
1 | A | 31JAN2015 | 500 | 0.5 | d | 1 |
2 | A | 31JAN2015 | 500 | 0.5 | e | 2 |
3 | A | 31JAN2015 | 500 | 0.5 | 3 | |
4 | A | 31JAN2015 | 500 | 0.5 | 4 | |
5 | A | 31JAN2015 | 500 | 0.5 | 5 | |
6 | B | 30JUN2018 | 400 | 0.7 | g | 1 |
7 | B | 30JUN2018 | 400 | 0.7 | 2 | |
8 | B | 30JUN2018 | 400 | 0.7 | 3 | |
9 | B | 30JUN2018 | 400 | 0.7 | 4 | |
10 | B | 30JUN2018 | 400 | 0.7 | 5 |
Like this?
proc sql;
create table JOIN as
select s.*, c.FIRM as MATCHED, abs(s.BMRATIO-c.BMRATIO) as DIFF
from SAMPLE s left join CONTROL c
on s.DATE=c.DATE
and .70 <= c.MARKETCAP/s.MARKETCAP <= 1.3
order by s.FIRM, DIFF;
quit;
data WANT ;
set JOIN;
by FIRM;
if first.FIRM then RANK=0;
RANK+1;
if RANK>5 then return;
else if last.FIRM and RANK<5 then do;
output;
do I= RANK+1 to 5;
MATCHED=' ';
RANK=I;
output;
end;
return;
end;
output;
keep FIRM MATCHED DATE BMRATIO MARKETCAP RANK;
run;
proc print; run;
Obs | FIRM | DATE | MARKETCAP | BMRATIO | MATCHED | RANK |
---|---|---|---|---|---|---|
1 | A | 31JAN2015 | 500 | 0.5 | d | 1 |
2 | A | 31JAN2015 | 500 | 0.5 | e | 2 |
3 | A | 31JAN2015 | 500 | 0.5 | 3 | |
4 | A | 31JAN2015 | 500 | 0.5 | 4 | |
5 | A | 31JAN2015 | 500 | 0.5 | 5 | |
6 | B | 30JUN2018 | 400 | 0.7 | g | 1 |
7 | B | 30JUN2018 | 400 | 0.7 | 2 | |
8 | B | 30JUN2018 | 400 | 0.7 | 3 | |
9 | B | 30JUN2018 | 400 | 0.7 | 4 | |
10 | B | 30JUN2018 | 400 | 0.7 | 5 |
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.