BookmarkSubscribeRSS Feed
q1234
Calcite | Level 5
Hi All,

I want to create a matched sample and while I was searching on the web, I came across a similar post by Paul
. So I copy Paul's post which exactly explains what I want to do, so please if you can help with that I will be so grateful for you.

“--------hello,
Since two weeks I am trying unsuccessfuly to solve the following
problem. I have a sample of event firms (small sample) and non-event
firms (big sample). I need to match each event firm with non-event firm
that is similar on -/+30% size and closest another ratio (BEME).
However, I want each control firm to be used only once (unless it is
used another time five years after its first use).
I found the following code, which I tried to implement in iterative way
but it is not working.

proc sql;
create table SEO_matches as
select eventfirm.*,
match.id as match_id,
match.ME as match_ME,
match.beme as match_beme,
abs(eventfirm.bemv - match.beme) as bmabsdiff
from eventfirm
join noneventfirms as match
on (eventfirm.OfferYear1 - 1 ) = match.year
where 0.7 * eventfirm.me < match.me < 1.3 * eventfirm.me
and eventfirm.id^= match.idand not missing(match.beme) and not
missing(match.me) group by eventfirm.id
having bmabsdiff = min(bmabsdiff) ;

data SEO_matches; set SEO_matches;
by id;
if first.id;
run;
-------The end of Paul post”
To see Paul post open this link
http://www.rhinocerus.net/forum/soft-sys-sas/502636-re-matching-event-sample-control-sample-but-usin...

Many thanks
13 REPLIES 13
Ksharp
Super User
What does your data look like?
And what do you want output to look like??



Ksharp
q1234
Calcite | Level 5
Hi Ksharp,
Many thanks for your replay. Actually, I have two groups of companies; one is the event companies about 500 and the second is the control companies around 2000. I need to match each company from the event to one company from the control, where both of them should have the same year and industry, and the size for the event companies should between 130% and 70% of the match companies’ size. After creating this matched portfolio, then each company from the event should match the closet BEME ratio from the matched portfolio.
For example
The event firms
Firm Year industry size (mill) BEME(ratio)
x 2000 20 100 1.1
y 1999 25 250 .9

The Match
Firm Year industry size (mill) BEME(ratio)
A 2000 20 100 1.1
J 1999 25 250 .9
K 2000 20 100 1.1
L 1999 25 250 .9
T 2000 20 100 1.1
R 1999 25 250 .9
................................................
Ksharp
Super User
Ok.
If it is what you want:


[pre]
data event;
input firm $ year industry size ratio;
datalines;
x 2000 20 100 1.1
y 1999 25 250 .9
;
run;
data match;
input _firm $ _year _industry _size _ratio;
datalines;
A 2000 20 100 1.2
J 1999 25 250 .4
K 2000 20 100 1.4
L 1999 25 250 .9
T 2000 20 100 1.8
R 1999 25 250 .2
;
run;

proc sql ;
create table portfolio1 as
select *
from event,match
where year=_year and industry=_industry and
(size between .7*_size and 1.3*_size)
;

create table portfolio2 as
select *,min(abs(ratio - _ratio)) as closest
from portfolio1
group by firm
having abs(ratio - _ratio) eq calculated closest
order by firm
;


quit;
[/pre]


Ksharp
q1234
Calcite | Level 5
Hi Ksharp,

Many thanks for your help, the code is perfect and I can find the matched protfolio easly now.

your help and time are highly appreciated.
q1234
Calcite | Level 5
Hi Ksharp,
Can you help with this? I used to use this code to remove less than 10 observations for each year and industry and it works. But I need to create dataset for each year and the run it.
For example if I want to run the code from 2000 to 2005 then I should have dataset for each year then I run the code for each dataset.

the dataset
Firm industry year
x 22 2001
z 33 2002
v 33 2002
b 14 2005
w 22 2001
x 22 2002
x 22 2003


The code

proc sort data=data;
by industry;
run;

proc tabulate data=dataset out= dataset_1;
class industry;
table industry*n;
run;

data dataset_1;
set dataset_1;
drop _table_;
drop _page_;
drop _type_;
run;


proc sort data= dataset;
by industry;
run;

proc sort data= dataset_1;
by industry;
run;

data dataset_2;
merge dataset dataset_1;
by industry;
run;

data dataset_3;
set dataset_2;
if n < 10 then delete;
run;
--------------------------------------------------------------
So, can you help me to modify this code to run it for pooled sample including all the years?

Thanks
Ksharp
Super User
OK.
That would not be too complicated!


[pre]
data dataset;
input Firm $ industry year;
datalines;
x 22 2001
z 33 2002
v 33 2002
b 14 2005
w 22 2001
x 22 2002
x 22 2003
x 22 2001
z 33 2002
v 33 2002
b 14 2005
w 22 2001
x 22 2002
x 22 2003
x 22 2001
z 33 2002
v 33 2002
b 14 2005
w 22 2001
x 22 2002
x 22 2003
x 22 2001
z 33 2002
v 33 2002
b 14 2005
w 22 2001
x 22 2002
x 22 2003
x 22 2001
z 33 2002
v 33 2002
b 14 2005
w 22 2001
x 22 2002
x 22 2003
x 22 2001
z 33 2002
v 33 2002
b 14 2005
w 22 2001
x 22 2002
x 22 2003
x 22 2001
z 33 2002
v 33 2002
b 14 2005
w 22 2001
x 22 2002
x 22 2003
;
run;

proc sql;
create table temp as
select *,count(*) as n
from dataset
group by year,industry
having calculated n ge 10
order by year,industry
;
quit;
[/pre]




Ksharp
q1234
Calcite | Level 5
Hi,

After I did the match I found that one firms from the matched portfolio matched with more than one frms from the even firmst. For example, the firms k match with many firms. So can you help me to modify the code that I do not have the same firms from the matched portfolio match with many frim from the event firsm
the event firms the matched portfolio
x k
z c
b k
g k
v n

The code
a: the event firms
b: the firms which have the matched

proc sql
; create table rem.portfolio1
as select * from a ,b
where ipo_year=_year and sic=_sic and
(roa between .9*_roa and 1.1*_roa)

; create table rem.portfolio2
as select *,min(abs(roa - _roa)) as closest from rem.portfolio1
group by firms_id
having abs(roa - _roa) eq calculated closest
order by firms_id ;
quit;

Many thanks
Ksharp
Super User
Hi.
You found "one firms from the matched portfolio matched with more than one frms from the even firmst" ,
that is because k firms has the most close distance with several firms from event firms,
I do not know which k firms you do not need, Do you have a criteria?
[pre]
x k <---------- you want this k company ?
z c
b k <---------- you want this k company ?
g k <---------- you want this k company ?
v n
[/pre]



Without logic to keep which k firm, It is hard to recommend.



Ksharp
Ksharp
Super User
Hi.
I think it again.
Maybe you can exchange the position of table a and table b.


[pre]
select * from b ,a
[/pre]


But, I do not know the result is whether what you want.


Ksharp
q1234
Calcite | Level 5
Hi,

I have changed the tables' position, but that does not make any difference on the output. Actually, what I need exactly as to use the firms K just for one time and does not matter to which one it will be matched. So it is like to write code to restrict the use of each firms in the match portfolio just for one time.

Many thanks
Ksharp
Super User
Hi.
I think you have duplicate obs. try these.
Can you post some example code and data?

You said:
>I need to match each company from the event to one company from the control, where both of them should have the same year and industry, and the size for the event companies should between 130% and 70% of the match companies’ size.


That means k firm from match is possible to match multiple firm from event.Best to post some example code and data , and explain which k firm you do not want.


[pre]
data event;
input firm $ year industry size ratio;
datalines;
x 2000 20 100 1.1
y 1999 25 250 .9
;
run;
data match;
input _firm $ _year _industry _size _ratio;
datalines;
A 2000 20 100 1.2
A 2000 20 100 1.2
J 1999 25 250 .4
K 2000 20 100 1.4
L 1999 25 250 .9
T 2000 20 100 1.8
R 1999 25 250 .2
;
run;

proc sql ;
create table portfolio1 as
select *
from event,match
where year=_year and industry=_industry and
(size between .7*_size and 1.3*_size)
;

create table portfolio2 as
select distinct *,min(abs(ratio - _ratio)) as closest
from portfolio1
group by firm
having abs(ratio - _ratio) eq calculated closest
order by firm
;


quit;
[/pre]


Ksharp Message was edited by: Ksharp
Ksharp
Super User
Hi.
I think maybe I understand what you mean.
You need to exchange the position of these two tables,
and change the group by variable.

[pre]
proc sql ;
create table portfolio1 as
select * from match,event
where year=_year and industry=_industry and
(size between .7*_size and 1.3*_size) ;

create table portfolio2 as
select *,min(abs(ratio - _ratio)) as closest
from portfolio1
group by _firm
having abs(ratio - _ratio) eq calculated closest
order by _firm ; quit;

[/pre]




Ksharp

Message was edited by: Ksharp

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 13 replies
  • 7897 views
  • 2 likes
  • 2 in conversation