DATA Step, Macro, Functions and more

Matching one to many.

Accepted Solution Solved
Reply
Occasional Contributor
Posts: 7
Accepted Solution

Matching one to many.

Dear Members: 

I am new to research as well as to SAS. I am having the following problem. I highly appreciate your attempts. [ I have basic ideas about base sas and proc sql].

 

I have two datasets: 

Sample Dataset:

Company   SIC  ROE      TA

AA              200  2%       500

BB              200  5%        600

CC             300  7%         400

DD             400   8.5%     300

EE              400   9.5%     345

.

.

.

.

.

 

Control dataset:

Company   SIC  ROE       TA

XX              200  3%          550

YY              200  5.4%        680

ZZ             300  7.5%         500

PP             400   6.5%        600

SS              400   5.5%       745

.

.

..

.

.

 

I need at least two companies from control dataset for every single sample company from sample dataset. Please let me know sample codes. The codes my be in proc sql or base sas format. Thank you.  


Accepted Solutions
Solution
‎09-16-2016 10:38 PM
Super User
Posts: 19,877

Re: Matching one to many.

Untested, will do all matches and you'll need to filter them down.

 

proc sql;
create table want as
select a.*, b.company_id as control_id, b.roe as control_roe, b.ta as control_ta
from table1 as a
left join table2 as b
on a.sic=b.sic
and (a.roe-b.roe)/a.roe between -0.2 and 0.2
and (a.ta - b.ta)/a.ta between -.02 and 0.2; /*0.2 is 20%*/
quit;

View solution in original post


All Replies
Occasional Contributor
Posts: 7

One to many matching--Having two companies for each sample company.

Dear Members: 

I am new to research as well as to SAS. I am having the following problem. I highly appreciate your attempts. [ I have basic ideas about base sas and proc sql].

 

I have two datasets: 

Sample Dataset:

Company   SIC  ROE      TA

AA              200  2%       500

BB              200  5%        600

CC             300  7%         400

DD             400   8.5%     300

EE              400   9.5%     345

.

.

.

.

.

 

Control dataset:

Company   SIC  ROE       TA

XX              200  3%          550

YY              200  5.4%        680

ZZ             300  7.5%         500

PP             400   6.5%        600

SS              400   5.5%       745

.

.

..

.

.

 

I need at least two companies from control dataset for every single sample company from sample dataset. Please let me know sample codes. The codes my be in proc sql or base sas format. Thank you.  

Super User
Posts: 10,046

Re: One to many matching--Having two companies for each sample company.

Can you post the output you want see ?
And what do you mean by "I need at least two companies from control dataset " ?
Is there any rules to pick up these at least two companies ?
Super User
Posts: 19,877

Re: Matching one to many.

What's your criteria for 'matching'?

 

Occasional Contributor
Posts: 7

Re: Matching one to many.

My matching should be on the basis of SIC, ROE, and TA. TA and ROE maybe within a range, say 25% or 50%.

Super User
Posts: 19,877

Re: Matching one to many.

And can a company be matched multiple times? You really need to specify your criteria more clearly. 
If you search for propensity matching you'll find algorithm and code examples online.

Occasional Contributor
Posts: 7

Re: Matching one to many.

Thank you for your reply. 

 

I need at least two companies for every sample company, and I want to do that on the basis of ROE, TA, and SIC. 

 

say for example, for company AA, I need company XX and YY. 

 

Thank you for your inputs. 

Solution
‎09-16-2016 10:38 PM
Super User
Posts: 19,877

Re: Matching one to many.

Untested, will do all matches and you'll need to filter them down.

 

proc sql;
create table want as
select a.*, b.company_id as control_id, b.roe as control_roe, b.ta as control_ta
from table1 as a
left join table2 as b
on a.sic=b.sic
and (a.roe-b.roe)/a.roe between -0.2 and 0.2
and (a.ta - b.ta)/a.ta between -.02 and 0.2; /*0.2 is 20%*/
quit;
Occasional Contributor
Posts: 7

Re: Matching one to many.

 

data sample;
input company$ sic roe ta;
datalines;
AA 100 5.5 100
BB 100 5.6 200
CC 200 6.5 300
DD 400 5.8 300
EE 400 8.9 500
;
run;
proc print data=sample;
run;

data control;
input company$ sic roe ta;
datalines;
XX 100 5.4 100
YY 100 5.9 200
DE 100 5.6 200
RR 200 6.4 300
RE 200 7.5 400
RT 300 6.5 450
TT 400 5.9 300
TR 400 9.8 500
;
run;
proc print data=Control;
run;

proc sql;
create table want as
select a.*, b.company as control, b.roe as control_roe, b.ta as control_ta
from sample as a
left join control as b
on a.sic=b.sic
and (a.roe-b.roe)/a.roe between -0.5 and 0.7
and (a.ta - b.ta)/a.ta between -.02 and 0.2; /*0.2 is 20%*/
quit;

proc print data=want;
run;

I have run the above code and found the following: 

The SAS System 


Obs company sic roe ta control control_roe control_ta 
1    AA     100 5.5 100  XX    5.4       100 
2    BB     100 5.6 200  YY    5.9       200 
3    CC     200 6.5 300  RR    6.4       300 
4    DD     400 5.8 300  TT    5.9       300 
5    EE     400 8.9 500  TR    9.8       500 

But I want, for example, for company BB, match companies will be YY and DE. That codes give me only YY for BB. How can I have both YY and DE for BB. 

Thank you for your efforts.

Super User
Posts: 19,877

Re: Matching one to many.

[ Edited ]

Must not be within the tolerance levels you set. What is the percent diff for your matching variables?

 

Edit: Yes, it's outside the tolerance levels set. Modify to your requirements. 

Occasional Contributor
Posts: 7

Re: Matching one to many.

Should I ignore the tolerance level such as 50% or 20%?

Super User
Posts: 19,877

Re: Matching one to many.


Khadem wrote:

Should I ignore the tolerance level such as 50% or 20%?


It's your specifications, not mine. I assume there's some rationale to your boundaries. 

The company in Q was -5%, the interval used was -2 to 20%. 

Occasional Contributor
Posts: 7

Re: Matching one to many.

Thank you. Let me try with some modifications. I will let you know. 

☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 12 replies
  • 646 views
  • 0 likes
  • 3 in conversation