BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Khadem
Calcite | Level 5

Dear Members: 

I am new to research as well as to SAS. I am having the following problem. I highly appreciate your attempts. [ I have basic ideas about base sas and proc sql].

 

I have two datasets: 

Sample Dataset:

Company   SIC  ROE      TA

AA              200  2%       500

BB              200  5%        600

CC             300  7%         400

DD             400   8.5%     300

EE              400   9.5%     345

.

.

.

.

.

 

Control dataset:

Company   SIC  ROE       TA

XX              200  3%          550

YY              200  5.4%        680

ZZ             300  7.5%         500

PP             400   6.5%        600

SS              400   5.5%       745

.

.

..

.

.

 

I need at least two companies from control dataset for every single sample company from sample dataset. Please let me know sample codes. The codes my be in proc sql or base sas format. Thank you.  

1 ACCEPTED SOLUTION

Accepted Solutions
Reeza
Super User

Untested, will do all matches and you'll need to filter them down.

 

proc sql;
create table want as
select a.*, b.company_id as control_id, b.roe as control_roe, b.ta as control_ta
from table1 as a
left join table2 as b
on a.sic=b.sic
and (a.roe-b.roe)/a.roe between -0.2 and 0.2
and (a.ta - b.ta)/a.ta between -.02 and 0.2; /*0.2 is 20%*/
quit;

View solution in original post

12 REPLIES 12
Khadem
Calcite | Level 5

Dear Members: 

I am new to research as well as to SAS. I am having the following problem. I highly appreciate your attempts. [ I have basic ideas about base sas and proc sql].

 

I have two datasets: 

Sample Dataset:

Company   SIC  ROE      TA

AA              200  2%       500

BB              200  5%        600

CC             300  7%         400

DD             400   8.5%     300

EE              400   9.5%     345

.

.

.

.

.

 

Control dataset:

Company   SIC  ROE       TA

XX              200  3%          550

YY              200  5.4%        680

ZZ             300  7.5%         500

PP             400   6.5%        600

SS              400   5.5%       745

.

.

..

.

.

 

I need at least two companies from control dataset for every single sample company from sample dataset. Please let me know sample codes. The codes my be in proc sql or base sas format. Thank you.  

Ksharp
Super User
Can you post the output you want see ?
And what do you mean by "I need at least two companies from control dataset " ?
Is there any rules to pick up these at least two companies ?
Reeza
Super User

What's your criteria for 'matching'?

 

Khadem
Calcite | Level 5

My matching should be on the basis of SIC, ROE, and TA. TA and ROE maybe within a range, say 25% or 50%.

Reeza
Super User

And can a company be matched multiple times? You really need to specify your criteria more clearly. 
If you search for propensity matching you'll find algorithm and code examples online.

Khadem
Calcite | Level 5

Thank you for your reply. 

 

I need at least two companies for every sample company, and I want to do that on the basis of ROE, TA, and SIC. 

 

say for example, for company AA, I need company XX and YY. 

 

Thank you for your inputs. 

Reeza
Super User

Untested, will do all matches and you'll need to filter them down.

 

proc sql;
create table want as
select a.*, b.company_id as control_id, b.roe as control_roe, b.ta as control_ta
from table1 as a
left join table2 as b
on a.sic=b.sic
and (a.roe-b.roe)/a.roe between -0.2 and 0.2
and (a.ta - b.ta)/a.ta between -.02 and 0.2; /*0.2 is 20%*/
quit;
Khadem
Calcite | Level 5

 

data sample;
input company$ sic roe ta;
datalines;
AA 100 5.5 100
BB 100 5.6 200
CC 200 6.5 300
DD 400 5.8 300
EE 400 8.9 500
;
run;
proc print data=sample;
run;

data control;
input company$ sic roe ta;
datalines;
XX 100 5.4 100
YY 100 5.9 200
DE 100 5.6 200
RR 200 6.4 300
RE 200 7.5 400
RT 300 6.5 450
TT 400 5.9 300
TR 400 9.8 500
;
run;
proc print data=Control;
run;

proc sql;
create table want as
select a.*, b.company as control, b.roe as control_roe, b.ta as control_ta
from sample as a
left join control as b
on a.sic=b.sic
and (a.roe-b.roe)/a.roe between -0.5 and 0.7
and (a.ta - b.ta)/a.ta between -.02 and 0.2; /*0.2 is 20%*/
quit;

proc print data=want;
run;

I have run the above code and found the following: 

The SAS System 


Obs company sic roe ta control control_roe control_ta 
1    AA     100 5.5 100  XX    5.4       100 
2    BB     100 5.6 200  YY    5.9       200 
3    CC     200 6.5 300  RR    6.4       300 
4    DD     400 5.8 300  TT    5.9       300 
5    EE     400 8.9 500  TR    9.8       500 

But I want, for example, for company BB, match companies will be YY and DE. That codes give me only YY for BB. How can I have both YY and DE for BB. 

Thank you for your efforts.

Reeza
Super User

Must not be within the tolerance levels you set. What is the percent diff for your matching variables?

 

Edit: Yes, it's outside the tolerance levels set. Modify to your requirements. 

Khadem
Calcite | Level 5

Should I ignore the tolerance level such as 50% or 20%?

Reeza
Super User

@Khadem wrote:

Should I ignore the tolerance level such as 50% or 20%?


It's your specifications, not mine. I assume there's some rationale to your boundaries. 

The company in Q was -5%, the interval used was -2 to 20%. 

Khadem
Calcite | Level 5

Thank you. Let me try with some modifications. I will let you know. 

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 12 replies
  • 2723 views
  • 0 likes
  • 3 in conversation