Hi all SAS Users,
I want todo left join the files industry_mtbv, industry_return, herfindahl_final ( these three file have same order, same two columns INDC3 and YEAR ) to the file winsorize.
My novice code is as below:
/*******Match winso and industry_return****************/
libname winso 'C:\Users\pnguyen\Desktop\winso' access=readonly;
proc copy inlib=winso outlib=work mt=data;
run;
data _wins1 / view=_wins1;
set work.winsorize;
_seqno1=_n_;
run;
proc sql;
create table matching1(drop=_seqno1) as
select a.*, industry_ret
from _wins1 a left join work.industry_return b
on a.indc3 = b.indc3 and a.year=b.year
order by _seqno1;
drop view _wins1;
quit;
/******Match matching1 and industry_mtbv************/
data _wins2 / view=_wins2;
set work.matching1;
_seqno2=_n_;
run;
proc sql;
create table matching2(drop=_seqno2) as
select a.*, median_mtbv
from _wins2 a left join work.industry_mtbv b
on a.indc3 = b.indc3 and a.year=b.year
order by _seqno2;
drop view _wins2;
quit;
/******Match matching2 and herfindahl_final*******/
data _wins3 / view=_wins3;
set work.matching2;
_seqno3=_n_;
run;
proc sql;
create table matching3(drop=_seqno3) as
select a.*, Herfindahl
from _wins3 a left join work.herfindahl_final b
on a.indc3 = b.indc3 and a.year=b.year
order by _seqno3;
drop view _wins3;
quit;
I know it is long so is there any way to shorten this code?
Warmest regards.
Hi @Phil_NZ,
If INDC3 Year is a unique key in each of the datasets industry_return, industry_mtbv and herfindahl_final, it should be no problem to combine all three left joins in a single PROC SQL step:
data _wins / view=_wins;
set winsorize;
_seqno=_n_;
run;
proc sql;
create table matching(drop=_seqno) as
select a.*, industry_ret, median_mtbv, Herfindahl
from _wins a
left join industry_return b
on a.indc3=b.indc3 and a.year=b.year
left join industry_mtbv c
on a.indc3=c.indc3 and a.year=c.year
left join herfindahl_final d
on a.indc3=d.indc3 and a.year=d.year
order by _seqno;
drop view _wins;
quit;
If you want to keep the order of a dataset while adding information from another (doing a "lookup"), the hash object is the tool of choice.
Hi @Phil_NZ,
If INDC3 Year is a unique key in each of the datasets industry_return, industry_mtbv and herfindahl_final, it should be no problem to combine all three left joins in a single PROC SQL step:
data _wins / view=_wins;
set winsorize;
_seqno=_n_;
run;
proc sql;
create table matching(drop=_seqno) as
select a.*, industry_ret, median_mtbv, Herfindahl
from _wins a
left join industry_return b
on a.indc3=b.indc3 and a.year=b.year
left join industry_mtbv c
on a.indc3=c.indc3 and a.year=c.year
left join herfindahl_final d
on a.indc3=d.indc3 and a.year=d.year
order by _seqno;
drop view _wins;
quit;
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.