Hi all SAS Users,
I want todo left join the files industry_mtbv, industry_return, herfindahl_final ( these three file have same order, same two columns INDC3 and YEAR ) to the file winsorize.
My novice code is as below:
/*******Match winso and industry_return****************/
libname winso 'C:\Users\pnguyen\Desktop\winso' access=readonly;
proc copy inlib=winso outlib=work mt=data;
run;
data _wins1 / view=_wins1;
set work.winsorize;
_seqno1=_n_;
run;
proc sql;
create table matching1(drop=_seqno1) as
select a.*, industry_ret
from _wins1 a left join work.industry_return b
on a.indc3 = b.indc3 and a.year=b.year
order by _seqno1;
drop view _wins1;
quit;
/******Match matching1 and industry_mtbv************/
data _wins2 / view=_wins2;
set work.matching1;
_seqno2=_n_;
run;
proc sql;
create table matching2(drop=_seqno2) as
select a.*, median_mtbv
from _wins2 a left join work.industry_mtbv b
on a.indc3 = b.indc3 and a.year=b.year
order by _seqno2;
drop view _wins2;
quit;
/******Match matching2 and herfindahl_final*******/
data _wins3 / view=_wins3;
set work.matching2;
_seqno3=_n_;
run;
proc sql;
create table matching3(drop=_seqno3) as
select a.*, Herfindahl
from _wins3 a left join work.herfindahl_final b
on a.indc3 = b.indc3 and a.year=b.year
order by _seqno3;
drop view _wins3;
quit;
I know it is long so is there any way to shorten this code?
Warmest regards.
Hi @Phil_NZ,
If INDC3 Year is a unique key in each of the datasets industry_return, industry_mtbv and herfindahl_final, it should be no problem to combine all three left joins in a single PROC SQL step:
data _wins / view=_wins;
set winsorize;
_seqno=_n_;
run;
proc sql;
create table matching(drop=_seqno) as
select a.*, industry_ret, median_mtbv, Herfindahl
from _wins a
left join industry_return b
on a.indc3=b.indc3 and a.year=b.year
left join industry_mtbv c
on a.indc3=c.indc3 and a.year=c.year
left join herfindahl_final d
on a.indc3=d.indc3 and a.year=d.year
order by _seqno;
drop view _wins;
quit;
If you want to keep the order of a dataset while adding information from another (doing a "lookup"), the hash object is the tool of choice.
Hi @Phil_NZ,
If INDC3 Year is a unique key in each of the datasets industry_return, industry_mtbv and herfindahl_final, it should be no problem to combine all three left joins in a single PROC SQL step:
data _wins / view=_wins;
set winsorize;
_seqno=_n_;
run;
proc sql;
create table matching(drop=_seqno) as
select a.*, industry_ret, median_mtbv, Herfindahl
from _wins a
left join industry_return b
on a.indc3=b.indc3 and a.year=b.year
left join industry_mtbv c
on a.indc3=c.indc3 and a.year=c.year
left join herfindahl_final d
on a.indc3=d.indc3 and a.year=d.year
order by _seqno;
drop view _wins;
quit;
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.