hi,
Would anyone have a tip/suggestion on whether the following query could be optimized by nesting the queries together?
The below code gets the jobs done, but am looking to learn nested joins if that would be more efficient. The below essenstially runs IDs from table A against data in table data_table 4 times, each time seaching for the presence of a different status code. It is a left join as need to retain all IDs. If u have a suggestion, pls include explanation. Also, if there is any rule/best practices to follow for nesting joins, let me know,
thanks in advance
proc sql; create table b as select a.*, b.stat_cd|| b.stat_rsn_cd AS CL_STATUS, b.eff_from_dt AS CL_FRM_DT From a as a left join data_table as b on a.id=b.id and b.stat_cd='CL' ; quit; proc sql; create table C as select a.*, b.stat_cd|| b.stat_rsn_cd AS CO_STATUS, b.eff_from_dt AS CO_FRM_DT From B as a left join data_table as b on a.id=b.id and b.stat_cd='CO' ; quit; proc sql; create table D as select a.*, b.stat_cd|| b.stat_rsn_cd AS CR_STATUS, b.eff_from_dt AS CR_FRM_DT From C as a left join data_table as b on a.id=b.id and b.stat_cd='CR' ; quit; proc sql; create table END as select a.*, b.stat_cd|| b.stat_rsn_cd AS FR_STATUS, b.eff_from_dt AS FR_FRM_DT From D as a left join data_table as b on a.id=b.id and b.stat_cd='FR' ; quit;
Just use the same code but give each new "right" table aa distinct alias.
proc sql;
create table WANT as
select a.*
, b.stat_cd|| b.stat_rsn_cd AS CL_STATUS
, b.eff_from_dt AS CL_FRM_DT
, c.stat_cd|| c.stat_rsn_cd AS CO_STATUS
, c.eff_from_dt AS CO_FRM_DT
...
from a as a
left join data_table as b
on a.id=b.id and b.stat_cd='CL'
left join data_table as c
on a.id=c.id and c.stat_cd='CO'
...
;
quit;
Yes, without some data its hard to provide anything. What I would say however is that you appear to be doing exactly the same code four times, this does not seem optimal. Why do you need to split it up into four as all it looks like your doing is merging each set of variables back onto itself. This is where the art of transposing data would come in. With your data you should be able to transpose up the concatenated status variable and the result variable and then merge the two together, this would have the number of joins.
You could use sub-queries like this:
proc sql; create table WANT as select A.*, (select cats(STAT_CD,STAT_RSN_CD) from DATA_TABLE where A.ID=ID and STAT_CD='CL') as CL_STATUS, (select cats(STAT_CD,STAT_RSN_CD) from DATA_TABLE where A.ID=ID and STAT_CD="CO") as CO_STATUS, ... from A as A; quit;
It may read a bit simpler, but not sure its going to be more effective than two transposes and a merge. If you post test data in the form of a datastep we could look at some code tomorrow.
To: RW9
hi- thanks so much for the tip... i'll have to look into your suggestion of transposing up the concatenated status variable! I learn a lot from you, thank you
Just use the same code but give each new "right" table aa distinct alias.
proc sql;
create table WANT as
select a.*
, b.stat_cd|| b.stat_rsn_cd AS CL_STATUS
, b.eff_from_dt AS CL_FRM_DT
, c.stat_cd|| c.stat_rsn_cd AS CO_STATUS
, c.eff_from_dt AS CO_FRM_DT
...
from a as a
left join data_table as b
on a.id=b.id and b.stat_cd='CL'
left join data_table as c
on a.id=c.id and c.stat_cd='CO'
...
;
quit;
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.