I would be strongly tempted to replace code like
left join hmaodm.actv_fact af on cdf.case_id = af.case_id and af.cse_src_sys_cd in ('CPM')
with something like
left join (select * from hmaodm.actv_fact where cse_src_sys_cd in ('CPM')) af on cdf.case_id = af.case_id
to reduce the number of records brought into the join. You have opportunities for this at many of your joins.
Doesn't the SQL parser subset the right table when it sees
and af.cse_src_sys_cd in ('CPM')
?
Both SQL steps take 4.8s on my machine.
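/* Two identical tables of 10 million rows, flagged as sorted by I */
data A(sortedby=I) B(sortedby=I);
  do I=1 to 1e7;
    output;
  end;
run;
/* Version 1: filter on B in the ON clause; _method writes the chosen join plan to the log */
proc sql _method;
  create table T as
  select a.I, b.I as J
  from A left join B on a.I=b.I and b.I=1e7;
quit;
/* Version 2: subset B with a WHERE= data set option before the join */
proc sql _method;
  create table T as
  select a.I, b.I as J
  from A left join B(where=(I=1e7)) on a.I=b.I;
quit;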
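As a side check, the two forms discussed in this thread should also return the same rows, because in a left join a condition on the right-hand table in the ON clause only limits which right-hand rows can match. Here is a minimal sketch of that comparison on toy data; the table names cases and acts and the columns case_id and src are made up for illustration and are not the original poster's tables.

```sas
/* Hypothetical toy tables, for illustration only */
data cases;
  do case_id = 1 to 3;
    output;
  end;
run;

data acts;
  length src $3;
  case_id = 1; src = 'CPM'; output;
  case_id = 2; src = 'XYZ'; output;
run;

proc sql;
  /* Form 1: filter on the right-hand table inside the ON clause */
  create table on_filter as
  select c.case_id, a.src
  from cases c left join acts a
    on c.case_id = a.case_id and a.src in ('CPM')
  order by c.case_id;

  /* Form 2: subset the right-hand table in an inline view first */
  create table sub_filter as
  select c.case_id, a.src
  from cases c left join (select * from acts where src in ('CPM')) a
    on c.case_id = a.case_id
  order by c.case_id;
quit;

/* Report any differences between the two result tables */
proc compare base=on_filter compare=sub_filter;
run;
```

Both tables should keep all three cases, with src filled in only for case_id 1. This says nothing about speed; it only checks that the rewrite does not change the result.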
@ChrisNZ wrote:
Both SQL steps take 4.8s on my machine.
Cached data sets?
I haven't tested the suggestion in a while, but when I had data spread across network drives, subsetting it before the join had a positive impact in my environment.
Interesting. Something to keep one's eyes on then.
It'd be disappointing if the SQL optimiser did not subset the table. That's such an obvious step.