Hello. I am trying to optimize by SAS code below. The code highlighted in red is causing the query to run (> 1 hr). Before adding the "OR" statement, the query tool about a few minutes to run. Please advise the best way to re-write this query to decrease the run time. Thank you.
proc sql;
create table meta_new as
select distinct m.*, scan(m.ChartName, -2, '._') as ProvID, left(m.ChartName,3) as Vendor_ChartName
from meta_data m
order by m.ChartName;
create table merged_data_hicn as
select distinct m.ChartName
from meta_new m, CDF c
where m.MemberID = c.DBHICN
and (m.vendorid = 20) or (m.vendorid = 9 and Vendor_ChartName = 'ABC')
and m.Provid = c.ProviderIDtxt
order by m.ChartName;
quit;
So its just the second query that needs rewriting, correct? How big are the two input data sets?
Yes I need some help with the 2nd query. The first dataset meta_new is about 2 million rows. The 2nd dataset CDF is about 362,000 rows. Thank you.
Are you getting the right result set from your second query? You may need to add extra brackets to get your OR condition working correctly:
and ((m.vendorid = 20) or (m.vendorid = 9 and Vendor_ChartName = 'ABC'))
Thank you very much for your response. Your answer did work and I will mark it as a solution. Thank you everyone for your help.
It's difficult to rewrite your code without knowing your data. Can you provide a sample of your data to work with?
I have attached the sample cdf file for your review. Thank you.
Here is the sample meta_new for you to review
SAS does not use indexes when a OR operator is present (unless the table is accessed through the SPDE engine).
Is your table indexed?
You could try 2 queries linked by the UNION operator instead.
And you could turn on
option msglevel=i;
to see if indexes are used.
For example:
data CLASS(index=(AGE));
set SASHELP.CLASS;
run;
option msglevel=i;
proc sql;
create table T as
select * from CLASS
where AGE in(11,12);
create table T as
select * from CLASS
where AGE=12 or (AGE=11 & NAME='Alice');
create table T as
select * from CLASS where AGE=12
union
select * from CLASS where AGE=11 & NAME='Alice';
quit;
32 create table T as
33 select * from CLASS
34 where AGE in(11,12);
INFO: Index Age selected for WHERE clause optimization.
NOTE: Compressing data set WORK.T increased size by 33.33 percent.
Compressed is 4 pages; un-compressed would require 3 pages.
35
36 create table T as
37 select * from CLASS
38 where AGE=12 or (AGE=11 & NAME='Alice');
NOTE: Compressing data set WORK.T increased size by 50.00 percent.
Compressed is 3 pages; un-compressed would require 2 pages.
39
40 create table T as
41 select * from CLASS where AGE=12
42 union
43 select * from CLASS where AGE=11 & NAME='Alice';
INFO: Index Age selected for WHERE clause optimization.
INFO: Index Age selected for WHERE clause optimization.
NOTE: Compressing data set WORK.T increased size by 50.00 percent.
Compressed is 3 pages; un-compressed would require 2 pages.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.