Hi,
I am currently using SAS Data step.
Data test1; /* loading approx 17 lakhs records */
Set source (keep=var1 var2 var3); /* source data set has 322596778 obs */
where DATEPART(START_from_dtm)="31JAN2020"D <=datepart(START_to_dtm);
run;
The above data step is running for 25 minutes. Is there any solution to execute the query faster.
START_from_dtm has index on it. So tried by keeping only START_from_dtm column but same time.
I am using the same query multiple times and the execution is increasing.
Thanks in advance.
Do your datetimes have a time, or is the time just 00:00:00?
Is your source table indexed or sorted?
How many observations do you have, and how many different dates?
effective_from_dtm has values like DDMONYYYY:00:00:00.000.
As I have observed most of the values has 00 as time.
Index, I have mentioned in my original post, effective_from_dtm has index and no other indexes.
I can't create Index on that table as it is owned by other teams.
That data set is not sorted.
Thanks
This might be a little more efficient:
where START_from_dtm = "31JAN2020:00:00:00"DT <= START_to_dtm
@SASKiwi 's syntax will make all he difference in the world:
- No function to slow down the filtering
- The index will be used
Edit: No index. I confused the variable names, sorry.
Still, using no function will make a significant difference.
Hi Saskiwi,
I have used that one too. but it hasn't changed much in the performance time.
Thanks
> I have used that one too. but it hasn't changed much in the performance time.
Well you need to help your self here. We still haven't seen the actual code (the code posted can't work), or the log or know the row length.
Review the questions asked and answer them.
@Learn_SAS1 - As already suggested by others please post your complete SAS log including the notes about run times.
@Learn_SAS1 wrote:
effective_from_dtm has values like DDMONYYYY:00:00:00.000.
As I have observed most of the values has 00 as time.
Index, I have mentioned in my original post, effective_from_dtm has index and no other indexes.
I can't create Index on that table as it is owned by other teams.
That data set is not sorted.
Thanks
You've got a source dataset with 322,596,778 records, and need to filter on start_from_dtm <= some_static_date <= start_to_dtm. Typical SCD2 type filtering.
Yet your source dataset doesn't have an index on that column.
Have you have explained the performance issue to your "other teams", requested an index be created on that column, and they refused?
That's going to be your best approach to improving the performance of this query. Google how indexes work if you need further details.
Do you have the disk space to (as a proof-of-concept) create a copy of that dataset, create your own index on start_from_dtm start_to_dtm, then test the performance results?
If you do, you may also want to test performance if the data is stored using the SPDE engine.
@Learn_SAS1 wrote:
Hi,
I am currently using SAS Data step.
Data test1; /* loading approx 17 lakhs records */
Set source (keep=var1 var2 var3); /* source data set has 322.596.778 obs */
where DATEPART(START_from_dtm)="31JAN2020"D <=datepart(START_to_dtm);
run;
The above data step is running for 25 minutes. Is there any solution to execute the query faster.
START_from_dtm has index on it. So tried by keeping only START_from_dtm column but same time.
I am using the same query multiple times and the execution is increasing.
Thanks in advance.
In your original-code the variable used in the where-statement are also in the keep-list, right?
Reading a dataset with 332 million obs needs some time. My before-coffee-calculation showed, that about 200000 obs are processed per second, that is not that bad at all.
Please post the complete log from that step, so we can see the time summary; use the </> button to post the log.
Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.