BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
SASdevAnneMarie
Barite | Level 11

Hello Experts,

 

My data is very huge, it takes 5 hours to run the code. Do you know some tricks to avoid this please ?

My code is :

 

proc sql;
	create table event2020 as
		select *
			from pkt.event
				where year(datepart(d_compta)) >= 2019
					and year(datepart(d_compta)) <= 2020;
quit;

Thank you very muche for your help !

 

 

1 ACCEPTED SOLUTION

Accepted Solutions
Patrick
Opal | Level 21

Bugger! It uses the Meta engine so you would need to use SAS SMC and look in SAS metadata how the library is actually defined and where it points to.

 

You could execute below. That shouldn't take too much time. The log will then tell you if the actual table is in a database. If it's in a database you'll see in the log some trials where SAS tells you which SQL syntax it sent to the database to test if it can execute it, and then also a message telling you which SQL has been chosen to be sent in the end.

...and if it's a SAS table then you won't have any of these "strange" messages in your SAS log.

options sastrace=',,,d' sastraceloc=saslog nostsuffix;

proc sql outobs=100;
	create table test_2 as
		select d_compta
			from pkt.event
				where '01jan2019 00:00:00'dt <= d_compta < '01jan2020 00:00:00'dt
				;
quit;

 

View solution in original post

20 REPLIES 20
LinusH
Tourmaline | Level 20
  • Have d_compta indexed
  • Use dt intervals instead of function call in the where cluase
Data never sleeps
LinusH
Tourmaline | Level 20

Furthermore, it might not be the filtering that is causing execution time. Performance is about the whole chain of data on disk, movement to memory and CPU ussage.

 

Is your data in a Base SAS libname? What kind of storage?

Add these options to better understand your bottlenecks:

options fullstimer msglevel=i;
Data never sleeps
SASdevAnneMarie
Barite | Level 11
Thank you, LinushH!
What do you mean by " d_compta indexed" and "Use dt intervals" please ?
Patrick
Opal | Level 21

First thing that jumps to my mind: Is your source table a SAS table or in a data base? Where does libref pkt point to?

 

If it's a database then most likely the where condition doesn't get pushed to the database (because of the use of the datepart() function) but all the data gets first transferred over the network to SAS and then only filtered on the SAS side. 

 

You could run below two queries to test the theory.

The 2nd query will push the where condition to the data base for execution (if your source table is in a database) so if the 2nd query is much faster then the required cause of action is clear.

IF the source table is in a database then the options I've added will show you in the SAS log which portion of the query gets pushed to the db for execution.

options fullstimer msglevel=i;
options sastrace=',,,d' sastraceloc=saslog nostsuffix;

libname pkt list;

proc sql;
	create table test_1 as
		select d_compta
			from pkt.event
				where year(datepart(d_compta)) = 2019
				;
quit;

proc sql;
	create table test_2 as
		select d_compta
			from pkt.event
				where '01jan2019 00:00:00'dt <= d_compta < '01jan2020 00:00:00'dt
				;
quit;

 

Ksharp
Super User
Or try this one :

proc sql;
create table event2020 as
select *
from pkt.event
where year(datepart(d_compta)) in (2019 2020);
quit;



Or event try data step like :

options cpucount=8 threads compress=yes bufno=100 bufsize=128k ;
data event2020;
set pkt.event;
if year(datepart(d_compta)) in (2019 2020);
run;


P.S. I would expect data step is faster than SQL .
SASdevAnneMarie
Barite | Level 11

Thank you, Ksharp ! The  bufno=100 bufsize=128k is it maximum size please ?

Ksharp
Super User

No. Actually I steal it from @ChrisNZ   . 

From his test, it looks like "bufno=100 bufsize=128k " is the best configure for running speed .

 

But you could ask @ChrisNZ 

ChrisNZ
Tourmaline | Level 20

Thank you @Ksharp🙂

 

> From his test, it looks like "bufno=100 bufsize=128k " is the best configure for running speed. But you could ask @ChrisNZ 

 

This result only works for the test data used in the book, though it gives a good starting point. Each data is different and if performance is important on a recurring basis for a given query, the best values for options engine, compression, buffer size, buffer number and OS cache should be benchmarked.

Patrick
Opal | Level 21

@SASdevAnneMarie 

So you're 100% sure that your source table is already a SAS table? If you execute...

libname pkt list;

...does that show you a pathname as the location in the SAS log or something else?

SASdevAnneMarie
Barite | Level 11
Thank you for your message. I see this:
Libref= PKT
Scope= IOM ROOT COMP ENV
Engine= META
Access= READONLY
Physical Name= PKT
Schema/Owner= PEGASE
Patrick
Opal | Level 21

Bugger! It uses the Meta engine so you would need to use SAS SMC and look in SAS metadata how the library is actually defined and where it points to.

 

You could execute below. That shouldn't take too much time. The log will then tell you if the actual table is in a database. If it's in a database you'll see in the log some trials where SAS tells you which SQL syntax it sent to the database to test if it can execute it, and then also a message telling you which SQL has been chosen to be sent in the end.

...and if it's a SAS table then you won't have any of these "strange" messages in your SAS log.

options sastrace=',,,d' sastraceloc=saslog nostsuffix;

proc sql outobs=100;
	create table test_2 as
		select d_compta
			from pkt.event
				where '01jan2019 00:00:00'dt <= d_compta < '01jan2020 00:00:00'dt
				;
quit;

 

SASdevAnneMarie
Barite | Level 11
Thank you Patrick,
I don't have any "strange" message in my SAS log.
Ksharp
Super User
Patrick,
If this table (pkt.event) was under Data Base, why not using Pass-Through SQL to let DB do the work?
Patrick
Opal | Level 21

@Ksharp That's exactly where I was heading to. The query @SASdevAnneMarie just executed together with the options would a) have pushed the where clause to the db and b) the SAS log would have told us that this is happening. But from what Marie tells us now it really looks like a SAS table.

 

@SASdevAnneMarie Is this SAS 9.4 or SAS Viya? And how many rows and columns does your source SAS table have? Like: Can you share the result of a Proc Contents? 5+ hours for such a simple query that doesn't require any sorting feels excessive.

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 20 replies
  • 1797 views
  • 6 likes
  • 6 in conversation