BookmarkSubscribeRSS Feed
deleted_user
Not applicable
To me Proc sql seems slow when doing a query against a SAS table, compared to doing the processing in a data step.

The code will be more compact and easier to write in Proc sql, but the speed ...

What's your impression regarding this?
5 REPLIES 5
Doc_Duke
Rhodochrosite | Level 12
As they say, "your mileage may vary."

There are some things that are quicker and others slower. I'll take compact and easier to write anytime and boost the hardware... We use SQL on huge Medicare claims in SAS datasets and the performance is acceptable.
DanielSantos
Barite | Level 11
When dealing with small amount of data, I always use SQL.

Medium to average, I may still consider SQL for simple tasks, otherwise it's datastep/procs.

For large to huge amount of data, I'll stick to datastep/procs, unless I got on my hands some N to N match to perform.

Datastep gives you a lot of more control on how things will be performed.
With SQL, although you may influence the optimizer, it's kinda of a "black box".

But then again, it depends on your hardware architecture, on the amount of data, and on what you are trying to do with it.

Cheers from Portugal.

Daniel Santos @ www.cgd.pt
Santhosh_CHN
Calcite | Level 5

Same with me here. I have also seen data Step executing faster than PROC SQL. But, Data Step may produce more lines of code than PROC SQL.

jakarman
Barite | Level 11

There are major differences in the datastep approach vs SQL approach.

Touching all data the datastep should be faster as doing one ordered speed up IO stream on the background. But it is single threaded.

The single threaded approach is required as the data is processed ordered. You have lag/retain fucntionality aside first/last (missing in SQL).

Doing small part of work SQL can win as this will work better on isolated records.

It will execute Multi-threaded (more processes) and cause more randomness in IO causing wait-seek time.

If you have SAS dataset processed with SQL you can also specify indexes and constraints like every DBMS.

Also modfiy records in place instead of replacing datasets is possible.  

---->-- ja karman --<-----
Reeza
Super User

My personal rule(s):

If I'm working with the data on a row level then I use a data step, if I'm working with the columns, doing aggregates/counts I use SQL.

I generally do merges with SQL, unless I need to identify the incoming data set then I use a DATA STEP.

If data is on a server and there's the option, I'll use SQL Pass Through for bigger datasets (anything that takes longer than 30 secs) whenever possible.

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 5 replies
  • 947 views
  • 0 likes
  • 6 in conversation