
10 Ways to Make Your SAS® Code Run More Efficiently Q&A, Slides, and On-Demand Recording

Started 09-25-2020
Modified 09-25-2020

Did you miss the Ask the Expert session on 10 ways to make your SAS code run more efficiently? Not to worry, you can catch it on-demand at your leisure. 

 

Watch the webinar

 

Learn how SAS language concepts and processing work, and how to use your compute resources, so that you can make your programs more efficient. In many cases, this translates to running faster. I’ll share 10 easy-to-understand, effective tips. During this webinar, you will learn:

  • Methods for running SAS more efficiently.
  • How to control what resources are used.
  • How SAS works and how to better use resources.

Here are the questions from the Q&A segment held at the end of the webinar. The slides from the webinar are attached.

 

Is there a debugger that shows variable values step by step when I run SAS code? My SAS EG version is 7.15 HF7.

Yes, you do! In Enterprise Guide, with a code tab open, select the lime green bug icon on the right-hand side of the context-sensitive toolbar just above the code. Then select the bug that appears in the margin just to the left of the code. This starts the DATA step debugger, which lets you step through the code one line at a time.

 

How does IF THEN ELSE compare to using CASE WHEN?

I don’t know, and a quick search did not turn up anything definitive. However, CASE WHEN ... THEN expressions are standard SQL, so when the table lives in a database, one could assume that PROC SQL would attempt to pass a CASE expression through to the database. That could mean faster processing times, but I did not find much on this.
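For readers who want to compare the two constructs themselves, here is a minimal sketch that is not from the webinar. It assumes the SASHELP.CLASS sample table is available; the output table names and the `age_group` variable are hypothetical. Both steps should produce the same grouping, so they can be timed against each other on your own data.

```sas
/* DATA step: IF-THEN/ELSE assigns one group per observation */
data work.class_if;
  set sashelp.class;
  length age_group $8;
  if age < 13 then age_group = 'Child';
  else if age < 16 then age_group = 'Teen';
  else age_group = 'Older';
run;

/* PROC SQL: the equivalent CASE expression */
proc sql;
  create table work.class_case as
  select *,
         case
           when age < 13 then 'Child'
           when age < 16 then 'Teen'
           else 'Older'
         end as age_group length=8
  from sashelp.class;
quit;
```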

 

Is there a difference between the keep/drop on the set statement or as an individual statement? (I almost always put mine at the bottom.)

Yes, there is a difference. When you specify the DROP= or KEEP= option in the SET statement, SAS does not read the excluded variables into the program data vector. A DROP or KEEP statement, by contrast, applies only to the output data set: every variable is still read into the program data vector. If you work with a large data set (perhaps one containing thousands or millions of observations), then you can construct a more efficient DATA step by not reading unneeded variables from the input data set.
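The two placements can be sketched as follows. This is an illustration only, assuming the SASHELP.CLASS sample table; the output table names are hypothetical.

```sas
/* KEEP= on the SET statement: only NAME and AGE are read into the PDV */
data work.subset_option;
  set sashelp.class(keep=name age);
run;

/* KEEP statement: all variables are read into the PDV,
   but only NAME and AGE are written to the output data set */
data work.subset_stmt;
  set sashelp.class;
  keep name age;
run;
```

Both steps write the same two variables. The difference is in how much is read, which matters most on wide, large input data sets.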

 

If you have more variables to keep and fewer to drop, then why not use DROP=?

I would use DROP= as well. The only reason to use KEEP= in that case is to have a record of what is kept, since the list is part of the code.

 

Isn’t WHERE more dangerous? It executes on the set data instead of the actual data step.

Not sure what you mean by dangerous 😊. My point was that, in the DATA step or in procedures, using the WHERE statement might improve the efficiency of your SAS programs because SAS is not required to read all observations from the input data set.
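A small sketch of the difference, assuming the SASHELP.CLASS sample table (the output table names are hypothetical). WHERE filters observations before they are read into the program data vector, while a subsetting IF filters them after they have been read.

```sas
/* WHERE: observations are filtered before they enter the PDV */
data work.teens_where;
  set sashelp.class;
  where age >= 13;
run;

/* Subsetting IF: every observation is read, then tested */
data work.teens_if;
  set sashelp.class;
  if age >= 13;
run;
```

One thing to keep in mind: WHERE can reference only variables that exist in the input data set, whereas a subsetting IF can also test variables computed earlier in the same DATA step.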

 

Do character variables usually have 8 bytes, while numeric variables have 4 bytes?

SAS uses floating-point representation and, by default, stores numeric values using 8 bytes. A character variable takes one byte per single-byte character, so the 4-character value in my example can need only 4 bytes. The number of characters that fit in a fixed number of bytes depends on the encoding your SAS session is using. With a single-byte encoding, like WLATIN1, each character takes one byte. With a double-byte character set (common with Japanese or Chinese), a character can need two bytes. With UTF-8 encoding (also called Unicode support), a character can use between 1 and 4 bytes, depending on the character.
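To see the storage lengths for yourself, a minimal sketch like the following can help. The table and variable names are hypothetical, and the values are made up for illustration. The LENGTH statement sets the bytes stored for each variable, and PROC CONTENTS reports them in its Len column.

```sas
data work.lengths;
  /* 4-byte character, 4-byte numeric, default 8-byte numeric */
  length code $4 count 4 amount 8;
  code   = 'ABCD';
  count  = 100;
  amount = 1234.56;
run;

/* The Len column shows the bytes stored for each variable */
proc contents data=work.lengths;
run;
```

Shortening a numeric variable below 8 bytes saves space but reduces precision, so it is generally safe only for integers of modest size.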

 

 

What is the difference between "proc dataset" and "data"?

PROC DATASETS is a utility procedure that manages your SAS files. With PROC DATASETS, you can copy SAS files from one SAS library to another, rename SAS files, repair SAS files, delete SAS files, list the SAS files contained in a SAS library, list the attributes of a SAS data set, and more. The DATA statement, by contrast, begins a DATA step and provides names for any output such as SAS data sets, views, or programs.
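As a rough illustration of the split, assuming the SASHELP.CLASS sample table and hypothetical table names: the DATA step creates a data set by processing observations, and PROC DATASETS then manages that file without reading its rows.

```sas
/* DATA step: create a data set by reading observations */
data work.class_copy;
  set sashelp.class;
run;

/* PROC DATASETS: manage the file itself */
proc datasets library=work nolist;
  change class_copy = class_backup;   /* rename the data set */
  contents data=class_backup;         /* list its attributes */
run;
quit;
```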

 

 

Thoughts on using the IF THEN DO to perform actions on subsets of obs versus using a more complex IF?

I found this neat summary on Stack Overflow: SAS evaluates the expression in an IF-THEN statement to produce a result that is either non-zero, zero, or missing. A non-zero and nonmissing result causes the expression to be true; a result of zero or missing causes the expression to be false.

If the conditions that are specified in the IF clause are met, the IF-THEN statement executes a SAS statement for observations that are read from a SAS data set, for records in an external file, or for computed values. An optional ELSE statement gives an alternative action if the THEN clause is not executed. The ELSE statement, if used, must immediately follow the IF-THEN statement.

Using IF-THEN statements without the ELSE statement causes SAS to evaluate all IF-THEN statements. Using IF-THEN statements with the ELSE statement causes SAS to execute IF-THEN statements until it encounters the first true statement. Subsequent IF-THEN statements are not evaluated. (Source: support.sas.com)

The DO statement is the simplest form of DO group processing. The statements between the DO and END statements are called a DO group. You can nest DO statements within DO groups.

A simple DO statement is often used within IF-THEN/ELSE statements to designate a group of statements to be executed depending on whether the IF condition is true or false.

Put simply: IF-THEN is for one statement, and IF-THEN DO is for a block of statements.
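A short sketch of both forms, assuming the SASHELP.CLASS sample table. The output table and the new variables (`size`, `group`, `note`) are hypothetical.

```sas
data work.class_flags;
  set sashelp.class;
  length size $5 note $12;

  /* IF-THEN/ELSE: one statement per branch */
  if height >= 65 then size = 'Tall';
  else size = 'Short';

  /* IF-THEN DO: several statements run under one condition */
  if sex = 'F' then do;
    group = 1;
    note  = 'Female group';
  end;
  else do;
    group = 2;
    note  = 'Male group';
  end;
run;
```

Because each IF is paired with an ELSE, SAS stops evaluating a chain as soon as one condition is true.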

 

Is SELECT best coded with the conditions in the order of the frequency of the data (as an IF is)?

I did not find a definitive answer for this. I did find this nugget from this paper by Kirk Paul Lafler on PROC SQL: “When constructing a chain of AND-ed conditions in a WHERE clause, specify the most restrictive values first. By constructing AND-ed conditions in this way, CPU resources will be reduced.”

 

How do I know when I am using CEDA?  Is there a log message?  Do I have to code something?

Yes, there is a log message. You will see a note to the effect of “Data file XXX.XXX.XXX is in a format...” You do not need to code anything to make this happen; it is automatic.

 

I came from 15 years of using SQL to now using SAS, so I naturally lean towards using PROC SQL statements a lot. Is using DATA steps more efficient than PROC SQL?

Great question, and one on which people have strong opinions on either side. I found a great blog post that is a starting point, but if you search for this, there are lots of papers: https://blogs.sas.com/content/sastraining/2010/12/17/five-reasons-to-use-the-sas-data-step-or-proc-s....

 

When benchmarking the SAS program's performance, is there a way to save the performance results to a new SAS dataset?

It can be done by using PROC PRINTTO to redirect the log output to a file, plus some additional coding to read that file into a data set. This paper is interesting although a bit old.
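One possible shape for this, offered as a sketch under stated assumptions rather than the method from the paper: the fileref `perflog` and the tables `work.test` and `work.timings` are hypothetical, and SASHELP.CLASS stands in for your real input. The log is redirected to a temporary file, the step being benchmarked runs, the log is restored, and the timing lines are read back into a data set.

```sas
options fullstimer;            /* request detailed timing notes */
filename perflog temp;         /* temporary file to hold the log */

proc printto log=perflog new;  /* redirect the log */
run;

/* The step being benchmarked */
data work.test;
  set sashelp.class;
run;

proc printto;                  /* restore the default log */
run;

/* Keep only the log lines that report timings */
data work.timings;
  infile perflog truncover;
  input line $char200.;
  if index(line, 'real time') or index(line, 'cpu time');
run;
```

From here, the text in `line` can be parsed further, for example to extract the seconds as a numeric variable.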

 

When you are writing a large dataset to a directory to save it for later - how can one make it faster? Can you use multithread?

Multithreading applies mainly to the processing done by certain procedures and languages, including DS2 and FedSQL, which I mentioned in the webinar. I suggest looking at the documentation to see examples of multithreaded output. The DATA step is single threaded, and PROC SQL largely is as well. If you have saved a file to a permanent library location, you can possibly index the data set, as I discussed in the webinar. Of course, there are caveats to that as well.

 

Will there be any talk on Management Console, how to use it efficiently, etc.?

Not during today’s session. I suggest the documentation as a good starting point.

 

Love to see more on PROC Summary, etc. to enhance efficiency?

I don’t know of any Ask the Expert sessions on this. A quick search finds a lot of links, including a great paper by Art Carpenter. I do know that PROC MEANS, when first developed, was all about quickly and easily creating a table of summary statistics sent to a destination such as LISTING, whereas PROC SUMMARY was about creating an output data set. They have merged somewhat, in that the underpinnings of both are now the same.

 

Can you briefly talk about using DOW loop and hash tables to improve the efficiency of the SAS code?

I’m sorry but this is not an area I have expertise in. You are correct in assuming that hash tables can increase performance as the tables are loaded into memory.
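For readers curious about what a hash table lookup looks like, here is a minimal sketch that is not from the webinar. It assumes the SASHELP.CLASS sample table; the tables `work.lookup` and `work.matched` and the hash object name `h` are hypothetical. The lookup table is loaded into memory once, and each observation from the main table is then matched against it by key.

```sas
/* A small lookup table: one AGE per NAME */
data work.lookup;
  set sashelp.class(keep=name age);
run;

data work.matched;
  if _n_ = 1 then do;
    if 0 then set work.lookup;               /* define NAME and AGE in the PDV */
    declare hash h(dataset: 'work.lookup');  /* load the lookup into memory */
    h.defineKey('name');
    h.defineData('age');
    h.defineDone();
  end;
  set sashelp.class(keep=name height);
  if h.find() = 0;    /* keep observations whose key is found; AGE is filled in */
run;
```

Whether this beats a sort and merge depends on the size of the lookup table and the memory available, so it is worth timing on your own data.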

 

What is the best way to track changes in SAS?

That is a question that can have many answers. One is using Git. We have added Git functionality to Base SAS via functions, and to the later versions of Enterprise Guide and SAS Studio. Do a quick search on support.sas.com to see what we have implemented.

 

Is it ok to share the demo programs?

Are you asking about my programs that I showed? If so, they will be part of the PDF.

 

Is there a performance difference between the DATA step and PROC SQL? Are there specific cases where one performs better than the other?

Please see my answer above re PROC SQL.

 

When leveraging index function, can it be used while creating a large data set or should it be used on a large data set after it's been created?

Both the PROC SQL and PROC DATASETS methods require that the data set already exist. You can, however, create a data set and its index at the same time using the DATA step. I ran a couple of tests, and creating the index at data set creation does not seem to add a whole lot of time to the DATA step. It does save the step of running either PROC SQL or PROC DATASETS after the fact.
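The two approaches can be sketched like this, assuming the SASHELP.CLASS sample table and hypothetical output table names. The first step builds the index while the data set is written; the second adds an index to a data set that already exists.

```sas
/* Create the data set and a simple index on NAME in one step */
data work.class_indexed(index=(name));
  set sashelp.class;
run;

/* Alternative: create the data set first, then add the index */
data work.class_later;
  set sashelp.class;
run;

proc datasets library=work nolist;
  modify class_later;
  index create name;
quit;
```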

 

Does it help using PROC DATASETS to delete large files no longer being used?

PROC DATASETS can delete a file of any size and will automatically delete any indexes you created on it at the same time. You can also delete different versions (generations) of the data set with this procedure. The documentation is very comprehensive in describing the options.
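A minimal sketch of a delete, assuming the SASHELP.CLASS sample table; the scratch table names are hypothetical.

```sas
/* Create two scratch data sets to clean up */
data work.scratch1 work.scratch2;
  set sashelp.class;
run;

/* Delete them, along with any indexes they have */
proc datasets library=work nolist;
  delete scratch1 scratch2;
quit;
```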

 

How do we access SAS Analytics Explorers?

You can access SAS Analytics Explorers with this link.

 

 

Recommended Resources

Top Ten SAS® Performance Tuning Techniques 2012 Global Forum Paper

Top 10 SAS programming efficiencies

Leave Your Bad Code Behind: 50 Ways to Make Your SAS Code Execute More Efficiently

 

Want more tips? Be sure to subscribe to the Ask the Expert board to receive follow up Q&A, slides and recordings from other SAS Ask the Expert webinars.  
