Ronein
Meteorite | Level 14

Hello

I saw 2 ways to create a macro variable that holds the number of observations in a data set.

My question: what is the advantage of way 1 over way 2?

I see that both ways work, but way 2 is simpler code.

What does "if 0" mean?

 

/***Way1***/
data _null_;
if 0 then set sashelp.class nobs=n;
call symput('num_obs',n);
stop;
run;
%put &num_obs;

/***Way2***/
data _null_;
set sashelp.class nobs=n;
call symput('nr_obs',n);
stop;
run;
%put &nr_obs;
7 REPLIES
ballardw
Super User

Without the STOP statement, "Way 2" would pull every observation and repeatedly assign the same value to the macro variable. If the data set has millions of observations, that could take quite a while and waste a lot of CPU. With the STOP statement in place, it reads only the first observation. "Way 1" reads no observations at all, so it takes about the same amount of time regardless of the number of observations.

Quentin
Super User

There is little practical advantage of Way 1 vs Way 2, other than clarity of code / purpose.

 

In this step you don't have to read any data from sashelp.class.  When the SET statement compiles, it will assign the value to n.  So the SET statement does not need to execute.

 

With Way 1, the code:

if 0 then set ... ;

is a way to prevent the SET statement from executing.  It's a usual IF THEN statement.  IF 0 is false, so the SET statement never executes.

 

It's common to use IF 0 THEN SET in a case where you don't want to read data from a dataset, but you do want the metadata (e.g. variable names, or nobs in this case) that becomes available when the DATA step compiles.

 

In practice, Way 1 will not read any data from sashelp.class. Way 2 will read just one record from sashelp.class, because the STOP statement will end execution of the DATA step before the second record is read.  So you won't see a noticeable difference in execution time between the two methods.

 

Still, Way 1 is generally preferable, as IF 0 THEN SET is a familiar, recognizable coding pattern that will make it easier for others to read your code.
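
To illustrate the "metadata only" use of the pattern, here is a minimal sketch that builds an empty copy of a table's structure. The output name work.class_shell is just a hypothetical name chosen for the example.

```sas
/* Create a zero-observation data set with the same variables as sashelp.class. */
data work.class_shell;
  /* Compile time: SET adds the variables of sashelp.class to the new data set. */
  /* Run time: the condition is false, so no observation is ever read.          */
  if 0 then set sashelp.class;
  stop;   /* end the step before the implicit OUTPUT, so no row is written */
run;

/* Inspect the result: same variables, no observations. */
proc contents data=work.class_shell;
run;
```

The STOP statement matters here: without it, the step would write one observation of missing values and then end.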

Kurt_Bremser
Super User

Way 1 will not read an observation, way 2 will read the first. Since that obs is contained in the dataset header page, which has to be read physically anyway, the performance difference will be negligible.

So it's up to what is easier to understand for the next coder who comes across the program.

But for this, I'd prefer

proc sql noprint;
select nobs into :num_obs
from dictionary.tables
where libname = "SASHELP" and memname = "CLASS";
quit;
WarrenKuhfeld
Rhodochrosite | Level 12

Use symputx, not symput. The former is designed to handle numeric-to-character conversion automatically, and it strips leading and trailing blanks from the value.
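
A small sketch of the difference, using the same sashelp.class example. The macro variable names raw and clean are made up for the illustration, and the brackets in the %PUT are only there to make any blanks visible.

```sas
data _null_;
  if 0 then set sashelp.class nobs=n;
  call symput('raw', n);     /* numeric n is converted to a padded character value */
  call symputx('clean', n);  /* value is converted and stripped of blanks          */
  stop;
run;

/* Compare the two values; the padding in &raw shows up inside the brackets. */
%put raw=[&raw] clean=[&clean];
```

With CALL SYMPUT the log should also carry a note about numeric values being converted to character values, which CALL SYMPUTX avoids.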

Tom
Super User

Since you have a STOP statement you don't need the IF/THEN.  Just move the SET after the STOP.

data _null_;
  call symputx('num_obs',n);
  stop;
  set sashelp.class nobs=n;
run;
%put &num_obs;

PS Do not use the ancient CALL SYMPUT() method unless you actually have a need to generate macro variables that contain leading and/or trailing spaces.  It was replaced by CALL SYMPUTX() decades ago.

s_lassen
Meteorite | Level 14

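The first method (or the even simpler code shown by @Tom ) is the best. The second method will not execute the SYMPUT statement if the number of observations is 0, because the DATA step ends as soon as the SET statement finds no observation to read. The macro variable is then never assigned, which may cause problems later.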

 

And yes, as others have already suggested, use SYMPUTX, not SYMPUT. In the actual example, the difference will be that the SYMPUT-ed variable will contain leading blanks (the numeric value is right-aligned when it is converted to character), the other will not.
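
Here is a minimal sketch of the zero-observation case. The table work.empty and the macro variable names n_way1 and n_way2 are hypothetical, created only for this test.

```sas
/* Build an empty table with the structure of sashelp.class. */
data work.empty;
  set sashelp.class(obs=0);
run;

/* Initialize both macro variables so the %PUT below always resolves. */
%let n_way1=;
%let n_way2=;

/* Way 1: SET never executes, so CALL SYMPUTX always runs. */
data _null_;
  if 0 then set work.empty nobs=n;
  call symputx('n_way1', n);
  stop;
run;

/* Way 2: SET executes, finds nothing to read, and the step ends right there. */
data _null_;
  set work.empty nobs=n;
  call symputx('n_way2', n);   /* never reached for an empty table */
  stop;
run;

%put n_way1=[&n_way1] n_way2=[&n_way2];
/* prints: n_way1=[0] n_way2=[] */
```

Without the two %LET statements, the reference to n_way2 would not resolve at all and the log would show a warning instead.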

Kurt_Bremser
Super User

One additional thing to consider: the DATA step methods will fail with an ERROR if the dataset does not exist, the SQL query will simply not return a value. Depending on context, the SQL method might be an easier way to deal gracefully with a missing dataset (particularly important for batch jobs which should continue executing).
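
A sketch of how that could look in practice, assuming a sentinel value of -1 to mean "dataset not found". The sentinel and the table name NO_SUCH_TABLE are assumptions made for the example.

```sas
/* Preset the macro variable; the query leaves it unchanged if no row matches. */
%let num_obs = -1;

proc sql noprint;
  select nobs into :num_obs trimmed
  from dictionary.tables
  where libname = "WORK" and memname = "NO_SUCH_TABLE";
quit;

/* SQLOBS is 0 when the query returned no row, i.e. the table was not found. */
%put num_obs=&num_obs sqlobs=&sqlobs;
```

Later code can then test for -1 (or for SQLOBS equal to 0 right after the query) and skip the steps that need the dataset, instead of stopping on an ERROR.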

