GN0001
Barite | Level 11

 

What does this code do?

 

Data Mydata;
Set  mylib.table1 nobs = nobs;
Run;

Data _null_;
Set MyData nobos=nobs;
    Call symput('a_Variable_Name', compress(nobs));
    put nobs=;
    stop;
Run;

Blue Blue
Accepted Solutions
ballardw
Super User

NOBS= is a SET statement option that creates a temporary variable, named by whatever follows the =. Temporary means the variable is not written to the output data set. It contains the total number of observations in the data set(s) that appear on the SET statement.

 

Call symput and Call symputx create macro variables. The first parameter is the name of the macro variable; the bit after the comma is the value that gets placed into that macro variable.

 

Put without any FILE statement in a data step writes text to the LOG window.

Stop means to stop executing the data step.

 

The given example would create a macro variable named "a_Variable_Name" and could be referenced in a later bit of code by using &a_Variable_Name.

One simple use would be:

%put &a_Variable_Name.;

Placed after the data step, this would write the text value of the macro variable to the Log as well. Note that % usually means a Macro language action and & starts the name of a macro variable for the macro processor to use.
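As a rough sketch of how these pieces fit together, the step below counts the observations in a table and stores the count in a macro variable. The table SASHELP.CLASS is an assumption (a sample table shipped with most SAS installations) standing in for the poster's data, and CALL SYMPUTX is used here instead of the original CALL SYMPUT with COMPRESS.

```sas
/* Assumption: sashelp.class stands in for the poster's table. */
data _null_;
  set sashelp.class nobs=nobs;            /* NOBS= fills the temporary variable nobs */
  call symputx('a_Variable_Name', nobs);  /* store the count in a macro variable */
  put nobs=;                              /* write nobs=<count> to the log */
  stop;                                   /* one iteration is enough */
run;

/* Reference the macro variable in later code. */
%put Number of observations: &a_Variable_Name.;
```

Both the PUT statement and the %PUT statement should show the same count in the log, one from inside the data step and one from the macro processor afterwards.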

GN0001
Barite | Level 11
Thanks for the response:
What does compress(nobs) which is a value for the Macro variable 'a_Variable_Name' do?
What does put nobs=; do?

Many thanks,
Blue Blue,
andreas_lds
Jade | Level 19

Just execute the code and have a look at the log. If the purpose of a function is unknown, read the documentation - much faster than waiting for answers.

Kurt_Bremser
Super User
Data Mydata;
Set  mylib.table1 nobs = nobs;
Run;

Since you make no use of the variable defined with the NOBS= option (see the documentation of the SET Statement), the dataset mydata will just be an exact replica of mylib.table1.

Data _null_;
Set MyData nobos=nobs;
    Call symput('a_Variable_Name', compress(nobs));
    put nobs=;
    stop;
   Run;

Since there is no NOBOS= option, this will simply fail with a syntax ERROR. Once you correct that, the step will create a macro variable "a_Variable_Name" with the CALL SYMPUT routine, containing the number of observations in the dataset mydata. It will also record this number in the log with the PUT Statement. Both of these actions (SYMPUT and PUT) will only happen if there is at least one observation in the dataset. If you wanted this to also work with zero observations, you would need to move the CALL SYMPUT and PUT before the SET.

STOP will end the data step in the first observation, preventing an unnecessary complete read of the dataset.

 

Two of my Maxims are relevant to your post:

Maxim 1: Read the Documentation (that's why I included the links as entry points for your inquiry)

Maxim 4: If in Doubt, Do a Test Run and Look at the Results. If Puzzled, Inquire Further.

GN0001
Barite | Level 11
Hello,
What do you mean by saying:
"Since there is no NOBOS= option, this will simply fail with a syntax ERROR"
Respectfully,
Blue Blue
GN0001
Barite | Level 11

The documentation was not simple to understand.
I will look into it more.
Respectfully,
Blue Blue

Tom
Super User

The first data step just copies the data.  I see this a lot in programs from inexperienced users and I never understand why.  Perhaps they have been taught to do this by other inexperienced users. One possible reason where it might help is when the original dataset is a view (or access to a remote database) such that the NOBS= option of the SET statement could not work.  But in that case if all you want is to count the observations you could do that without creating a copy of the data. See some ways to do that at the end of this answer.

 

The second data step does not create any output dataset because of the _NULL_ as the dataset name.

It has a typo in the SET statement.  Should be:

  set MyData nobs=nobs;

Or if you removed the useless first step it would be:

set mylib.table1 nobs=nobs;

This step calls the archaic CALL SYMPUT() routine instead of the newer, more powerful CALL SYMPUTX() routine to create a macro variable named A_VARIABLE_NAME.

 

Two of the enhancements of CALL SYMPUTX() are important for this program.  One is it automatically strips leading and trailing spaces from the value being put into the macro variable.  Second it automatically converts numeric values into strings for putting in the macro variable. 

 

Using CALL SYMPUTX() eliminates the need for COMPRESS(), which is really the wrong function to use anyway.  The reason they might have used COMPRESS() was to remove the leading spaces produced by the automatic conversion of the NOBS numeric variable's value into the character string that CALL SYMPUT() requires for its second argument.  SAS will in most places automatically attempt to convert numbers to strings or strings to numbers when you use the wrong type of variable.  For number to character conversions it uses the BEST12. format.  So a count like 123 would become a string with 9 leading spaces so that its total length was 12 characters.  The COMPRESS() function accepts up to three arguments, and when you call it with only one it removes all of the spaces from the string, not just the leading ones.
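To make the difference concrete, here is a small sketch that mimics the automatic conversion with an explicit PUT() call and then compares COMPRESS() with STRIP(). The value 123 and the string in TEXT are made up for illustration.

```sas
data _null_;
  nobs = 123;
  converted = put(nobs, best12.);   /* same format the automatic conversion uses */
  n_leading = length(converted) - length(strip(converted));
  put n_leading=;                   /* 12 characters minus 3 digits */

  text = '  a b  c ';               /* made-up value with embedded spaces */
  compressed = compress(text);      /* removes every space */
  stripped   = strip(text);         /* removes only leading and trailing spaces */
  put compressed= stripped=;
run;
```

For a plain count the two functions give the same result, since a number formatted this way has no embedded spaces. The difference only matters for values that do contain them, which is why STRIP() is the safer habit.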

 

Instead of COMPRESS() you could use the PUT() function to explicitly convert the number into a string, and the STRIP() function to remove the leading spaces.

  call symput('a_variable_name',strip(put(nobs,32.)));

The STOP statement ends the data step when it runs.  So at most only one iteration of the data step will happen.

 

Note that placing the CALL SYMPUT() after the SET statement means that A_VARIABLE_NAME will not be created (or modified if it already exists) when MYDATA has zero observations.  When the SET statement tries to read an observation when it is already at the end of the data it stops the data step immediately.

 

Here is an updated version using CALL SYMPUTX() and NOBS=.

data _null_;
  call symputx('a_variable_name',nobs);
  stop;
  set mylib.table1 nobs = nobs;
run;
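The zero-observation behavior described above can be checked with a quick experiment. This is only a sketch: WORK.EMPTY, AFTER_SET and BEFORE_SET are hypothetical names introduced for the test, not part of the original program.

```sas
/* Hypothetical empty table: STOP runs before any observation is written. */
data work.empty;
  stop;
  x = 1;
run;

/* Placeholder values so we can tell whether each step changed them. */
%let after_set  = (not set);
%let before_set = (not set);

data _null_;
  set work.empty nobs=nobs;
  call symputx('after_set', nobs);   /* not reached: SET ends the step at end of data */
  stop;
run;

data _null_;
  call symputx('before_set', nobs);  /* NOBS= is filled in before execution starts */
  stop;
  set work.empty nobs=nobs;
run;

%put after_set=&after_set before_set=&before_set;
```

If the explanation holds, AFTER_SET should keep its placeholder value while BEFORE_SET becomes 0.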

 

If the NOBS= option does not work for your data then here is a version that will actually read all of the observations so that you can count them instead.  You can use the END= option on the SET statement to set a variable that indicates that you are at the end of the input.  You can also use the _N_ automatic variable, which the data step sets to the number of the iteration you are on.  You can use the dataset option DROP= and the special variable list _ALL_ to avoid actually copying any of the data from the dataset, since none of the variables are used by this step anyway.

data _null_;
  if eof then call symputx('a_variable_name',_n_-1);
  set mylib.table1(drop=_all_) end=eof ;
run;

 

Or you could use PROC SQL.  That will actually work best if MYLIB is pointing to an external database as PROC SQL will push the code to find the number of observations into the remote database to calculate.

proc sql noprint;
select count(*) format=32. into :a_variable_name trimmed
from mylib.table1
;
quit;
GN0001
Barite | Level 11
Thanks for all this. I have to think about each piece to digest it.
I think it would be easier to understand.
I didn't write the code and I was trying to understand it to be able to use it later. Well, I understood Data _null_; doesn't do anything, it only writes on the log.
Blue Blue
