BookmarkSubscribeRSS Feed

Determining the number of observations in a SAS data set efficiently

Started ‎07-03-2018 by
Modified ‎09-12-2018 by
Views 112,746

SAS data sets consist of a descriptor portion and a data portion that contains the data values. The descriptor portion of a SAS data set contains detailed information about the data set. This information includes:

  • The name of the data set and its member type
  • The date and time the data set was created
  • The number of observations
  • The number of variables
  • The engine type

Reading the descriptor portion is one of the most efficient and quickest ways of determining the number of observations in a SAS data set.  Here are some examples:

 

/* SAS V9.x and higher */
data _NULL_;
	if 0 then set sashelp.class nobs=n;
	call symputx('nrows',n);
	stop;
run;
%put nobs=&nrows;
  • The "if 0" is a a conditional statement designed to never be executed at run-time and works since the "set sashelp.class" gives the header information of the data set class to the compiler which sets up the PDV accordingly, but is skipped in the execution time because the if condition is false.
  • nobs is a SAS automatic variable which contains the number of records in the data set named in the set statement. The code, nobs = n, places the value of nobs (the number of records in the sashelp.class dataset) in the variable n.
  • The STOP statement is used to prevent an endless loop because EOF is never reached.

Checking for empty data set

Many times users don't need to know how many obs there are, just whether there are any. In such cases, a data step like the following might be just what is needed. This technique also has the advantage that it works for views and other data sets where the number of observations is not available in the data set header (Don Henderson's tip from sascommunity.org).

 

data _null_;
 call symput('AnyObs','0'); /* set to 0 for No */
 set &data(obs=1);
 call symput('AnyObs','1'); /* if got here there are obs so set to 1 for Yes */
run;

 

Access the descriptor information in SAS macro

The metadata can also be accessed by the macro language using DATA step functions. The ATTRN function is used with the NLOBS argument in the %OBSCNT macro shown below to retrieve the number of non-deleted observations from the meta data.

 

%macro obscnt(dsn);
%local nobs dsnid;
%let nobs=.;
 
%* Open the data set of interest;
%let dsnid = %sysfunc(open(&dsn));
 
%* If the open was successful get the;
%* number of observations and CLOSE &dsn;
%if &dsnid %then %do;
     %let nobs=%sysfunc(attrn(&dsnid,nlobs));
     %let rc  =%sysfunc(close(&dsnid));
%end;
%else %do;
     %put Unable to open &dsn - %sysfunc(sysmsg());
%end;
 
%* Return the number of observations;
&nobs
%mend obscnt;

 

Using DICTIONARY tables and PROC SQL

SAS dictionary tables can also be used to load the number of observations into a macro variable:

 

proc sql noprint;
	select nobs into :nobs separated by ' ' from dictionary.tables
	where libname='SASHELP' and memname='CLASS';
quit;
 
%put TNote:  nobs=&nobs;

 

Thanks to Alberto Negron and Tom Robinson for posting this help on sascommunity.org!

Comments

It depends on the engine . If you have SPDS ,you can't get nobs by your code.

Then @Ksharp let me tell you and others that for SPDS tables you can find this information in dictionary.tables when performing an explicit pass through.
http://documentation.sas.com/?docsetId=spdsug&docsetTarget=p0zjxy7y9lagkxn1ad49k8xcsrvm.htm&docsetVe...

I wonder if the proc contents wouldn't be faster than dictionary tables (including SPDE) ?

 

e,g.

libname X SPDE "E:\SAS_WORK_5400\SPDE";

data X.class;
  do _N_ = 1 to 1e6;
    do POINT = 1 to NOBS;
      set sashelp.class NOBS=NOBS POINT=POINT;
      output;
    end;
  end;
  STOP;
run;

proc contents noprint
  data = x.class
  OUT = OUT(keep=nobs)
;
run;
data _NULL_;
	set OUT(obs=1);
	call symputx('nrows',nobs);
	stop;
run;

%put &=nrows;

In the Checking for empty data set code, won't the first call symput run for every iteration of the data step, but the second one will fail to run on the last iteration of the data step?  So won't the value of the AnyObs macro variable always be set to '0', regardless of the number of observations read?  Looking at the log after running the code below seems to confirm this.

 

data _null_;
    put _N_;
    put "A";
    call symput('AnyObs','0'); 
    put "B"; /* this will always be the last thing written to the log */
    set &data(obs=1);
    put "C";
    call symput('AnyObs','1'); 
    put "D";
run;

 

If the engine does not support the ATTRN with NLOBS, because the dataset is a view, or has an active where clause, or the input is a database table then the following alternative is useful. Of course  we could use and SQL query with count(*), but then the macro can no longer be used as a function in syntax like

%if %ut_nlobs(...) gt  0 %then %do;

    /* Handle table with more than 0 rows  */

%end;

 

%macro ut_nlobs(
 dataset_pos_parm,  /* table as postional parameter */
 dataset=/* in: data set to check obervations in*/,
 table = /* alias parameter */,
 data=   /* alias parameter */,
 open_fail_err = Y /* If N then the word NOTE is added in front
           of sysmsg if the table cannot be opened.
           Sysmsg generally starts with ERROR.   Use this
           parameter when you dont want an error on the log. */
);
 /*Module------------------------------------------------------------------------------------------
 + determines no of (non-deleted) records in a table
 +
 + can be used anywhere in a SAS program including within SAS proc or data step. If the table does             
 + not exist, it returns -1. If the table exists, but cannot be opened it returns -999      
 + Allow dfferent parameter names, also single positional parm is allowed!  
 +------------------------------------------------------------------------------------------Module*/
 /*Hist---------------------------------------------------------------------------------------------
 + created / adapted Dave Prinsloo 
 + Jun2016 Dave Prinsloo
 +         added open_fail_err option 
 +--------------------------------------------------------------------------------------------Hist*/ 
 /*cols-10--------20--------30--------40--------50--------60--------70--------80--------90--------*/
 %local dsid /* Dataset-Id */
        n    /* Anzahl Saetze */
        rc   /* returncode */
        open_fail_msg_word 
        ;
%put NOTE: Executing &sysmacroname. for table = &dataset_pos_parm. &table. &data. &dataset.;
%if %length(&dataset_pos_parm.) ne 0 %then %let dataset =  &dataset_pos_parm. ;
%else %if %length(&table.) ne 0 %then %let dataset =  &table.;
%else %if  %length(&data.) ne 0 %then %let dataset =  &data.;
      
 %let dsid = %sysfunc(open(&dataset));
 
 %if &dsid %then %do; /* get number of logical observations */
      %let n = %sysfunc(attrn(&dsid,NLOBS));
      %if &n. = -1 %then 
         /* if the engine does not support NLOBS, then use 
	    * NLOBSF to count the records.  This has been tested 
	    * Oracle and SAP engines */
         %let n = %sysfunc(attrn(&dsid.,nlobsf));
      %let rc = %sysfunc(close(&dsid));
 %end; /* get number of logical observations */
 %else %do;
    %if %sysfunc(exist(&dataset)) %then %let n = -999;
    %else %let n = -1;
    /* Log error message- could not open dataset*/
    %if &open_fail_err. = N 
      %then %put NOTE: Macro &sysmacroname. setting dummy no of records as &dataset could not be opened; 
      %else %put %sysfunc(sysmsg());
 %end;&n
%mend ut_nlobs;

 

Tom

You can use this 30 year old macro https://github.com/sasutils/macros/blob/master/nobs.sas which has been updated over the years as SAS has added more functionality, like %SYSFUNC() and %SYMEXIST().

Version history
Last update:
‎09-12-2018 08:40 AM
Updated by:

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

Free course: Data Literacy Essentials

Data Literacy is for all, even absolute beginners. Jump on board with this free e-learning  and boost your career prospects.

Get Started

Article Labels
Article Tags