Help using Base SAS procedures

row count

Reply
Super Contributor
Posts: 326

row count

hi,

how can i store the obs count of a dataset to a macro variable when i use proc datasets?

i know i saw a sample of this code back then but now i couldnt find it anymore Smiley Sad

thanks a lot!
Super Contributor
Super Contributor
Posts: 3,174

Re: row count

The two processes are inconsistent. There are SAS CALL functions and DICTIONARY members(s) and the SET stmt NOBS= parameter, used with a CALL SYMPUT for the purpose mentioned.

Scott Barry
SBBWorks, Inc.
Super Contributor
Posts: 326

Re: row count

hi scott,

thanks for the info, but in what way are these approaches inconsistent?

is there any other way other than the select count into: statement in getting the number of obs of a certain dataset?

thanks,
milton
Super Contributor
Posts: 474

Re: row count

Yes there is, and much more efficient too.

Both DICTIONARY tables method, PROC CONTENTS (you may have confused this with PROC DATASETS), or even a DATA STEP with the NOBS options will retrieve immediately the count of obs (EXCEPT if the table is a VIEW or DBMS table) WITHOUT the need to scan the whole data as the SELECT COUNT INTO: technique will need to do.

Check online documentation for DICTIONARY tables and the CONTENTS procedure:
http://support.sas.com/documentation/cdl/en/lrcon/61722/HTML/default/a002299818.htm
http://support.sas.com/documentation/cdl/en/proc/61895/HTML/default/a000085768.htm

Cheers from Portugal.

Daniel Santos @ www.cgd.pt.
Super Contributor
Posts: 326

Re: row count

Posted in reply to DanielSantos
Thanks Daniel that's what I'm looking for.

Regards,
Milton
SAS Employee
Posts: 3

Re: row count

A succinct and efficient way is the use of macro and SCL functions.;

%LET DSID = %SYSFUNC(OPEN(TEMP.SAMPLE,IN));
%LET NUM = %SYSFUNC(ATTRN(&DSID,NLOBS));
%IF &DSID > 0 %THEN %LET RC=%SYSFUNC(CLOSE(&DSID));

%put There are &num observations in the dataset;

* The first line opens the data set and the last one closes it. this is needed because you are not using data step or
SCL and so could leave a data set open, causing problems later. The second line is what captures the number of observations
from the header of the data set using the SCL ATTRN function called by %SYSFUNC.

* with where clause;
%let dsid = %sysfunc(open(temp.sample (where=(x>500000))));
%let num = %sysfunc(attrn(&dsid,nlobs));
%if &DSID > 0 %then %let rc = %sysfunc(close(&dsid));
Valued Guide
Posts: 2,177

Re: row count

Succint and with a where clause!
Introduced with SAS9 (iirc) attrn NLOBSF will provide the NOBS as fast as normal for a normal data set, and instead of missing, 0 or -1 for a dataset subject to a where clause, it will go and make the count! here is a log snippet of my demo, based on the code from rob@sas
238 data temp.sample ;
239 do x= 1 to 1e7 ;
240 ranp = 1+ nobs*ranuni(1) ;
241 classn = RANP ;
242 set sashelp.class point = ranp nobs= nobs;
243 output ;
244 end;
245 stop;
246 run;

NOTE: The data set TEMP.SAMPLE has 10000000 observations and 7 variables.
NOTE: DATA statement used (Total process time):
real time 22.68 seconds
user cpu time 6.40 seconds
system cpu time 2.18 seconds
Memory 181k


247 %let start = %sysfunc( datetime(),best18 ) ;
248 %let dsid = %sysfunc(open(temp.sample ));
249 %let num = %sysfunc(attrn(&dsid,nlobsf));
250 %let rc = %sysfunc(close(&dsid));
251 %let here1 = %sysfunc( datetime(),best18 ) ;
252 %let durn1 = %sysevalf( &here1 - &start ) ;
253 %put without where rc=&rc dsid= &dsid num=&num durn= &durn1 %now ;
without where rc=0 dsid= 1 num=10000000 durn= 0 07AUG2009:12:33:38.681
254
255 %let star2 = %sysfunc( datetime(),best18 ) ;
256 %let dsid = %sysfunc(open(temp.sample (where=(x>500000))));
257 %let num = %sysfunc(attrn(&dsid,nlobsf));
258 %let rc = %sysfunc(close(&dsid));
259 %let here2 = %sysfunc( datetime(),best18 ) ;
260 %let durn2 = %sysevalf( &here2 - &star2 ) ;
261 %put with where rc=&rc dsid= &dsid num=&num durn= &durn2 %now ;
with where rc=0 dsid= 1 num=9500000 durn= 25.1400001049041 07AUG2009:12:34:03.821

With a WHERE clause, interestingly, as durn= a smaller number when the where clause returns a small number of rows, and larger when there are more to count ...
342 %let star2 = %sysfunc( datetime(),best18 ) ;
343 %let dsid = %sysfunc(open(temp.sample (where=(x>9500000))));
344 %let num = %sysfunc(attrn(&dsid,nlobsf));
345 %let rc = %sysfunc(close(&dsid));
346 %let here2 = %sysfunc( datetime(),best18 ) ;
347 %let durn2 = %sysevalf( &here2 - &star2 ) ;
348 %put with where rc=&rc dsid= &dsid num=&num durn= &durn2 %now ;
with where rc=0 dsid= 1 num=500000 durn= 3.375 07AUG2009:15:34:58.682

349 %let star2 = %sysfunc( datetime(),best18 ) ;
350 %let dsid = %sysfunc(open(temp.sample (where=(x>1))));
351 %let num = %sysfunc(attrn(&dsid,nlobsf));
352 %let rc = %sysfunc(close(&dsid));
353 %let here2 = %sysfunc( datetime(),best18 ) ;
354 %let durn2 = %sysevalf( &here2 - &star2 ) ;
355 %put with where rc=&rc dsid= &dsid num=&num durn= &durn2 %now ;
with where rc=0 dsid= 1 num=9999999 durn= 24.2490000724792 07AUG2009:15:35:41.119


it appears that where-clause handling is more efficient than the counting. ~~~~~~


PeterC Message was edited by: Peter.C
Super Contributor
Posts: 326

Re: row count

wow! thanks rob and peter. learned something new from your posts.

regards,
milton
Ask a Question
Discussion stats
  • 7 replies
  • 203 views
  • 0 likes
  • 5 in conversation