Succinct, and it works with a WHERE clause!
Introduced with SAS 9 (iirc), the ATTRN option NLOBSF returns the observation count as fast as usual for an ordinary data set. For a data set subject to a WHERE clause, instead of returning missing, 0 or -1, it goes and makes the count! Here is a log snippet of my demo, based on the code from rob@sas:
238 data temp.sample ;
239 do x= 1 to 1e7 ;
240 ranp = 1+ nobs*ranuni(1) ;
241 classn = RANP ;
242 set sashelp.class point = ranp nobs= nobs;
243 output ;
244 end;
245 stop;
246 run;
NOTE: The data set TEMP.SAMPLE has 10000000 observations and 7 variables.
NOTE: DATA statement used (Total process time):
real time 22.68 seconds
user cpu time 6.40 seconds
system cpu time 2.18 seconds
Memory 181k
247 %let start = %sysfunc( datetime(),best18 ) ;
248 %let dsid = %sysfunc(open(temp.sample ));
249 %let num = %sysfunc(attrn(&dsid,nlobsf));
250 %let rc = %sysfunc(close(&dsid));
251 %let here1 = %sysfunc( datetime(),best18 ) ;
252 %let durn1 = %sysevalf( &here1 - &start ) ;
253 %put without where rc=&rc dsid= &dsid num=&num durn= &durn1 %now ;
without where rc=0 dsid= 1 num=10000000 durn= 0 07AUG2009:12:33:38.681
254
255 %let star2 = %sysfunc( datetime(),best18 ) ;
256 %let dsid = %sysfunc(open(temp.sample (where=(x>500000))));
257 %let num = %sysfunc(attrn(&dsid,nlobsf));
258 %let rc = %sysfunc(close(&dsid));
259 %let here2 = %sysfunc( datetime(),best18 ) ;
260 %let durn2 = %sysevalf( &here2 - &star2 ) ;
261 %put with where rc=&rc dsid= &dsid num=&num durn= &durn2 %now ;
with where rc=0 dsid= 1 num=9500000 durn= 25.1400001049041 07AUG2009:12:34:03.821
Interestingly, with a WHERE clause, durn= is smaller when the WHERE clause returns few rows and larger when there are more rows to count ...
342 %let star2 = %sysfunc( datetime(),best18 ) ;
343 %let dsid = %sysfunc(open(temp.sample (where=(x>9500000))));
344 %let num = %sysfunc(attrn(&dsid,nlobsf));
345 %let rc = %sysfunc(close(&dsid));
346 %let here2 = %sysfunc( datetime(),best18 ) ;
347 %let durn2 = %sysevalf( &here2 - &star2 ) ;
348 %put with where rc=&rc dsid= &dsid num=&num durn= &durn2 %now ;
with where rc=0 dsid= 1 num=500000 durn= 3.375 07AUG2009:15:34:58.682
349 %let star2 = %sysfunc( datetime(),best18 ) ;
350 %let dsid = %sysfunc(open(temp.sample (where=(x>1))));
351 %let num = %sysfunc(attrn(&dsid,nlobsf));
352 %let rc = %sysfunc(close(&dsid));
353 %let here2 = %sysfunc( datetime(),best18 ) ;
354 %let durn2 = %sysevalf( &here2 - &star2 ) ;
355 %put with where rc=&rc dsid= &dsid num=&num durn= &durn2 %now ;
with where rc=0 dsid= 1 num=9999999 durn= 24.2490000724792 07AUG2009:15:35:41.119
It appears that the WHERE-clause handling is more efficient than the counting.
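For reuse, the open/attrn/close steps from the log could be wrapped in a small function-style macro. This is only a sketch: the macro name nobs_where and its ds= parameter are my own invention for illustration, and it assumes the data set (with any data set options) can be opened by OPEN.

```sas
/* Hypothetical helper: returns the NLOBSF count for a data set,   */
/* honouring any WHERE= data set option passed along with the name */
%macro nobs_where(ds=);
  %local dsid n rc;
  %let n = .;                                  /* default if the open fails */
  %let dsid = %sysfunc(open(&ds));
  %if &dsid > 0 %then %do;
    %let n  = %sysfunc(attrn(&dsid, nlobsf));  /* forces a count under a WHERE clause */
    %let rc = %sysfunc(close(&dsid));          /* always release the data set */
  %end;
  %else %put WARNING: could not open &ds.. %sysfunc(sysmsg());
  &n
%mend nobs_where;

%put count=%nobs_where(ds=temp.sample(where=(x>9500000)));
```

A keyword parameter is used so that the = inside the WHERE= option is not mistaken for a parameter assignment.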
PeterC