Hi,
I have the following code that I use to get the number of records in my INPUT file. I was wondering if there is a better (or newer) way to do this.
Thanks, Nancy
If the recCount number being kept in your dataset is not important, then you could instead use the automatic counter variable _n_:
data x;
infile prange_t end=last;
input;
if last then call symputx('cntr04p',put(_n_,best.));
run;
%put &cntr04p;
I agree but with one change that Howles pointed out a couple of weeks ago. We really ought to stop recommending using "best" as a format as it is really just a substitute for the true value (e.g., 32.) and calling it best leads one to think that it is doing something special.
In this specific case these act as substitutes; however, there are definitely significant differences between the bestw. and w.d formats.
FriedEgg: Can you point me to a link that describes what those differences might be? Howard usually has very good reasons for making the comments he makes.
I was talking about the BEST informat, not format, or at least that was my intent.
art297 wrote:
I agree but with one change that Howles pointed out a couple of weeks ago. We really ought to stop recommending using "best" as a format as it is really just a substitute for the true value (e.g., 32.) and calling it best leads one to think that it is doing something special.
Then I stand (actually sit) corrected! Howard, you've forced me to do the dreaded task of reading the manual.
You are right, of course, but now that I've read at least that section of the manual, I'd recommend recommending the bestd format. The link to the documentation is: http://support.sas.com/documentation/cdl/en/lrdict/64316/HTML/default/viewer.htm#a003171706.htm
Yes, the bestw.d informat is a direct alias of the w.d informat, which is an important distinction. Outputs from the formats w.d, bestw.d, and bestdw.p for integer values should be virtually identical until the values reach extremes, where the formats start converting to scientific notation at different points and sometimes with different precision from each other. The output of bestw.d and bestdw.p (when the same w is used with the default p value) should be identical for all integers.
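One way to see where the formats diverge is to print a few integers with each of them and compare the log lines. This is only a minimal sketch: the test values and the width of 12 are arbitrary choices for illustration, not something taken from the posts above.

```sas
/* Sketch: print the same integers with three formats and compare. */
/* The values and the width of 12 are arbitrary illustrations.     */
data _null_;
  do x = 3, 123456789012, 1e15;
    put 'best12.  ' x best12.;
    put '12.      ' x 12.;
    put 'bestd12. ' x bestd12.;
  end;
run;
```

Small integers should print the same way under all three formats; the interesting cases are the values that no longer fit in the chosen width.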
Not sure if there is a better way (other than using PIPE to call a system tool such as the wc command of Unix).
You should use DATA _NULL_ rather than creating an actual dataset.
You should move your IF statement to before the INPUT statement to handle the case when the file is empty.
You should update CALL SYMPUT to CALL SYMPUTX which will automatically convert your numeric variable to a character string for storing in the macro variable.
filename prange_t 'poutofrngpi07.lis';
data _null_;
if last then call symputx('cntr04p', recCount);
INFILE prange_t END=last;
input;
recCount + 1;
run;
%put Records in PRANGE_T = &cntr04p;
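For comparison, the PIPE alternative mentioned at the top of this reply might look something like the sketch below. It assumes a Unix-like host where wc is available and where the SAS session is allowed to run operating system commands; the fileref name wcpipe and the variable n are made up for the example.

```sas
/* Sketch only: let the operating system count the lines.           */
/* Assumes Unix and that the session may run host commands.         */
/* WCPIPE and N are hypothetical names chosen for this illustration. */
filename wcpipe pipe "wc -l < 'poutofrngpi07.lis'";
data _null_;
  infile wcpipe;
  input n;                      /* wc -l prints a single number */
  call symputx('cntr04p', n);
run;
%put Records in PRANGE_T = &cntr04p;
```

Note that wc -l counts newline characters, so a file whose last line has no trailing newline will report one fewer line than the DATA step approach.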
Tom,
Can you explain to me how moving the IF statement before the INFILE statement works!? I am confused about this since it hasn't even attempted to read the file yet. I've seen it written that way before in examples and don't understand it. The DATA _NULL_ part I was going to do because all I need is the count.
Thanks, Nancy
I think of it as SAS looking ahead and setting the END= variable before it tries to read the next line.
SAS will actually exit the data step at the INPUT statement when it reads past the end of the file. So for a file with N lines the data step starts N+1 times.
Here is an example SAS log from reading a three line file.
195
196 data _null_;
197 put 'TOP ' (_n_ eof) (=);
198 infile example end=eof ;
199 input ;
200 put 'BOTTOM ' (_n_ eof) (=);
201 run;
NOTE: The infile EXAMPLE is:
Filename=...\Temp\SAS Temporary Files\_TD7068\#LN00016,
RECFM=V,LRECL=256,File Size (bytes)=9,
Last Modified=25Oct2011:12:32:57,
Create Time=25Oct2011:12:32:57
TOP _N_=1 eof=0
BOTTOM _N_=1 eof=0
TOP _N_=2 eof=0
BOTTOM _N_=2 eof=0
TOP _N_=3 eof=0
BOTTOM _N_=3 eof=1
TOP _N_=4 eof=1
So we can simplify the code even more by using the automatic counter _N_; we just need to remember to subtract one for the extra time it starts the step.
data _null_;
if eof then call symputx('nobs',_n_-1);
infile example end=eof ;
input;
run;
OK, so what it is doing is trying to set the EOF variable. Is that because the first PUT statement has the eof in it? And the _N_ seems to be wrong, because you said it is a three line file but _N_ finishes at four. That's why I didn't use _N_, but used a counter instead.
Nancy
The _N_ is right, but it does not count observations. It counts iterations of the data step.
The main point of testing before the input statement is that SAS stops when it reads past the end of file, so the statements following the input statement never execute for an empty input file.
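A small experiment along these lines can make the empty-file case concrete. This is a sketch, not tested output: the fileref EMPTY and the macro variables n_after and n_before are hypothetical names, and it assumes that a FILE statement with no PUT leaves behind a zero-length file.

```sas
/* Sketch: compare testing EOF after vs. before the INPUT statement. */
/* EMPTY, N_AFTER and N_BEFORE are hypothetical names.               */
filename empty temp;
data _null_;
  file empty;                   /* assumed to create a zero-length file */
run;

%let n_after  = not set;
%let n_before = not set;

/* Test after INPUT: the step stops at INPUT, so the IF never runs. */
data _null_;
  infile empty end=eof;
  input;
  if eof then call symputx('n_after', _n_);
run;

/* Test before INPUT: the IF is reached on the first iteration. */
data _null_;
  if eof then call symputx('n_before', _n_ - 1);
  infile empty end=eof;
  input;
run;

%put n_after=&n_after n_before=&n_before;
```

If the explanation above holds, the first macro variable should keep its initial value while the second should be set by the step, which is the reason for putting the test first.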
Tom,
Thanks. Your explanation really helped me.
Nancy
