I'm confused by how _N_ works. The Little SAS Book mentions that _N_ is not necessarily equal to the observation number. Can someone list some examples that demonstrate this point? Thank you!
Did you read the full sentence in that book?
_n_ indicates the number of times SAS has looped through the data step. This is not necessarily equal to the observation number (in the output dataset), since a simple subsetting IF statement can change the relationship between the observation number and the number of iterations of the data step.
Run the following and notice the difference between Obs and K. Read the above description again after running the code.
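data want;
  set sashelp.class;
  if sex='M';   /* subsetting IF: iterations that fail the test write no observation */
  k=_n_;        /* data step iteration number, not the output observation number */
run;
proc print; run;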
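The relationship can also break in the other direction, when one iteration writes more than one observation. A minimal sketch under that assumption; the dataset name twice and the variable iter are made up for illustration and do not come from the book:

```sas
data twice;
  set sashelp.class;
  iter = _n_;   /* data step iteration number */
  output;       /* first copy of the current row */
  output;       /* second copy, written in the same iteration */
run;
proc print data=twice; run;
```

Here the output dataset has twice as many observations as there were iterations, so each value of iter should appear on two consecutive observations while Obs keeps counting up.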
Thank you very much!
Another example:
data test;
  do until (eof);               /* the condition must be in parentheses */
    set sashelp.class end=eof;
    number= _n_;
    output;
  end;
run;
Thanks so much for another example!
data test;
  do until (eof);
    set sashelp.class end=eof;
    number= _n_;
    output;
  end;
run;
Sorry, I did not get it. How come the value of number is always 1, although the whole dataset is being read? Can someone please explain?
Keep in mind that _n_ is incremented only when the data step begins a new iteration.
What happens here is this:
At the start of the first data step iteration, _n_ is set to 1.
Then the do loop executes; since we stay within the first data step iteration, _n_ does not change.
Then the do loop ends when the reading of the last observation sets eof to true, but we have not yet tried to read past the last observation.
A new data step iteration begins, _n_ is set to 2, and since we have a do until loop, the loop is entered at least once. Now the set statement tries to read past the last observation, which triggers the (normal) termination of the data step.
You can verify this by adding a few put statements:
data test;
put 'before loop';
put _n_=;
do until (eof);
put 'in loop';
put _n_=;
set sashelp.class end=eof;
number= _n_;
output;
end;
run;
A nice variation on the theme is this:
data test;
put 'before loop';
put _n_=;
do while (not eof);
put 'in loop';
put _n_=;
set sashelp.class end=eof;
number= _n_;
output;
end;
run;
Here the loop condition is checked before each read. In the second data step iteration eof is already true, so the loop body is skipped and the set statement never executes. The data step therefore never tries to read past the last observation and would iterate endlessly. But there's a "safety valve" built into the data step that recognizes such a condition and stops the step on its own. You see that in the log:
NOTE: DATA STEP stopped due to looping.
NOTE: There were 19 observations read from the data set SASHELP.CLASS.
NOTE: The data set WORK.TEST has 19 observations and 6 variables.
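If you want to keep the do while form without relying on that safety valve, one common approach is an explicit stop statement after the loop. A minimal sketch, assuming the same sashelp.class input; the stop statement ends the step during the first iteration, so the second iteration that caused the looping note should never start:

```sas
data test;
  do while (not eof);
    set sashelp.class end=eof;
    number = _n_;   /* still 1 for every observation */
    output;
  end;
  stop;             /* end the data step explicitly once the loop is done */
run;
```

Because output is written explicitly inside the loop, stopping the step this way does not drop any observations.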