hi ... not sure what part of the SAS code the question is about
if it's about how to add a sequence number, it's a good way to learn about a lot of
stuff that goes on in a data step
here's a data step that tries to create 7 sequence numbers
if you understand the output, you know (learn ?) a lot
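data males;
   retain n4 0 n5 n7;     /* n4 starts at 0, n5 and n7 start as missing */
   set sashelp.class;
   if sex eq 'M';         /* subsetting IF: all 19 observations are still read */
   n1 = _n_;
   n2 + 1;
   n3 = n3 + 1;
   n4 = n4 + 1;
   n5 = n5 + 1;
   n6 = sum(n6, 1);
   n7 = sum(n7, 1);
   keep n: ;              /* every variable whose name starts with N, so NAME is kept too */
run;
proc print data=males;
   var name n1-n7;
run;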
Obs    Name       n1    n2    n3    n4    n5    n6    n7
  1    Alfred      1     1     .     1     .     1     1
  2    Henry       5     2     .     2     .     1     2
  3    James       6     3     .     3     .     1     3
  4    Jeffrey     9     4     .     4     .     1     4
  5    John       10     5     .     5     .     1     5
  6    Philip     15     6     .     6     .     1     6
  7    Robert     16     7     .     7     .     1     7
  8    Ronald     17     8     .     8     .     1     8
  9    Thomas     18     9     .     9     .     1     9
 10    William    19    10     .    10     .     1    10
n1 / that is based on _n_ , an automatic SAS variable that counts passes through
the data step ... there are 19 passes since there are 19 observations in data
set SASHELP.CLASS, but only the 10 males get past the IF statement, so n1 has gaps
... one feature, these are the original observation numbers from data set SASHELP.CLASS
n2 / there are a lot of features here ... this construct (var + 1), called a sum statement,
implies two things: first, the variable is automatically retained (not set to missing each
pass back to the top of the data step); second, the initial value of var is 0 ... so this is
like having the statement
retain n2 0;
in the data step ... but you do not have to write that statement
n3 / does not work since n3 is not retained, so its value is MISSING every time that
statement is executed, and adding anything to a missing value gives a missing result
n4 / that works fine: the initial value of n4 is set to 0 in the retain statement;
the value of n4 is retained and not set to missing each pass back to the top of the data step
so n4 is the same as n2 ... but using n2+1 instead of n4=n4+1 means that you do not
need the retain statement for n2
n5 / does not work ... the value of n5 is retained, but the initial value of n5 is missing and
if you add anything to a missing value, the result is missing
n6 / does not work ... notice that the result is always 1 since the SUM function ignores missing
values, so SUM of a missing value and 1 gives 1 ... but n6 is not retained so it
always gets set back to missing at the top of the data step
n7 / that works since it uses the SUM function like n6, but the value is retained
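if you want to see those rules in isolation, here is a small sketch ... the data set name CHECK and the variables m2, a and b are made up just for this illustration, they are not part of the job above ... n2 and m2 should come out identical (1 through 19), a should be missing on every observation and b should be 1 on every observation

```sas
data check;
   set sashelp.class;
   retain m2 0;          /* explicit RETAIN with an initial value of 0 */
   n2 + 1;               /* sum statement: implied retain, starts at 0 */
   m2 = sum(m2, 1);      /* long-hand version of the sum statement */
   a = . + 1;            /* + operator: missing in, missing out */
   b = sum(., 1);        /* SUM function skips the missing argument */
   keep name n2 m2 a b;
run;
proc print data=check;
run;
```

the line that computes a should also put a note in the LOG about missing values being generated, the same note you get for n3 and n5 in the original job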
but ... if your question is about why _n_ works with WHERE and not with IF, just look at the LOG
after you run both jobs (one with IF and one with WHERE)
the one with IF reads all 19 observations since every observation in SASHELP.CLASS is processed by the SET statement, so _n_ runs from 1 to 19 and the females are only dropped after they have been read
the one with WHERE reads 10 observations since the WHERE statement can be thought of as "peeking" at your data set to see if it actually has to process an observation ... if the WHERE condition is false, the observation is never "seen" by the SET statement, so _n_ runs from 1 to 10
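a minimal sketch of the WHERE version, if you want to try it ... the data set name MALES_WHERE is just a made-up name so it does not overwrite the data set from the IF job

```sas
data males_where;
   set sashelp.class;
   where sex eq 'M';     /* filter is applied before an observation reaches the data step */
   n1 = _n_;             /* only the 10 males are read, so n1 has no gaps */
   keep name n1;
run;
proc print data=males_where;
run;
```

compare the "observations read" note in the LOG for this job with the one from the IF job: 10 here versus 19 there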
hope all that makes sense