I was surprised by how a variable behaves when it is incremented without using "=". The strange behavior shows up when I have missing values: it seems to take the last nonmissing value (from a previous observation). This happens even if I first replace a possible missing value with some other number.
I expected the two data sets "outdata1" and "outdata2" below to be the same, but they turned out very different. The only difference is that "n=n+1" is replaced by "n+1". I understand what happens but not the logic; it does not seem intuitive.
data abc;
  /* Ten observations: text is 'NR' for i = 5 to 8, blank otherwise */
  do i=1 to 10;
    text=ifc( 5<=i<=8,'NR',' ');
    output;
  end;
  keep text;
run;

data outdata1;
  set abc;
  if text ne 'NR' then n=5;
  if n=. then n=-5;
  n+1;      /* sum statement */
run;

data outdata2;
  set abc;
  if text ne 'NR' then n=5;
  if n=. then n=-5;
  n=n+1;    /* assignment statement */
run;

proc print data=outdata1;
run;
proc print data=outdata2;
run;
The two data sets become
Obs | text | n |
---|---|---|
1 | | 6 |
2 | | 6 |
3 | | 6 |
4 | | 6 |
5 | NR | 7 |
6 | NR | 8 |
7 | NR | 9 |
8 | NR | 10 |
9 | | 6 |
10 | | 6 |
and
Obs | text | n |
---|---|---|
1 | | 6 |
2 | | 6 |
3 | | 6 |
4 | | 6 |
5 | NR | -4 |
6 | NR | -4 |
7 | NR | -4 |
8 | NR | -4 |
9 | | 6 |
10 | | 6 |
It sounds like the software is working as expected. Here's some of the documentation:
http://support.sas.com/documentation/cdl/en/lrdict/64316/HTML/default/viewer.htm#a000289454.htm
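Basically, this statement has a triple meaning:
varname + (some numeric value);
It means:
* Retain varname from one observation to the next
* Give varname an initial value of 0
* Add the value the way the SUM function would, ignoring missing values
That sounds like the result you are getting, so it's to be expected.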
In a statement like n+1; the + is the Sum Operator; if you think of it that way, you will remember half of the reason for this at least!
Here + does two things:
* Retain the variable on the left side of the operator
* Sum the two sides of the operator, just as the SUM function does
Remember, the SUM function considers missing values to be zero when added to a nonmissing value. This operator works the same way.
N+1;
is an example of a SUM statement. The SUM statement takes the form
varname+expression;
The SUM statement marks the target variable to be retained and initializes it to 0. When it executes, it is as if you had used this statement instead:
varname=sum(varname,expression);
The SUM function and the + operator work differently when one or more of the inputs are missing. The + operator returns a missing value if either input is missing. The SUM() function ignores missing values, so its result is missing only if all of the inputs are missing.
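To make the difference concrete, here is a minimal sketch that can be run on its own. The variable names (a, plus_result, sum_result, all_missing) are made up for illustration and are not from the thread.

```sas
data _null_;
  a = .;                      /* a missing numeric value */
  plus_result = a + 1;        /* + operator: a missing operand gives a missing result */
  sum_result  = sum(a, 1);    /* SUM function: the missing argument is ignored */
  all_missing = sum(a, .);    /* SUM function: missing only when every argument is missing */
  put plus_result= sum_result= all_missing=;
run;
```

The PUT statement should write plus_result=. sum_result=1 all_missing=. to the log.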
Instead of replacing 'n+1' with 'n=n+1', you could have replaced it with these two statements, which do the same thing as 'n+1':
retain n 0;
n=sum(n,1);
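One way to check that equivalence is to build a third data set with the explicit RETAIN and SUM() and compare it with outdata1. This is only a sketch: the name outdata3 is hypothetical, and the first two steps simply repeat the code from the question so the example runs by itself.

```sas
/* Same input as in the question */
data abc;
  do i=1 to 10;
    text=ifc(5<=i<=8,'NR',' ');
    output;
  end;
  keep text;
run;

/* Original version using the sum statement */
data outdata1;
  set abc;
  if text ne 'NR' then n=5;
  if n=. then n=-5;
  n+1;
run;

/* Hypothetical outdata3: explicit RETAIN plus the SUM function */
data outdata3;
  set abc;
  retain n 0;                 /* what the sum statement does implicitly */
  if text ne 'NR' then n=5;
  if n=. then n=-5;           /* never true here: n is retained and starts at 0 */
  n=sum(n,1);
run;

proc compare base=outdata1 compare=outdata3;
run;
```

PROC COMPARE should report no unequal values, since both steps retain n and add to it in the same way.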
Thank you for your answers - I think they are all right.
But why is it that when I replace the missing value in "n" with -5, it still uses the retained value from the last observation?
I see that the SUM function and the + operator work differently for missing values, but in this example n is precisely not missing.
That's because it is initialized to zero, as the first answer noted. So it's never missing unless it's set explicitly to missing.
data _null_;
put n=;
if n=. then n=-5;
n+1;
put n=;
stop;
run;
Note the log contents. It starts out at *zero*.
Yes - that's the reason. Now I understand.
If N is retained and starts with an initial value of 0, then it cannot become missing unless you set it to missing. I do not see anywhere that you are setting N to missing.
If N is NOT retained, then it will start out as missing on each iteration of the data step.
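A small sketch of that contrast, using made-up names (the data set three and the variables kept and fresh are for illustration only):

```sas
/* Three observations so the data step iterates three times */
data three;
  do i=1 to 3;
    output;
  end;
run;

data _null_;
  set three;
  /* Values as they stand at the top of each iteration */
  put 'Top of iteration: ' i= kept= fresh=;
  kept + 1;            /* sum statement: kept is retained and starts at 0 */
  fresh = fresh + 1;   /* assignment: fresh is reset to missing each iteration, so this stays missing */
run;
```

The log should show kept as 0, 1 and 2 at the top of the three iterations, while fresh is missing every time.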