
What happens when one increments a variable without "="

I was surprised by how a variable behaves when it is incremented without "=". The strange behavior shows up when I have missing values: the statement seems to use the last non-missing value (from a previous observation). This happens even if I first replace a possible missing value with some other number.

I expected the two datasets "outdata1" and "outdata2" below to be the same, but they turned out very different. The only difference is that "n=n+1" is replaced by "n+1". I understand what happens but not the logic. It does not seem intuitive.

data abc;
do i=1 to 10;
  text=ifc( 5<=i<=8,'NR','  ');
  output;
end;
keep text ;
run;

data outdata1;
  set abc;
  if text ne 'NR' then n=5;
  if n=. then n=-5;
  n+1;
run;

data outdata2;
  set abc;
  if text ne 'NR' then n=5;
  if n=. then n=-5;
  n=n+1;
run;
proc print data=outdata1 ;
run;
proc print data=outdata2 ;
run;

 

And the two datasets become

Obs text n
1   6
2   6
3   6
4   6
5 NR 7
6 NR 8
7 NR 9
8 NR 10
9   6
10   6

and

Obs text n
1   6
2   6
3   6
4   6
5 NR -4
6 NR -4
7 NR -4
8 NR -4
9   6
10   6

All Replies
Super User
Posts: 5,509

Re: what happens when one increment a variable without "="

Posted in reply to JacobSimonsen

It sounds like the software is working as expected.  Here's some of the documentation:

 

http://support.sas.com/documentation/cdl/en/lrdict/64316/HTML/default/viewer.htm#a000289454.htm

 

Basically, this statement has a triple meaning:

 

varname + (some numeric value);

 

It means:

 

  1. Retain VARNAME
  2. Give VARNAME an initial value of zero (unless a RETAIN statement overrides that initial value)
  3. Increment VARNAME.  However, if the value to the right of the plus sign is missing, ignore it.

That sounds like the result you are getting, so it's to be expected.
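
The three effects listed above can be seen in a small step. This is only a sketch, and the variable name total is made up for illustration:

```sas
data _null_;
  /* total is never assigned with "="; the sum statement gives it an initial value of 0 */
  total + 1;  put total=;   /* 0 + 1, so the log shows total=1 */
  total + .;  put total=;   /* missing increment is ignored, still total=1 */
  total + 3;  put total=;   /* log shows total=4 */
run;
```

With "total = total + .;" in place of the second sum statement, total would become missing instead.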

Super Contributor
Posts: 253

Re: what happens when one increment a variable without "="

Posted in reply to JacobSimonsen

Used this way, + makes a sum statement; if you think of it as "sum" rather than "plus", you will remember half of the reason for this at least!

In a sum statement, + does two things:

* Retains the variable on the left side

* Sums the two sides, just as the SUM function does

Remember, the SUM function treats missing values as zero when they are added to a nonmissing value. The sum statement works the same way. The ordinary + operator inside an expression such as n=n+1 does not: there, a missing operand gives a missing result.

Super User
Posts: 7,050

Re: what happens when one increment a variable without "="

Posted in reply to JacobSimonsen

 

N+1;

is an example of a sum statement. The sum statement takes the form

varname+expression;

The sum statement marks the target variable to be retained and initializes it to 0. When it executes, it is as if you had used this statement instead:

varname=sum(varname,expression);

The SUM function and the + operator work differently when one or more of the inputs are missing. The + operator returns a missing value if either input is missing. The SUM() function ignores missing values, so its result is missing only if all of the inputs are missing.
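
A minimal sketch of that difference, using nothing beyond base SAS (the variable names x, a and b are made up for illustration):

```sas
data _null_;
  x = .;
  a = x + 1;       /* + operator: a missing operand gives a missing result */
  b = sum(x, 1);   /* SUM function: the missing argument is ignored */
  put a= b=;       /* log shows a=. b=1 */
run;
```

The log should also carry a note that missing values were generated by an operation on missing values, which comes from the x + 1 line.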

Instead of replacing 'n+1' with 'n=n+1' you could have replaced it with these two statements.

retain n 0;
n=sum(n,1);
Super Contributor
Posts: 298

Re: what happens when one increment a variable without "="

Thank you for your answers - I think they are all right.

But why is it that when I replace the missing value in "n" with -5, the step instead uses the retained value from the last observation?

I see that the SUM function and the + operator work differently for missing values, but in this example n is definitely not missing.

Solution
‎10-06-2017 02:55 PM
Super Contributor
Posts: 253

Re: what happens when one increment a variable without "="

Posted in reply to JacobSimonsen

That's because it is initialized to zero, as the first answer noted.  So it's never missing unless it's set explicitly to missing.

 

data _null_;
  put n=;
  if n=. then n=-5;
  n+1;
  put n=;
  stop;
run;

Note the log contents.  It starts out at *zero*.

Super Contributor
Posts: 298

Re: what happens when one increment a variable without "="

Posted in reply to snoopy369

Yes - that's the reason. Now I understand.

 

Super User
Posts: 7,050

Re: what happens when one increment a variable without "="

Posted in reply to JacobSimonsen

If N is retained and starts with an initial value of 0, then it cannot become missing unless you set it to missing. I do not see anywhere that you are setting N to missing.

If N is NOT retained, then it will start out as missing on each iteration of the data step.
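
To see the retained value directly, one can print n at the top of each iteration. A sketch, assuming the dataset abc from the original post already exists:

```sas
data _null_;
  set abc;
  /* n as it stands when the iteration begins, before any statement changes it */
  put _n_= text= n=;
  if text ne 'NR' then n=5;
  if n=. then n=-5;   /* never true here: n starts at 0 and is retained */
  n+1;
run;
```

The first line in the log should show n=0 rather than n=., and for observations 5 to 8 it should show the value left over from the previous iteration (6, 7, 8, 9), which is why the -5 branch never runs. Replacing n+1 with n=n+1 makes n start each iteration as missing again.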
