JacobSimonsen
Barite | Level 11

I was just surprised by how it works when one increments a variable without using "=". The strange thing happens when I have missing values. In that case it seems to take the last value that was not missing (from a previous observation). And this happens even if I first replace a possible missing value with some other number.

 

I expected the two datasets "outdata1" and "outdata2" below to be the same, but they turned out very different. The only difference is that "n=n+1" is replaced by "n+1". I understand what happens but not the logic. It does not seem intuitive.

 

 

 

data abc;
do i=1 to 10;
  text=ifc( 5<=i<=8,'NR','  ');
  output;
end;
keep text ;
run;

data outdata1;
  set abc;
  if text ne 'NR' then n=5;
  if n=. then n=-5;
  n+1;
run;

data outdata2;
  set abc;
  if text ne 'NR' then n=5;
  if n=. then n=-5;
  n=n+1;
run;
proc print data=outdata1 ;
run;
proc print data=outdata2 ;
run;

 

And the two datasets become

Obs  text    n
1            6
2            6
3            6
4            6
5    NR      7
6    NR      8
7    NR      9
8    NR     10
9            6
10           6

and

Obs  text    n
1            6
2            6
3            6
4            6
5    NR     -4
6    NR     -4
7    NR     -4
8    NR     -4
9            6
10           6

Astounding
PROC Star

It sounds like the software is working as expected.  Here's some of the documentation:

 

http://support.sas.com/documentation/cdl/en/lrdict/64316/HTML/default/viewer.htm#a000289454.htm

 

Basically, this statement has a triple meaning:

 

varname + (some numeric value);

 

It means:

 

  1. Retain VARNAME
  2. Give VARNAME an initial value of zero (unless a RETAIN statement overrides that initial value)
  3. Increment VARNAME.  However, if the value to the right of the plus sign is missing, ignore it.

That sounds like the result you are getting, so it's to be expected.

snoopy369
Barite | Level 11

Used this way, + is the sum statement, which adds like the SUM function; if you think of it that way, you will remember at least half of the reason for this!

 

+ does two things:

 

* Retain the variable on the left side of the operator

* Sum the two sides of the operator, just as the SUM function does

 

Remember, the SUM function considers missing values to be zero when added to a nonmissing value.  This operator works the same way.

Tom
Super User

 

N+1;

is an example of a SUM statement. The SUM statement takes the form

varname+expression;

The SUM statement marks the target variable to be retained and initializes it to 0. When it executes, it is as if you had used this statement instead:

varname=sum(varname,expression);

The SUM function and the + operator work differently when one or more of the inputs are missing. The + operator returns a missing value if either of its inputs is missing. The SUM() function ignores missing values, so its result is missing only if all of the inputs are missing.
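
A minimal sketch of that difference. The variable names x, a, b and c are made up for this illustration and are not from the original post:

```sas
data _null_;
  x = .;            /* a missing input */
  a = x + 1;        /* + operator: the missing value propagates */
  b = sum(x, 1);    /* SUM function: the missing argument is ignored */
  c = sum(x, .);    /* every argument missing, so the result is missing */
  put a= b= c=;     /* log shows: a=. b=1 c=. */
run;
```

SAS also writes a note to the log that missing values were generated for the x + 1 operation.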

Instead of replacing 'n+1' with 'n=n+1' you could have replaced it with these two statements.

retain n 0;
n=sum(n,1);
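
One way to check that equivalence is sketched below. It rebuilds abc and outdata1 as in the original post; the dataset name outdata3 is made up for this sketch.

```sas
/* Test data from the original post */
data abc;
  do i=1 to 10;
    text=ifc(5<=i<=8,'NR','  ');
    output;
  end;
  keep text;
run;

/* Version with the sum statement */
data outdata1;
  set abc;
  if text ne 'NR' then n=5;
  if n=. then n=-5;
  n+1;
run;

/* Same logic with RETAIN and the SUM function written out */
data outdata3;
  set abc;
  retain n 0;                 /* what the sum statement does implicitly */
  if text ne 'NR' then n=5;
  if n=. then n=-5;           /* never true here: n starts at 0 and is retained */
  n=sum(n,1);
run;

/* The comparison should report no unequal values */
proc compare base=outdata1 compare=outdata3;
run;
```
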
JacobSimonsen
Barite | Level 11

Thank you for your answers - I think they are all right.

But why is it that, when I replace the missing value in "n" with -5, it instead uses the retained value from the last observation?

I see that the sum function and the + operator work differently for missing values, but in this example n is precisely not missing.

snoopy369
Barite | Level 11

That's because it is initialized to zero, as the first answer noted.  So it's never missing unless it's set explicitly to missing.

 

data _null_;
  put n=;
  if n=. then n=-5;
  n+1;
  put n=;
  stop;
run;

Note the log contents.  It starts out at *zero*.

JacobSimonsen
Barite | Level 11

Yes - that's the reason. Now I understand.

 

Tom
Super User

If N is retained and starts with an initial value of 0, then it cannot become missing unless you set it to missing. I do not see anywhere that you are setting N to missing.

 

If N is NOT retained, then it will start out as missing on each iteration of the data step.
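
A small sketch of the two cases side by side. The dataset three and the variables kept and plain are names made up for this illustration:

```sas
/* Three observations to drive three iterations */
data three;
  do i=1 to 3;
    output;
  end;
run;

data _null_;
  set three;
  /* Show both variables before anything is assigned in this iteration */
  put 'top of iteration: ' kept= plain=;
  kept+1;          /* sum statement: retained, initialized to 0 */
  plain=i*10;      /* ordinary assignment: reset to missing each iteration */
run;
/* Log shows:
   top of iteration: kept=0 plain=.
   top of iteration: kept=1 plain=.
   top of iteration: kept=2 plain=.
*/
```

The retained variable carries its value forward, while the non-retained one is missing at the top of every iteration, which is why the "if n=. then n=-5;" test only ever fires in the n=n+1 version.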
