DATA Step, Macro, Functions and more

DO block ignoring IF statement

Accepted Solution Solved
Reply
Contributor
Posts: 24
Accepted Solution

DO block ignoring IF statement

I'm having an odd problem working with some data at work. I have the following code

data compare;set set1 (in=a) set2 (in=b);
	if b=1 then do;

		if points in (0 1) then newcat=1;
		if points in (2 3) then newcat=2;
		if points in (4 5 6) then newcat=3;
		if points in (7 8 9) then newcat=4;
		if points in (10 11 12) then newcat=5;
		if points>=13 then newcat=6;

		if source=1 and applied=1 and gdlinehi ne '' and table not in (1 2 3) then do;
			final=career+carmod;
			guide=level+modifier;
		end;
	end;
run;

Which seems fairly straightforward. However, when I run a check of the final and guide variables, I find that they exist for all of the observations in the data set, rather than the somewhat small subset that should have been selected by the IF statement. Through some testing, I've discovered that if I remove the first data set from the SET statement (so it reads as set set2 (in=b); ) the IF restrictions seem to work just fine. There's no crossover between the two data sets; I've checked via both PROC COMPARE and using the IN= variables.

 

Any idea what's going on here?


Accepted Solutions
Solution
‎11-23-2015 01:53 PM
Super User
Posts: 5,082

Re: DO block ignoring IF statement

The symptoms yiou describe would occur of FINAL and GUIDE were part of SET1.

 

One easy solution would be to add just before the end of the DO block:

 

else do;

   final=.;

   guide=.;

end;

View solution in original post


All Replies
Frequent Contributor
Posts: 130

Re: DO block ignoring IF statement

Are you trying to append set1 and set2, or merge them?  Right now it seems you're trying to append the data sets.  I believe you need to modify your set statement to a merge statement if you are trying to merge the two data sets.  Also, just a suggestion to increase efficiency so you aren't reading every row of data with each of the if statements, but I would consider nesting them (for example change if then; if then to if then; else if then; elseSmiley Wink.  Other than that, I don't see why the if statements in the do statement would be having an issue with the code you've provided.

Super User
Posts: 10,500

Re: DO block ignoring IF statement

Are you saying you expect some VARIABLES not to exist for some records? Or that you have values for records you don't expect?

 

SAS Super FREQ
Posts: 3,476

Re: DO block ignoring IF statement

What do you mean when you say that the FINAL and GUIDE variables "exist for all of the observations in the data set"?

Your code will create new variables FINAL and GUIDE.  For some observations you are setting values for these variables, for others you are not.  But the variables exist for all observations.

 

For observations in the first data set, the FINAL and GUIDE variables will have missing values. Observations in the second data set will be assigned a value or will be missing.

 

What did you expect to happen?

Contributor
Posts: 24

Re: DO block ignoring IF statement

"For observations in the first data set, the FINAL and GUIDE variables will have missing values. Observations in the second data set will be assigned a value or will be missing."

 

This is what should be happening, yes. However, I find that FINAL and GUIDE have a non-missing value for every observation from the second data set, rather than the subset indicated by the IF statement.

Super User
Posts: 5,082

Re: DO block ignoring IF statement

As Rick mentioned, every variable exists on every observation.  That is always true of every SAS data set.  However, it is possible for a variable to have a missing value on some observations.

 

However ... if you are saying that some observations from SET1 have nonmissing values for FINAL and GUIDE ... that is possible but tricky.  It could happen if SET1 contains a variable named B.  Then you might encounter B=1 for some observations from SET1.

Contributor
Posts: 24

Re: DO block ignoring IF statement

"However ... if you are saying that some observations from SET1 have nonmissing values for FINAL and GUIDE ... that is possible but tricky.  It could happen if SET1 contains a variable named B.  Then you might encounter B=1 for some observations from SET1."

It's more that all 75,000+ observations from SET2 have a non-missing value for FINAL and GUIDE, rather than the 2,269 that fit the IF statement criteria.

Super User
Posts: 5,082

Re: DO block ignoring IF statement

Well, the obvious (probably wrong) possibility is that FINAL and GUIDE are part of SET2 originally.  The tricky (and probably correct) answer is that FINAL and GUIDE are part of SET1.  All variables that come from a SAS data set are automatically retained (whether they come from SET1 or SET2).  So when B=1 you would be replacing FINAL and GUIDE, but when B=(anything else) you would see the same value repeating from the most recent observation that had B=1.

SAS Super FREQ
Posts: 3,476

Re: DO block ignoring IF statement

Another guess: There could be statements in your code that you are not showing us and that are important.  For example, if your DATA step has a RETAIN statement such as

    retain final guide;

then that would explain the behaviour that you report.

Contributor
Posts: 24

Re: DO block ignoring IF statement

Nope! I originally had more going on, but stripped it down to just what you see here before coming to the forums for help.

Contributor
Posts: 24

Re: DO block ignoring IF statement

The FINAL and GUIDE variables don't exist on SET2 (which is why if I run this code on a data step with SET2, this doesn't happen). As mentioned in the original post, there is no overlap between the two data sets (they're from different years and have entirely different sets of observations). Also, as a test, I added in lines setting FINAL and GUIDE equal to missing at the beginning of the DO block (to see if there was some sort of unexpected carryover from observation to observation). This did not help.

Solution
‎11-23-2015 01:53 PM
Super User
Posts: 5,082

Re: DO block ignoring IF statement

The symptoms yiou describe would occur of FINAL and GUIDE were part of SET1.

 

One easy solution would be to add just before the end of the DO block:

 

else do;

   final=.;

   guide=.;

end;

Trusted Advisor
Posts: 1,115

Re: DO block ignoring IF statement

Please note that "setting FINAL and GUIDE equal to missing at the beginning of the DO block" is useless, because:

  • If the IF condition is met, FINAL and GUIDE are overwritten with non-missing values immediately.
  • If it is not met, FINAL and GUIDE may very well retain their non-missing values from the last observation where the condition was met, as @Astounding has pointed out. The important question (raised by @Astounding) is: Does SET1 contain variables named FINAL and GUIDE?

Last week there was a discussion about exactly this type of issue. Please see this thread.

 

☑ This topic is SOLVED.

Need further help from the community? Please ask a new question.

Discussion stats
  • 12 replies
  • 402 views
  • 4 likes
  • 6 in conversation