Subsetting data using "If" statements

Occasional Contributor
Posts: 13

If I want to subset my data to include all females that have variable A < 5 OR variable B >= 10 OR variable C = "blue", can I do this correctly with two separate "if" statements? That is:

data mydata;
  set path01.mydata;
  if gender = "F";
  drop D;
  if A < 5 or B >= 10 or C = "blue";
run;

When I run a print procedure on the output, it correctly prints only females, but it does not seem to keep every observation that meets at least one of the three criteria. It keeps most of them, and it produces no error.


Accepted Solutions
Solution
‎11-27-2016 05:08 PM
Super User
Posts: 5,497

Re: Subsetting data using "If" statements

The most likely culprit is C. Character values are case sensitive, so all of these values would be different:

blue
Blue
BLUE

You could always change the third check to be:

or upcase(C) = 'BLUE'
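
To see the effect of the case-insensitive check, here is a minimal sketch. The data set work.demo and its values are made up for illustration, and A, B, C stand in for the poster's real variable names.

```sas
/* Hypothetical test data: names and values are assumptions for illustration */
data work.demo;
  length gender $1 C $8;
  input gender $ A B C $;
  datalines;
F 7 3 blue
F 7 3 Blue
F 7 3 BLUE
F 7 3 red
M 1 20 blue
;
run;

data work.subset;
  set work.demo;
  /* keep females only */
  if gender = "F";
  /* UPCASE makes the comparison on C case-insensitive */
  if A < 5 or B >= 10 or upcase(C) = "BLUE";
run;

proc print data=work.subset;
run;
```

With these rows, only the test on C can succeed for the females (A is not below 5 and B is not 10 or more), so the first three observations are kept. With the original test C = "blue", only the first would be.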



All Replies
Super User
Posts: 19,770

Re: Subsetting data using "If" statements

Your code looks correct. Post a sample of records that you think should be included but aren't or vice versa. 
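
For reference, two consecutive subsetting IF statements behave like a single condition joined with AND, so the original step could also be sketched as one statement. This assumes A, B, C and D stand in for the real variable names; the output name mydata_combined is hypothetical.

```sas
data mydata_combined;
  set path01.mydata;
  /* AND is evaluated before OR, so the parentheses around the OR group are required */
  if gender = "F" and (A < 5 or B >= 10 or C = "blue");
  drop D;
run;
```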

Super User
Posts: 5,424

Re: Subsetting data using "If" statements

Side note: since you are not relying on calculated variables, you should use WHERE instead. It is more efficient because observations are filtered as they are read, before they reach the data step.
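
A minimal sketch of the WHERE version, assuming A, B, C and D stand in for the real variable names:

```sas
data mydata;
  set path01.mydata;
  /* WHERE can only reference variables that already exist in path01.mydata */
  where gender = "F"
    and (A < 5 or B >= 10 or upcase(C) = "BLUE");
  drop D;
run;
```

If the filter ever depends on a variable created inside the data step, a subsetting IF is still needed for that part.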
Data never sleeps
Super User
Posts: 7,942

Re: Subsetting data using "If" statements

At a glance:

data mydata;
  set path01.mydata (where=(upcase(gender)="F" and (a < 5 or b >= 10 or upcase(c)="BLUE")));
run;
Super User
Posts: 11,343

Re: Subsetting data using "If" statements

Question about the "A < 5" requirement: do you also want missing values for A? SAS treats a missing numeric value as less than any number, so A < 5 is true when A is missing. If you do not want missing values for A, you will need either an explicit check such as (not missing(A) and A < 5) or a lower bound of acceptable values such as 0 le A lt 5.
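
The three variants can be compared on a tiny made-up data set. The name work.demo_missing and its values are assumptions for illustration only.

```sas
/* Hypothetical data: id 2 has a missing value for A */
data work.demo_missing;
  input id A;
  datalines;
1 3
2 .
3 8
;
run;

/* Plain comparison: keeps id 1 and id 2, because missing compares below 5 */
data work.keep_below;
  set work.demo_missing;
  if A < 5;
run;

/* Explicit missing check: keeps id 1 only */
data work.keep_nonmissing;
  set work.demo_missing;
  if not missing(A) and A < 5;
run;

/* Lower bound: keeps id 1 only, since missing is not >= 0 */
data work.keep_bounded;
  set work.demo_missing;
  if 0 le A lt 5;
run;
```

The lower-bound form only makes sense if negative values of A are not valid in the data.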

☑ This topic is solved.
