Obsidian | Level 7

## Difference between using If versus where

Hi,

I took the SAS 9.4 Base Certification practice exam.  Below are the steps for question 13.

This project will use data set cert.input36. At any time, you may save your program as program36 in cert\programs.
Write a SAS program that will clean the data in cert.input36 as follows:

• Step 1:
  • Create a temporary data set, cleandata36.
  • In this data set, convert all group values to upper case.
  • Then keep only observations with group equal to 'A' or 'B'.
• Step 2:
  • Determine the MEDIAN value for the Kilograms variable for each group (A,B) in the cleandata36 data set. Round MEDIAN to the nearest whole number.
• Step 3:
  • Create results.output36 from cleandata36.
  • Ensure that all values for variable Kilograms are between 40 and 200, inclusively.
  • If the value is missing or out of range, replace the value with the MEDIAN Kilograms value for the respective group (A,B) calculated in step 2.

How many observations are in results.output36?

The original dataset had 5000 observations.

The practice exam gives 4992 observations as the answer.

I can't figure out why my result differs by 95 observations.

I did answer the second question regarding the median correctly.

I used the following code:

```
data cleandata36;
set cert.input36;
group = upcase(group);
where group in ('A' 'B');
run;

proc means data=cleandata36 median maxdec=0;
class group;
var kilograms;
run;

data results.output36;
set cleandata36;
if Group = 'A' and kilograms lt 40 or kilograms gt 200 then kilograms = 79;
if Group = 'B' and kilograms lt 40 or kilograms gt 200 then kilograms = 89;
run;

proc means data=results.output36 maxdec= 2 min max mean median n;
class group;
var kilograms;
run;
```

The answer code is:

```
data work.cleandata36;
  set cert.input36;
  group=upcase(group);
  if group in ('A','B');
run;

proc means data=work.cleandata36 median;
  class group;
  var kilograms;
run;

data results.output36;
  set cleandata36;
  if Kilograms < 40 or Kilograms > 200 then do;
    if group='A' then kilograms=79;
    else kilograms=89;
  end;
run;

proc contents data=results.output36;
```
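Separately from the observation count, the two IF statements in the first program mix AND and OR without parentheses. SAS evaluates AND before OR, so `kilograms gt 200` is tested on its own, regardless of group. The sketch below illustrates this on a tiny made-up data set; the data set names `demo`, `noparens` and `withparens` and the three data rows are hypothetical, and 79 and 89 are the medians quoted above.

```sas
/* Hypothetical three-row data set, for illustration only */
data demo;
  input group $ kilograms;
  datalines;
A 250
B 250
B 30
;

/* Without parentheses: (group = 'A' and kilograms lt 40) or (kilograms gt 200) */
data noparens;
  set demo;
  if group = 'A' and kilograms lt 40 or kilograms gt 200 then kilograms = 79;
  if group = 'B' and kilograms lt 40 or kilograms gt 200 then kilograms = 89;
run;

/* With parentheses: the range check is tied to the group */
data withparens;
  set demo;
  if group = 'A' and (kilograms lt 40 or kilograms gt 200) then kilograms = 79;
  if group = 'B' and (kilograms lt 40 or kilograms gt 200) then kilograms = 89;
run;

proc print data=noparens;   run;
proc print data=withparens; run;
```

In `noparens` the row `B 250` is set to 79 by the first statement, and the second statement then leaves it alone because 79 is no longer out of range. In `withparens` the same row becomes 89. The nested DO block in the answer code avoids the problem in a different way.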

1 ACCEPTED SOLUTION
Super User

## Re: Difference between using If versus where

Here is a small example of what the difference is:

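```
/* Four observations: two lower case, two upper case */
data example;
  input group $;
  datalines;
a
A
b
B
;

/* WHERE tests the value stored in the input data set */
data usewhere;
  set example;
  group=upcase(group);
  where group in ('A' 'B');
run;

/* Subsetting IF tests the value after UPCASE has run */
data useif;
  set example;
  group=upcase(group);
  if group in ('A' 'B');
run;
```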

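WHERE uses the values as they exist in the input data set, before an observation is read into the program data vector. The UPCASE call has no effect on the selection because it is applied only after the observation has been selected.

The subsetting IF uses the value of the variable at the point where the statement executes, here after UPCASE has been applied.

So your code never selected the observations where group is 'a' or 'b', which should have been converted to 'A' and 'B', and you ended up with fewer observations.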
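If you would rather keep a WHERE statement, one option is to put the UPCASE call inside the WHERE expression, so the comparison is made on the upper-cased stored value. This is only a minimal sketch on the same four values; the data set name `usewhere_fixed` is made up for illustration.

```sas
/* Same four illustrative values as in the example data set */
data example;
  input group $;
  datalines;
a
A
b
B
;

/* WHERE still filters on the stored value, but UPCASE is now part of the test */
data usewhere_fixed;
  set example;
  where upcase(group) in ('A' 'B');
  group = upcase(group);   /* still needed to store the upper-case value */
run;

proc print data=usewhere_fixed;
run;
```

Here all four observations are kept, matching the subsetting IF version, because 'a' and 'b' pass the test once they are upper-cased inside the WHERE expression.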

Obsidian | Level 7

## Re: Difference between using If versus where

Thanks for the great explanation.

Obsidian | Level 7

## Re: Difference between using If versus where

and the example. I ran the code and it helps me to see it on my own.

Calcite | Level 5

## Re: Difference between using If versus where

The median values we are getting are as follows:

for group 'A' median=79,

for group 'B' median=89,

but when we use these values as answers in the SAS practice test available on the SAS website, they are marked as wrong.

for group 'A' median=76.3

for group 'B' median=86.5

Diamond | Level 26

## Re: Difference between using If versus where

@Mansi24 wrote:

The median values we are getting are as follows:

for group 'A' median=79,

for group 'B' median=89,

but when we use these values as answers in the SAS practice test available on the SAS website, they are marked as wrong.

for group 'A' median=76.3

for group 'B' median=86.5

--
Paige Miller
Super User

## Re: Difference between using If versus where

@Mansi24 wrote:

The median values we are getting are as follows:

for group 'A' median=79,

for group 'B' median=89,

but when we use these values as answers in the SAS practice test available on the SAS website, they are marked as wrong.

for group 'A' median=76.3

for group 'B' median=86.5

Without the data and the actual code you ran, there is no way to answer your question.

Since you say this is on the SAS website, post the link to the page with the data and then show us the code you ran against that data set.

Super User

## Re: Difference between using If versus where

The difference is like the difference between the bouncer at the door of Studio 54 and the host at a restaurant.

The bouncer (the WHERE) prevents you from entering the building, and the host (the subsetting IF) prevents you from getting to a table.
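One way to see the bouncer and the host in code is to filter on a variable that only exists once the observation is inside the DATA step. The sketch below is illustrative only; the data set name `seated` and the variable `upper` are made up for the example.

```sas
/* Small illustrative input data set */
data example;
  input group $;
  datalines;
a
A
b
B
;

data seated;
  set example;
  upper = upcase(group);    /* created after the observation has been read */
  if upper in ('A' 'B');    /* the host: tests the value now in the step */
run;

/* Replacing the IF above with
     where upper in ('A' 'B');
   is expected to fail, because UPPER is not a variable in WORK.EXAMPLE
   and the WHERE statement only sees variables of the input data set. */
```

The subsetting IF can test anything computed inside the step, while the WHERE statement is limited to what is already stored in the data set being read.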

Obsidian | Level 7

## Re: Difference between using If versus where

What a great way to remember!

thanks.
Diane
