BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
melc
Calcite | Level 5

A question in the SAS Certification Practice Programming Performance-Based Exam, requires the use of the upcase function under the condition of subsetting the data set with the values of 'A' or 'B' from the variable called group. A part of the question is to find the number of observations that result from the program. I have listed some programs below and I would like to know why program #3 does not give the same results as #1, #2, and #4.

 

  1. The answer that SAS provided which results in 4992 observations:
    data cleandata;
    set cert.input36;
    if upcase(group) in('A', 'B');
    run;
  2. It appears that instead of using 'if', 'where' can provide the same result of 4992 observations:
    data cleandata;
    set cert.input36;
    where upcase(group) in('A', 'B');run;
  3. I thought the following program would result in the correct answer but it doesn't. Why does the following not provide the same results as the two programs above? This has fewer observations  than the two programs above. The following results in 4897 observations.
    data cleandata;
    set cert.input36;
    group=upcase(group);
    where group in ('A', 'B');
    run;
  4. The following program results in the same number of observations (4992 observations) as the first two programs but it seems to defeat the purpose of using the upcase function because the upcase function can be removed to give the same result

data cleandata;
set cert.input36;
group=upcase(group);
where group in ('A', 'B', 'a', 'b');
run;

 

OR remove the upcase function since listing all of the possible values with the in operator  (4992 observations)

 

data cleandata;
set cert.input36;
where group in ('A', 'B', 'a', 'b');
run;

1 ACCEPTED SOLUTION

Accepted Solutions
FreelanceReinh
Jade | Level 19

Hello @melc and welcome to the SAS Support Communities!

 

The WHERE statement filters the incoming data immediately (see the note in the log saying "There were ... observations read from data set ... WHERE ..." and also the documentation), i.e., other DATA step statements such as the assignment statement group=... in the third and fourth program already operate on the subset selected by the WHERE condition (using the GROUP values found in the input dataset), regardless of their position in the DATA step code. Hence, the UPCASE function call in programs 3 and 4 comes too late for the WHERE statement and affects only the outgoing data written to dataset CLEANDATA.

View solution in original post

2 REPLIES 2
FreelanceReinh
Jade | Level 19

Hello @melc and welcome to the SAS Support Communities!

 

The WHERE statement filters the incoming data immediately (see the note in the log saying "There were ... observations read from data set ... WHERE ..." and also the documentation), i.e., other DATA step statements such as the assignment statement group=... in the third and fourth program already operate on the subset selected by the WHERE condition (using the GROUP values found in the input dataset), regardless of their position in the DATA step code. Hence, the UPCASE function call in programs 3 and 4 comes too late for the WHERE statement and affects only the outgoing data written to dataset CLEANDATA.

melc
Calcite | Level 5

I understand now. Thanks!

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 2 replies
  • 655 views
  • 1 like
  • 2 in conversation