A question in the SAS Certification Practice Programming Performance-Based Exam requires using the UPCASE function while subsetting the data set to the observations where the variable group has the value 'A' or 'B'. Part of the question is to find the number of observations that result from the program. I have listed some programs below, and I would like to know why program #3 does not give the same results as #1, #2, and #4.
data cleandata;
set cert.input36;
group=upcase(group);
where group in ('A', 'B', 'a', 'b');
run;
OR remove the UPCASE function, since the IN operator already lists all of the possible values (4992 observations):
data cleandata;
set cert.input36;
where group in ('A', 'B', 'a', 'b');
run;
Hello @melc and welcome to the SAS Support Communities!
The WHERE statement filters the incoming data immediately, as the observations are read (see the note in the log saying "There were ... observations read from data set ... WHERE ..." and also the documentation). In other words, regardless of their position in the DATA step code, other DATA step statements such as the assignment statement group=... in the third and fourth program already operate on the subset selected by the WHERE condition, and that condition is evaluated using the GROUP values found in the input dataset. Hence, the UPCASE function call in programs 3 and 4 comes too late for the WHERE statement and affects only the outgoing data written to dataset CLEANDATA.
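To illustrate the point, here is a minimal sketch of two ways to make the case conversion take part in the filtering. It assumes the same cert.input36 dataset and character variable group as in the question; the output dataset names cleandata_where and cleandata_if are only illustrative, and neither step is taken from the exam's official solution.

```sas
/* Option 1: apply UPCASE inside the WHERE expression itself,      */
/* so the condition is evaluated on the upcased input values.      */
data cleandata_where;
   set cert.input36;
   where upcase(group) in ('A', 'B');
   group = upcase(group);   /* also store the upcased value in the output */
run;

/* Option 2: use a subsetting IF, which is an executable statement */
/* and therefore sees the value assigned by the preceding line.    */
data cleandata_if;
   set cert.input36;
   group = upcase(group);
   if group in ('A', 'B');
run;
```

Both steps should select the same observations. The WHERE version filters while the data is being read, whereas the subsetting IF reads every observation first and then discards those that fail the condition.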
I understand now. Thanks!