Hello, I am a student in a SAS programming class and I was given this problem to examine the difference between the 'WHERE' and 'IF' statements. I am only allowed to edit the datalines in hopes of accomplishing the following guidelines:
Replace the gaps marked by triangles ∆1 and ∆2 with lines of data so that:
● the data sets DS1 and DS2 each have exactly four observations and no missing values;
● the data set DS_Where has exactly one observation for each value w = 1, 2, 3, 4, 5, which appear in that order.
● Design your data sets so that the program’s output shows the following properties:
● the w = 1 observations are identical in DS_Where and DS_If, with no missing values;
● the w = 2 observations are identical in DS_Where and DS_If, with a missing value for y;
● the w = 3 observations are identical in DS_Where and DS_If, with a missing value for z;
● DS_Where has a w = 4 observation with at least one missing value, but the w = 4 observation in DS_If has no missing values;
● DS_If has no observation with w = 5. A copy of the program (where_if.sas) can be downloaded with this assignment on Canvas. You may not change any aspect of the program except for inserting data as specified above. Please turn in the printed output on a single page along with a snippet of your data.
The numbers in the datalines are my current input which yields a 'DS_WHERE' table with the correct output satisfying all the guidelines however, my 'DS_IF' table is missing the values for which 'DS2' has an x<5 and I do not understand how I get the missing value numbers for the 'DS_IF' table. Any help or explanation would be greatly appreciated.
data DS1;
input w x y;
datalines;
1 1 1
2 5 2
3 3 3
4 5 4
5 5 5
;
run;
data DS2;
input w x z;
datalines;
1 1 1
2 2 2
3 5 3
4 4 4
5 5 5
;
run;
data DS_Where;
merge DS1 DS2;
by w;
where x<5;
run;
data DS_If;
merge DS1 DS2;
by w;
if x<5;
run;
proc print data=DS1 noobs;
title "First data set";
run;
proc print data=DS2 noobs;
title "Second data set";
run;
proc print data=DS_Where noobs;
title "Merged with WHERE";
run;
proc print data=DS_If noobs;
title "Merged with IF";
run;
SAS Output
Merged with WHERE |
1 | 1 | 1 | 1 |
2 | 2 | . | 2 |
3 | 3 | 3 | . |
4 | 4 | . | 4 |
Merged with IF |
1 | 1 | 1 | 1 |
2 | 2 | 2 | 2 |
4 | 4 | 4 | 4 |
The WHERE statement filters the records that are retrieve from the input datasets.
The IF statement is applied to the current data that is in the data step, so it is applied to the data after it has already been merged.
Make the two subsets first and then merge them to see whey the WHERE dataset has missing values.
Also make the merge without either the WHERE or IF statement and then run a separate step to filter.
Search lexjansen.com and you’ll find many papers written on this in detail.
Don't miss out on SAS Innovate - Register now for the FREE Livestream!
Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.