Hi all,
Having some trouble with trying to specify multiple conditions to subset my data.
Sample:
ID | XY | Visit |
112233 | 1 | 1 |
445566 | 2 | 1 |
778899 | 2 | 1 |
131313 | 1 | 1 |
131313 | 2 | 2 |
555555 | 2 | 1 |
666666 | 2 | 1 |
777777 | 2 | 1 |
888888 | 2 | 1 |
999999 | 1 | 1 |
987654 | 1 | 1 |
987654 | 2 | 2 |
data test;
set test;
if ID = (XY=2) & (Visit=2) then keep;
run;
It might help to show exactly what you expect for output.
As I understand you want I would start with something like:
Proc sort data=test; by id visit; run; data want; set test; by id; if last.id; run;
Your description of XY is not very clear, especially since you do not show any X or Y values.
Or you should provide a data step that will recreate your data and then show what you expect for output.
It might help to show exactly what you expect for output.
As I understand you want I would start with something like:
Proc sort data=test; by id visit; run; data want; set test; by id; if last.id; run;
Your description of XY is not very clear, especially since you do not show any X or Y values.
Or you should provide a data step that will recreate your data and then show what you expect for output.
Hi @ballardw - thanks for the comment. I've sort of grabbed the sample data in my question and edited it to what I'd expect for the output.
Expected Output:
ID | XY | Visit |
112233 | 1 | 1 |
445566 | 2 | 1 |
778899 | 2 | 1 |
131313 | 2 | 2 |
555555 | 2 | 1 |
666666 | 2 | 1 |
777777 | 2 | 1 |
888888 | 2 | 1 |
999999 | 1 | 1 |
987654 | 2 | 2 |
So essentially I've deleted the first row of data for ID's 131313 and 987654 as they had multiple visits. Those would have been rows where the XY=1 and Visit = 1 for those IDs.
Again, my issue was that I didn't know how to specify it so that it would only delete the rows where XY=1 and Visit=1 from IDs with 2 visits, while keeping the IDs with only 1 visit (regardless of what values of "XY" and "Visit" they have).
Thanks for this solution @ballardw ! I didn't know there was a SAS function that could indicate the "last" row of the id etc. Works fine and got my expected output.
The sort prior to this by id and visit really helped!
@asgee wrote:
Thanks for this solution @ballardw ! I didn't know there was a SAS function that could indicate the "last" row of the id etc. Works fine and got my expected output.
The sort prior to this by id and visit really helped!
If you data is grouped, especially by an order that would not be duplicated with proc sort, by some variable but not sorted the BY can still be used if you use NOTSORTED option on the BY statement.
There is also an automatic First. that can be used to set/ reset values when processing. Each variable on the BY statement in a data step gets its own First. and Last variable setting so some pretty complicated, and admittedly confusing at first, things can be done. It may also be helpful to know that if a value only appears once it is both First and Last on the same record.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.
Find more tutorials on the SAS Users YouTube channel.