BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
asgee
Obsidian | Level 7

Hi all,

 

Having some trouble with trying to specify multiple conditions to subset my data. 

 

Sample:

IDXYVisit
11223311
44556621
77889921
13131311
13131322
55555521
66666621
77777721
88888821
99999911
98765411
98765422
Note: Var "XY" means  -> X= 1 // Y = 2
 
I'm trying to isolate the data down to one row per ID. The problem are the IDs with 2 visits.
 
How do I indicate that if the ID has 2 visits (meaning the subject has two rows which have a Visit = 1 and Visit = 2), I want to only keep the row where "Visit=2" and the "XY" variable = 2 while not impacting the other ID's that only have 1 visit (i.e. no duplicates). 
 
Here's what I tried but it was causing some errors: 
data test;
set test;
if ID = (XY=2) & (Visit=2) then keep;
run;
Any help/tips would be appreciated!
 
 
 
 
 
1 ACCEPTED SOLUTION

Accepted Solutions
ballardw
Super User

It might help to show exactly what you expect for output.

 

As I understand you want I would start with something like:

Proc sort data=test;
   by id visit;
run;

data want;
   set test;
   by id;
   if last.id;
run;

Your description of XY is not very clear, especially since you do not show any X or Y values.

 

Or you should provide a data step that will recreate your data and then show what you expect for output.

View solution in original post

4 REPLIES 4
ballardw
Super User

It might help to show exactly what you expect for output.

 

As I understand you want I would start with something like:

Proc sort data=test;
   by id visit;
run;

data want;
   set test;
   by id;
   if last.id;
run;

Your description of XY is not very clear, especially since you do not show any X or Y values.

 

Or you should provide a data step that will recreate your data and then show what you expect for output.

asgee
Obsidian | Level 7

Hi @ballardw - thanks for the comment. I've sort of grabbed the sample data in my question and edited it to what I'd expect for the output. 

 

Expected Output:

IDXYVisit
11223311
44556621
77889921
13131322
55555521
66666621
77777721
88888821
99999911
98765422

 

So essentially I've deleted the first row of data for ID's 131313 and 987654 as they had multiple visits. Those would have been rows where the XY=1 and Visit = 1 for those IDs.

 

Again, my issue was that I didn't know how to specify it so that it would only delete the rows where XY=1 and Visit=1 from IDs with 2 visits, while keeping the IDs with only 1 visit (regardless of what values of "XY" and "Visit" they have). 

asgee
Obsidian | Level 7

Thanks for this solution @ballardw ! I didn't know there was a SAS function that could indicate the "last" row of the id etc. Works fine and got my expected output. 

 

The sort prior to this by id and visit really helped! 

ballardw
Super User

@asgee wrote:

Thanks for this solution @ballardw ! I didn't know there was a SAS function that could indicate the "last" row of the id etc. Works fine and got my expected output. 

 

The sort prior to this by id and visit really helped! 


If you data is grouped, especially by an order that would not be duplicated with proc sort, by some variable but not sorted the BY can  still be used if you use NOTSORTED option on the BY statement.

There is also an automatic First. that can be used to set/ reset values when processing. Each variable on the BY statement in a data step gets its own First. and Last variable setting so some pretty complicated, and admittedly confusing at first, things can be done. It may also be helpful to know that if a value only appears once it is both First and Last on the same record.

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

Mastering the WHERE Clause in PROC SQL

SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 4 replies
  • 475 views
  • 2 likes
  • 2 in conversation