09-05-2017 10:24 AM
Sorry to bother you with a question about selecting the first observation after sorting by several variables. The dataset is sorted by ID, diagnosis, and date of diagnosis. For each ID and diagnosis, I would like to select the observation with the earliest diagnosis date.
I tried "first.ID and first.date" and "first.ID and first.date and first.diagnosis", but neither seems correct. I attached the test dataset. The records that I would like to select are highlighted in yellow.
Many thanks in advance!
09-05-2017 10:29 AM - edited 09-05-2017 10:31 AM
Sorry, I can't download Excel files. Please post test data as a data step in the post.
The principle is sound: sort the data in the order you need, then test first.<variable> or last.<variable> for the lowest-level BY variable that defines the group you want.
For example, to get the last date in each diagnosis:
proc sort data=have;
  by id diagnosis date;
run;

data want;
  set have;
  by id diagnosis date;
  if last.diagnosis then output;
run;
09-05-2017 10:29 AM
If you want the first time this DX appeared for this ID then sort BY ID DX DATE and select the records where FIRST.DX is true.
If you want the first DX for this ID then sort BY ID DATE and take FIRST.ID.
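The two options above can be sketched as follows. This is only an illustration: the dataset HAVE, its values, and the output dataset names (by_dx, first_per_dx, by_date, first_per_id) are made up for the example, since the original attachment is not available. It also shows why "first.ID and first.date" does not work: FIRST.ID is true only on the first record of each ID, so any condition that includes it returns at most one row per ID.

```sas
/* Hypothetical test data; names and values are illustrative only */
data have;
  input id diagnosis $ date :yymmdd10.;
  format date yymmdd10.;
  datalines;
1 A 2017-03-01
1 A 2017-01-15
1 B 2017-02-10
2 A 2017-05-20
2 B 2017-04-02
2 B 2017-06-30
;
run;

/* Option 1: earliest record for each diagnosis within each ID */
proc sort data=have out=by_dx;
  by id diagnosis date;
run;

data first_per_dx;
  set by_dx;
  by id diagnosis;        /* date is needed in the sort, not in this BY */
  if first.diagnosis;     /* first row of each id/diagnosis group */
run;

/* Option 2: earliest record overall for each ID */
proc sort data=have out=by_date;
  by id date;
run;

data first_per_id;
  set by_date;
  by id;
  if first.id;            /* first row of each id group */
run;
```

With this sample data, Option 1 keeps four rows (one per ID and diagnosis combination) and Option 2 keeps two rows (one per ID).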
09-12-2017 10:25 AM
I'm glad you found some useful info, Sisiwater! If one of the replies was the exact solution to your problem, can you "Accept it as a solution"? Or if one was particularly helpful, feel free to "Like" it. This will help other community members who may run into the same issue know what worked.
By the way, I deleted one of your replies because it was a duplicate. Maybe the first one didn't come through right away when you replied via email.