Hi All,
I have a dataset which looks something like the example provided below. In this example, my desired output would keep only the first observation for each ParticipantID. However, I need the values for Nurse and Team to be put into this first observation (which is currently blank). So, for this example, the output for ParticipantID=1 should have StartDate=8/1/2024, EndDate=8/16/2024, EndReason=Unknown, Nurse = A, Team=1. Any suggestions?
You'd better post your data as a DATA STEP code, nobody would like to type it for you if you need some help from others.
data have;
input id start : mmddyy10. end : mmddyy10. reason $ nurse $ team;
cards;
1 8/1/2024 8/16/2024 unknown . .
1 8/25/2024 8/30/2024 . A 1
1 9/15/2024 9/24/2024 . B 1
;
proc sort data=have out=temp;
by id descending start;
run;
data want;
update temp(obs=0) temp;
by id;
run;
So you want to group by ParticipantID?
You want to result in only one observation per group?
Do you want the values of STARTDATE ENDDATE ENDREASON from the first observation only and the first non-empty value of NURSE TEAM?
If so then just merge two copies of the datasets. One without those two variables and one with just those variables and only the non-missing observations.
data want;
merge have(drop=Nurse Team)
have(keep=ParticipantID Nurse Team
where=(not missing(Nurse)))
;
by participantID;
if first.partcipantID;
run;
Could the first non empty NURSE be on a different observation than the first non empty TEAM? IF so then merge three copies of the file instead of just two.
Thank you so much for your response! Yes, I want to group by ParticipantID, with: only 1 observation per ID, Start/EndDates, EndReason from the first observation, and the first non-empty values of both Nurse and Team. It is possible that the first non-empty Nurse value can be within a different observation from that of the first non-empty Team Value. So for your proposed solution, it would be to merge copies of the same dataset? I want to make sure I understand this.
Sure. Just replicate the pattern for finding the first non missing value so each variable is found separately.
data want;
merge have(drop=Nurse Team)
have(keep=ParticipantID Nurse where=(not missing(Nurse)))
have(keep=ParticipantID Team where=(not missing(Team)))
;
by participantID;
if first.partcipantID;
run;
Try it on your data and see if you like the results.
Thanks! I couldn't get this code to work, however. Log says the ParticipantID variable is uninitialized and returns a dataset with no observations.
@taishayla wrote:
Thanks! I couldn't get this code to work, however. Log says the ParticipantID variable is uninitialized and returns a dataset with no observations.
Make sure you know the NAMES of the variables in your dataset so that the code uses the right names.
You'd better post your data as a DATA STEP code, nobody would like to type it for you if you need some help from others.
data have;
input id start : mmddyy10. end : mmddyy10. reason $ nurse $ team;
cards;
1 8/1/2024 8/16/2024 unknown . .
1 8/25/2024 8/30/2024 . A 1
1 9/15/2024 9/24/2024 . B 1
;
proc sort data=have out=temp;
by id descending start;
run;
data want;
update temp(obs=0) temp;
by id;
run;
Thank you, this worked!
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.