Solved: Re: Write column values from one observation to another

taishayla · Posted 09-26-2024 12:13 PM

Hi All,

I have a dataset which looks something like the example provided below. In this example, my desired output would keep only the first observation for each ParticipantID. However, I need the values for Nurse and Team to be put into this first observation (which is currently blank). So, for this example, the output for ParticipantID=1 should have StartDate=8/1/2024, EndDate=8/16/2024, EndReason=Unknown, Nurse = A, Team=1. Any suggestions?

Ksharp · Posted 09-26-2024 09:30 PM

You'd better post your data as a DATA STEP code, nobody would like to type it for you if you need some help from others.

data have;
input id start : mmddyy10. end : mmddyy10. reason $ nurse $ team;
cards;
1 8/1/2024 8/16/2024 unknown . .
1 8/25/2024 8/30/2024 . A 1
1 9/15/2024 9/24/2024 . B 1
;

proc sort data=have out=temp;
by id descending start;
run;
data want;
 update temp(obs=0) temp;
 by id;
run;

View solution in original post

Tom · Posted 09-26-2024 12:31 PM

So you want to group by ParticipantID?

You want to result in only one observation per group?

Do you want the values of STARTDATE ENDDATE ENDREASON from the first observation only and the first non-empty value of NURSE TEAM?

If so then just merge two copies of the datasets. One without those two variables and one with just those variables and only the non-missing observations.

data want;
  merge have(drop=Nurse Team)
        have(keep=ParticipantID Nurse Team 
             where=(not missing(Nurse)))
  ;
  by participantID;
  if first.partcipantID;
run;

Could the first non empty NURSE be on a different observation than the first non empty TEAM? IF so then merge three copies of the file instead of just two.

taishayla · Posted 09-26-2024 02:52 PM

Thank you so much for your response! Yes, I want to group by ParticipantID, with: only 1 observation per ID, Start/EndDates, EndReason from the first observation, and the first non-empty values of both Nurse and Team. It is possible that the first non-empty Nurse value can be within a different observation from that of the first non-empty Team Value. So for your proposed solution, it would be to merge copies of the same dataset? I want to make sure I understand this.

Tom · Posted 09-26-2024 03:01 PM

Sure. Just replicate the pattern for finding the first non missing value so each variable is found separately.

data want;
  merge have(drop=Nurse Team)
        have(keep=ParticipantID Nurse where=(not missing(Nurse)))
        have(keep=ParticipantID Team where=(not missing(Team)))
  ;
  by participantID;
  if first.partcipantID;
run;

Try it on your data and see if you like the results.

taishayla · Posted 09-27-2024 02:33 PM

Thanks! I couldn't get this code to work, however. Log says the ParticipantID variable is uninitialized and returns a dataset with no observations.

Tom · Posted 09-27-2024 02:36 PM

@taishayla wrote:

Thanks! I couldn't get this code to work, however. Log says the ParticipantID variable is uninitialized and returns a dataset with no observations.

Make sure you know the NAMES of the variables in your dataset so that the code uses the right names.

Ksharp · Posted 09-26-2024 09:30 PM

You'd better post your data as a DATA STEP code, nobody would like to type it for you if you need some help from others.

data have;
input id start : mmddyy10. end : mmddyy10. reason $ nurse $ team;
cards;
1 8/1/2024 8/16/2024 unknown . .
1 8/25/2024 8/30/2024 . A 1
1 9/15/2024 9/24/2024 . B 1
;

proc sort data=have out=temp;
by id descending start;
run;
data want;
 update temp(obs=0) temp;
 by id;
run;

taishayla · Posted 09-27-2024 02:35 PM

Thank you, this worked!

Registration is open

SAS Training: Just a Click Away