I have the following dataset
patientID | Start_1_datetime | start_2_date_time |
1234 | 12/1/2019 1:34 | 12/1/2019 2:45 |
1234 | 12/2/2019 0:39 | 12/1/2019 2:45 |
1234 | 12/4/2019 4:32 | 12/1/2019 2:45 |
What I would like to do is that for the first patient sorted by start_1_date, I would like to create a new start_date_time and set to it start_2_date_time for first instance only. for other rows, use start_1_date_time
patientID | Start_1_datetime | start_2_date_time | new_start_date |
1234 | 12/1/2019 1:34 | 12/1/2019 2:45 | 12/1/2019 2:45 |
1234 | 12/2/2019 0:39 | 12/1/2019 2:45 | 12/2/2019 0:39 |
1234 | 12/4/2019 4:32 | 12/1/2019 2:45 | 12/4/2019 4:32 |
I did:
data a.test_2; set a.test_1;
if first.patientid then new_start_date_time = start_2_date_time;
else new_start_date_time = Start_1_datetime ;
format new_start_date_time datetime23.;run;
Didn't work. it uses start_1_datetime for all
I create character values for simplicity, but the principle is the same
data have;
infile datalines dlm=',';
input (patientID Start_1_datetime start_2_date_time)(:$20.);
datalines;
1234,12/1/2019 1:34,12/1/2019 2:45
1234,12/2/2019 0:39,12/1/2019 2:45
1234,12/4/2019 4:32,12/1/2019 2:45
;
data want;
set have;
by patientID;
if first.patientID then new_start_datetime=start_2_date_time;
else new_start_datetime=Start_1_datetime;
run;
I create character values for simplicity, but the principle is the same
data have;
infile datalines dlm=',';
input (patientID Start_1_datetime start_2_date_time)(:$20.);
datalines;
1234,12/1/2019 1:34,12/1/2019 2:45
1234,12/2/2019 0:39,12/1/2019 2:45
1234,12/4/2019 4:32,12/1/2019 2:45
;
data want;
set have;
by patientID;
if first.patientID then new_start_datetime=start_2_date_time;
else new_start_datetime=Start_1_datetime;
run;
Yes, that's one place where SAS is a little weak (methinks!). If you look closely in the log you may have seen a note saying something like "variable first.patientid is uninitialized". I think SAS should throw an error, or at least a warning, in these cases where you use first. or last. without a BY statement.
Because the problem seems to be that you forgot to write "by patiendid;". Try that, I think that is the problem.
data have;
input patientID Start_1_datetime $19. start_2_date_time $19.;
cards;
1234 12/1/2019 1:34 12/1/2019 2:45
1234 12/2/2019 0:39 12/1/2019 2:45
1234 12/4/2019 4:32 12/1/2019 2:45
;
data want;
set have;
by patientid;
if first.patientid then new_start_date=start_2_date_time;
else new_start_date=Start_1_datetime;
run;
Almost there:
data a.test_2; set a.test_1; BY PATIENTID; if first.patientid then new_start_date_time = start_2_date_time; else new_start_date_time = Start_1_datetime ; format new_start_date_time datetime23.; run;
To use the First. and Last. you must have the variable on a BY statement.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.