I have the following dataset
| patientID | Start_1_datetime | start_2_date_time |
| 1234 | 12/1/2019 1:34 | 12/1/2019 2:45 |
| 1234 | 12/2/2019 0:39 | 12/1/2019 2:45 |
| 1234 | 12/4/2019 4:32 | 12/1/2019 2:45 |
What I would like to do is that for the first patient sorted by start_1_date, I would like to create a new start_date_time and set to it start_2_date_time for first instance only. for other rows, use start_1_date_time
| patientID | Start_1_datetime | start_2_date_time | new_start_date |
| 1234 | 12/1/2019 1:34 | 12/1/2019 2:45 | 12/1/2019 2:45 |
| 1234 | 12/2/2019 0:39 | 12/1/2019 2:45 | 12/2/2019 0:39 |
| 1234 | 12/4/2019 4:32 | 12/1/2019 2:45 | 12/4/2019 4:32 |
I did:
data a.test_2; set a.test_1;
if first.patientid then new_start_date_time = start_2_date_time;
else new_start_date_time = Start_1_datetime ;
format new_start_date_time datetime23.;run;
Didn't work. it uses start_1_datetime for all
I create character values for simplicity, but the principle is the same
data have;
infile datalines dlm=',';
input (patientID Start_1_datetime start_2_date_time)(:$20.);
datalines;
1234,12/1/2019 1:34,12/1/2019 2:45
1234,12/2/2019 0:39,12/1/2019 2:45
1234,12/4/2019 4:32,12/1/2019 2:45
;
data want;
set have;
by patientID;
if first.patientID then new_start_datetime=start_2_date_time;
else new_start_datetime=Start_1_datetime;
run;
I create character values for simplicity, but the principle is the same
data have;
infile datalines dlm=',';
input (patientID Start_1_datetime start_2_date_time)(:$20.);
datalines;
1234,12/1/2019 1:34,12/1/2019 2:45
1234,12/2/2019 0:39,12/1/2019 2:45
1234,12/4/2019 4:32,12/1/2019 2:45
;
data want;
set have;
by patientID;
if first.patientID then new_start_datetime=start_2_date_time;
else new_start_datetime=Start_1_datetime;
run;
Yes, that's one place where SAS is a little weak (methinks!). If you look closely in the log you may have seen a note saying something like "variable first.patientid is uninitialized". I think SAS should throw an error, or at least a warning, in these cases where you use first. or last. without a BY statement.
Because the problem seems to be that you forgot to write "by patiendid;". Try that, I think that is the problem.
data have;
input patientID Start_1_datetime $19. start_2_date_time $19.;
cards;
1234 12/1/2019 1:34 12/1/2019 2:45
1234 12/2/2019 0:39 12/1/2019 2:45
1234 12/4/2019 4:32 12/1/2019 2:45
;
data want;
set have;
by patientid;
if first.patientid then new_start_date=start_2_date_time;
else new_start_date=Start_1_datetime;
run;
Almost there:
data a.test_2; set a.test_1; BY PATIENTID; if first.patientid then new_start_date_time = start_2_date_time; else new_start_date_time = Start_1_datetime ; format new_start_date_time datetime23.; run;
To use the First. and Last. you must have the variable on a BY statement.
April 27 – 30 | Gaylord Texan | Grapevine, Texas
Walk in ready to learn. Walk out ready to deliver. This is the data and AI conference you can't afford to miss.
Register now and lock in 2025 pricing—just $495!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.