Hello,
I would like to identify which of 4 different interventions occurred first for each patient. I have included sample variables and code. Thanks!
sample data set:
patient1 intervention1 int1_date intervention2 int2_date intervention3 int3_date intervention4 int4_date
patient2 ............
data test2;
set test;
by ID;
if (int1_date < int2_date and int1_date < int3_date and int1_date < int4_date) then int =1;
if (int2_date < int1_date and int2_date < int3_date and int2_date < int4_date) then int =2;
if (int3_date < int1_date and int3_date <int2_date and int3_date < int4_date) then int =3;
if (int4_date< int1_date and int4_date <int2_date and int4_date < int3_date) then int =5;
else int = 5;
run;
Transpose to a long dataset, sort by patient and date, and then extract the information at first.patient from _name_.
Maxim 19: Long Beats Wide.
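The transpose approach described above could look roughly like the sketch below. It assumes the dataset is named test, has one row per patient identified by ID, and holds numeric SAS dates in int1_date through int4_date, as in the question. The dataset names test_sorted, long, long_sorted and first_int, and the variable int_date, are made up for illustration.

```sas
/* Sketch only: dataset and variable names are assumptions based on the question */
proc sort data=test out=test_sorted;
  by ID;
run;

/* One row per patient and date variable; _NAME_ holds the source variable name */
proc transpose data=test_sorted out=long(rename=(col1=int_date));
  by ID;
  var int1_date int2_date int3_date int4_date;
run;

/* Drop missing dates so they cannot sort ahead of real dates */
proc sort data=long(where=(not missing(int_date))) out=long_sorted;
  by ID int_date;
run;

/* Keep the earliest row per patient */
data first_int;
  set long_sorted;
  by ID;
  if first.ID;
  first_intervention = _name_;
run;
```

Here first_intervention holds the name of the date variable (for example int2_date), which identifies the earliest intervention. Patients whose dates are all missing do not appear in first_int.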
Arrays and WHICHN() are the key items here.
data test2;
set test1;
array _dates(*) int1_date int2_date ....;
array _ints(*) intervention1-intervention4;
*find the minimum date - ignores missing;
min_date = min(of _dates(*));
*index of the lowest date - on ties, WHICHN returns the first match;
index_min_date = whichn(min_date, of _dates(*));
*get relevant intervention based on index;
first_intervention = _ints(index_min_date);
run;
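For anyone who wants to try the array approach end to end, here is a small self-contained sketch. The sample rows are hypothetical, the variable names follow the question, and the intervention variables are assumed to be character. The N() check is an addition of mine: if every date is missing, WHICHN() returns missing and the array reference would fail with a subscript error.

```sas
/* Hypothetical sample data; variable names follow the question */
data test;
  input ID intervention1 $ int1_date :date9. intervention2 $ int2_date :date9.
        intervention3 $ int3_date :date9. intervention4 $ int4_date :date9.;
  format int1_date int2_date int3_date int4_date date9.;
  datalines;
1 A 05JAN2020 B 03JAN2020 C 10JAN2020 D 03JAN2020
2 A . B . C . D .
;

data test2;
  set test;
  array _dates(*) int1_date int2_date int3_date int4_date;
  array _ints(*) intervention1-intervention4;
  /* N() counts nonmissing values; skip rows with no dates at all */
  if n(of _dates(*)) > 0 then do;
    min_date = min(of _dates(*));
    index_min_date = whichn(min_date, of _dates(*));
    first_intervention = _ints(index_min_date);
  end;
  format min_date date9.;
run;
```

For patient 1 the earliest date is tied between positions 2 and 4, so index_min_date is 2 and first_intervention is B. For patient 2 the new variables stay missing.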
Here's a tutorial on using Arrays in SAS
https://stats.idre.ucla.edu/sas/seminars/sas-arrays/
One could use your approach but much more simply:
data test2;
set test;
if int_date1=min(of int_date:) then int=1;
else if int_date2=min(of int_date:) then int=2;
else if int_date3=min(of int_date:) then int=3;
else if int_date4=min(of int_date:) then int=4;
run;
Or more generally:
data test2;
set test;
int=whichn(min(of int_date:),of int_date:);
run;
The latter assumes your variables are named int_date1, int_date2, ... int_date4, in that order, and that no other variable names begin with INT_DATE.
The nice thing about the latter is that it accommodates any number of consecutively named INT_DATE variables.
One issue applies to both techniques: if there is a tie for the minimum (earliest) date, the leftmost variable is always chosen.
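If ties matter, one option is to count how many dates equal the minimum and flag those rows for review. This is only a sketch: it assumes the int_date1 to int_date4 naming used above, and n_tied and min_date are made-up variable names.

```sas
/* Sketch: count how many date variables share the earliest date */
data test2;
  set test;
  array _d(*) int_date1-int_date4;
  min_date = min(of _d(*));
  int = whichn(min_date, of _d(*));
  n_tied = 0;
  if not missing(min_date) then do i = 1 to dim(_d);
    if _d(i) = min_date then n_tied = n_tied + 1;
  end;
  drop i;
run;
```

Rows with n_tied greater than 1 have more than one intervention on the earliest date, and int points at the leftmost of them.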