Looking to see if a certain condition is met two time points before the time point that treatment is started. If the condition is not meet two time points before treatment began, the participant will be excluded from the dataset. The variable "date" shows the time point that treatment is begun (date = 8 means the participant began treatment at the 8th time point [t8]). t1-t12 show whether some criterion is met (0=not met, 1=met) at each time point. I've included an example of what my data looks like below, but my dataset has over 200 time points instead of 12 and thousands of records. So I'm looking for an efficient way of doing this.
data example;
input ID date t1 t2 t3 t4 t5 t6 t7 t8 t9 t10 t11 t12;
cards;
1 8 0 0 0 1 1 1 1 1 1 1 1 1
2 6 1 1 1 1 1 1 1 1 1 1 1 1
3 7 0 0 0 0 0 0 0 0 0 1 1 1
4 5 0 0 0 0 0 1 1 1 1 1 1 1
5 9 0 1 1 1 1 1 1 1 1 1 1 1
6 5 0 0 0 0 1 1 1 1 1 1 1 1
7 10 1 1 1 1 1 1 1 1 1 1 1 1
8 3 0 0 0 0 0 0 1 1 1 1 1 1
9 7 0 0 1 1 1 1 1 1 1 1 1 1
10 8 0 0 1 1 1 1 1 1 1 1 1 1
;
run;
Thanks!
Hello @hr667 and welcome to the SAS Support Communities!
You can count the 1s in the substring consisting of the first date-1 characters of the concatenation of variables t1, t2, ... and then select those observations where this count is at least 2.
data want;
set example;
if countc(substrn(cat(of t:),1,date-1),'1')>1;
run;
If needed, add a WHERE condition to restrict processing to observations with non-missing dates.
Edit: The above suggestion assumes that the criterion is: "The condition is met at least twice before treatment start."
If, instead, it is "The condition is met at time point t(x-2), where t(x) is the time point of treatment start" or "The condition is met at time points t(x-2) and t(x-1), where t(x) is the time point of treatment start", then use the first or second suggestion below, respectively:
data want2;
set example;
array t[*] t:;
if date>2 & t[date-2];
run;
data want3;
set example;
array t[*] t:;
if date>2 & t[date-2] & t[date-1];
run;
Hello @hr667 and welcome to the SAS Support Communities!
You can count the 1s in the substring consisting of the first date-1 characters of the concatenation of variables t1, t2, ... and then select those observations where this count is at least 2.
data want;
set example;
if countc(substrn(cat(of t:),1,date-1),'1')>1;
run;
If needed, add a WHERE condition to restrict processing to observations with non-missing dates.
Edit: The above suggestion assumes that the criterion is: "The condition is met at least twice before treatment start."
If, instead, it is "The condition is met at time point t(x-2), where t(x) is the time point of treatment start" or "The condition is met at time points t(x-2) and t(x-1), where t(x) is the time point of treatment start", then use the first or second suggestion below, respectively:
data want2;
set example;
array t[*] t:;
if date>2 & t[date-2];
run;
data want3;
set example;
array t[*] t:;
if date>2 & t[date-2] & t[date-1];
run;
@FreelanceReinh thank you so much, your code worked perfectly. It was the third case you gave that exactly fit with what I was trying to do. Thanks again!
Here is code to determine where the first occurrence of a value of 1 is in the sequence t1-t12. Once you know that, one more command lets you decide where whether it was two time points before the beginning of treatment, and I leave that to you as a homework assignment.
data want;
set example;
first1=whichn(1,of t1-t12);
run;
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.