Hello,
I am trying the determine participation by month. Each patient has as a start and an end date in the dataset. I would like to check if the patient was present in Jan, Feb, Mar,...Dec. At the end, I need variables for Jan--Dec with values of 0/1 for each patient. The study spans b/w 2015-05-21 --- 2017-07-29.
I would like to avoid writing a long code that looks like this:
if STARTDT le ('01JAN2016'd) le ENDDT then January=1;
else if STARTDT le ('02JAN2016'd) le ENDDT then January=1;
....
else if STARTDT le ('31JAN2016'd) le ENDDT then January=1;
Thank you for your insights regarding a more efficient way to do this!
MK
OK, here's a reasonable way to proceed.
data want;
set have;
array flags {12} January February March April May June July August September October November December;
n_months = intck('month', startdt, enddt);
starting_month = month(startdt);
do k=0 to min(11, n_months);
mm = starting_month + k;
if mm > 12 then mm = mm - 12;
flags{mm} = 1;
end;
run;
It's untested, but looks about right. See what you get.
Let's start with the planning first. Your study spans over two years. Do you really want January=1 to indicate any January, so you can't tell where it's January 2016 or January 2017 or both?
The year is not important. So, January=1 is any January. Also, If a patient was present in Jan 2016 and Jan 2017, January=1.
Thank you,
Mira
OK, here's a reasonable way to proceed.
data want;
set have;
array flags {12} January February March April May June July August September October November December;
n_months = intck('month', startdt, enddt);
starting_month = month(startdt);
do k=0 to min(11, n_months);
mm = starting_month + k;
if mm > 12 then mm = mm - 12;
flags{mm} = 1;
end;
run;
It's untested, but looks about right. See what you get.
Hi --,
This solution works very well! 🙂
Thank you so much!
MK
Astounding, would you mind to add some comments to this part of your code to help me understand what these statements do:
do k=0 to min(11, n_months);
mm = starting_month + k;
if mm > 12 then mm = mm - 12;
flags{mm} = 1;
end;
I understand the rest of the code.
Many thanks!
MK
Working backwards.....
flags{mm}=1;
Take one of the 12 variables stored in the array named FLAGS, and set its value to 1. Which variable? It depends on the value of mm, which had better range from 1 to 12 since the array contains 12 elements.
if mm > 12 then mm = mm - 12;
Since mm is to be used to determine which of the 12 variables gets set to 1, its value needs to range from 1 to 12. If it exceeds 12, instead of looking for month 13, subtract 12 so we're only looking for month 1 instead.
mm = starting_month + k;
Which month are we looking for? Remember, starting_month is the month (1 thru 12) of STARTDT. So begin at that month (k=0, so mm=starting_month). As the loop continues, k increases. So each time through the loop, look for the next highest month.
min(11, n_months)
There's no need to cycle through any more than 12 months. After 12 times, all 12 flags will be 1.
Thank you for the explanation, Astounding!
To test if two intervals S1 to E2 and S2 to E2 overlap you need to test
s1 <= e2 and e1 >= s2
To test all 12 months you just need to test 12 months starting with the month that STARTDT is in.
Let's make some test data.
data have ;
input startdt enddt ;
format startdt enddt yymmdd10. ;
informat startdt enddt yymmdd. ;
cards;
20180101 20180130
20160229 20170301
20160305 20160402
;
Now let's create the flags.
data want ;
set have ;
array months jan feb mar apr may jun jul aug sep oct nov dec ;
do i=0 to 11 ;
months(1+mod(month(startdt)+i-1,12))
= startdt <= intnx('month',startdt,i,'e') and enddt >= intnx('month',startdt,i,'b')
;
end;
drop i;
run;
Thank you, Tom. This solution works very well and produces identical results to Astounding's code and also flags the 0s (instead of assigning missing). Would you mind to explain what this part of the code does:
months(1+mod(month(startdt)+i-1,12))
Thank you,
MK
@MiraKr_ wrote:
Thank you, Tom. This solution works very well and produces identical results to Astounding's code and also flags the 0s (instead of assigning missing). Would you mind to explain what this part of the code does:
months(1+mod(month(startdt)+i-1,12))
Thank you,
MK
It is the left part of an assignment statement that continues on the following lines.
Basically is indexing into the proper entry in the array named MONTHS for the current iteration of the DO loop.
You can work out the math yourself to see how it gets to the proper index number.
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.