BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
MiraKr_
Obsidian | Level 7

Hello,

I am trying the determine participation by month. Each patient has as a start and an end date in the dataset. I would like to check if the patient was present in Jan, Feb, Mar,...Dec. At the end, I need variables for Jan--Dec with values of 0/1 for each patient. The study spans b/w 2015-05-21 --- 2017-07-29. 

 

I would like to avoid writing a long code that looks like this:

if STARTDT le ('01JAN2016'd) le ENDDT then January=1;
else if STARTDT le ('02JAN2016'd) le ENDDT then January=1;
....
else if STARTDT le ('31JAN2016'd) le ENDDT then January=1;

 

Thank you for your insights regarding a more efficient way to do this!

 

MK

1 ACCEPTED SOLUTION

Accepted Solutions
Astounding
PROC Star

OK, here's a reasonable way to proceed.

 

data want;

set have;

array flags {12} January February March April May June July August September October November December;

n_months = intck('month', startdt, enddt);

starting_month = month(startdt);

do k=0 to min(11, n_months);

   mm = starting_month + k;

   if mm > 12 then mm = mm - 12;

   flags{mm} = 1;

end;

run;

 

It's untested, but looks about right.  See what you get.

View solution in original post

10 REPLIES 10
Astounding
PROC Star

Let's start with the planning first.  Your study spans over two years.  Do  you really want January=1 to indicate any January, so you can't tell where it's January 2016 or January 2017 or both?

MiraKr_
Obsidian | Level 7

The year is not important. So,  January=1 is any January.  Also, If a patient was present in Jan 2016 and Jan 2017, January=1. 

Thank you,

Mira

Astounding
PROC Star

OK, here's a reasonable way to proceed.

 

data want;

set have;

array flags {12} January February March April May June July August September October November December;

n_months = intck('month', startdt, enddt);

starting_month = month(startdt);

do k=0 to min(11, n_months);

   mm = starting_month + k;

   if mm > 12 then mm = mm - 12;

   flags{mm} = 1;

end;

run;

 

It's untested, but looks about right.  See what you get.

MiraKr_
Obsidian | Level 7

Hi --,

This solution works very well! 🙂

 

Thank you so much!

 

MK

MiraKr_
Obsidian | Level 7

Astounding, would you mind to add some comments to this part of your code to help me understand what these statements do: 

 

do k=0 to min(11, n_months);

mm = starting_month + k;

if mm > 12 then mm = mm - 12;

flags{mm} = 1;

end;

 

I understand the rest of the code. 

 

Many thanks!

MK

 

Astounding
PROC Star

Working backwards.....

 

flags{mm}=1;

Take one of the 12 variables stored in the array named FLAGS, and set its value to 1.  Which variable?  It depends on the value of mm, which had better range from 1 to 12 since the array contains 12 elements.

 

if mm > 12 then mm = mm - 12;

Since mm is to be used to determine which of the 12 variables gets set to 1, its value needs to range from 1 to 12.  If it exceeds 12, instead of looking for month 13, subtract 12 so we're only looking for month 1 instead.

 

mm = starting_month + k;

Which month are we looking for?  Remember, starting_month is the month (1 thru 12) of STARTDT.  So begin at that month (k=0, so mm=starting_month).  As the loop continues, k increases.  So each time through the loop, look for the next highest month.

 

min(11, n_months)

There's no need to cycle through any more than 12 months.  After 12 times, all 12 flags will be 1.

MiraKr_
Obsidian | Level 7

Thank you for the explanation, Astounding! 

Tom
Super User Tom
Super User

To test if two intervals S1 to E2 and S2 to E2 overlap you need to test

s1 <= e2 and e1 >= s2

To test all 12 months you just need to test 12 months starting with the month that STARTDT is in.

Let's make some test data.

data have ;
  input startdt enddt ;
  format startdt enddt yymmdd10. ;
  informat startdt enddt yymmdd. ;
cards;
20180101 20180130
20160229 20170301
20160305 20160402
;

Now let's create the flags.

data want ;
  set have ;
  array months jan feb mar apr may jun jul aug sep oct nov dec ;
  do i=0 to 11 ;
    months(1+mod(month(startdt)+i-1,12))
       = startdt <= intnx('month',startdt,i,'e')  and enddt >= intnx('month',startdt,i,'b')
    ;
  end;
  drop i;
run;

image.png

MiraKr_
Obsidian | Level 7

Thank you, Tom. This solution works very well and produces identical results to Astounding's code and also flags the 0s (instead of assigning missing).  Would you mind to explain what this part of the code does:

 months(1+mod(month(startdt)+i-1,12))

Thank you,

MK 

Tom
Super User Tom
Super User

@MiraKr_ wrote:

Thank you, Tom. This solution works very well and produces identical results to Astounding's code and also flags the 0s (instead of assigning missing).  Would you mind to explain what this part of the code does:

 months(1+mod(month(startdt)+i-1,12))

Thank you,

MK 


It is the left part of an assignment statement that continues on the following lines.

Basically is indexing into the proper entry in the array named MONTHS for the current iteration of the DO loop.

You can work out the math yourself to see how it gets to the proper index number.

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 10 replies
  • 1552 views
  • 0 likes
  • 3 in conversation