DATA Step, Macro, Functions and more

Calculate date intervals using a date as a reference

Accepted Solution Solved
Reply
New Contributor
Posts: 3
Accepted Solution

Calculate date intervals using a date as a reference

I am trying to generate a count of how many total visits there were in a specific time period using the start date as a reference. There are 400+ observations and 100 visits. 

 

Study design: On the start date, participants were expected to return to the clinic at 12 months ± 3 months, 24 months ± 3 months, and 36 months ± 3 months. 

 

Data: (example)

ID       Startdate        visit1                 visit2                visit3     ...        visit100

123    12MAR2009    13MAR2010     2AUG2010      4JUN2012       5DEC2015 

456    14MAR2009    16MAR2010     4AUG2010      5JUN2012       10DEC2015 

789    16MAR2009    19MAR2010     6AUG2010      6JUN2012       15DEC2015 

 

What I need to know is if a patient has a visit within the range of:

  • 9-15 months from the start date for 12 month visit and
  • 21-27 months from the start date for 24 month visit and
  • 33-39 months from the start date for 36 month visit

 

Using the attached set, this is an example of what the reporting table would look like…

12 month visits

29*

24 month visits

18

36 month visits

10

*This means that there were 29 visits that occurred between 9-15 months from the start date.


Accepted Solutions
Solution
‎09-07-2016 03:12 AM
Super User
Posts: 6,972

Re: Calculate date intervals using a date as a reference

First, transpose from wide to long format, then cumulate per ID:

proc transpose data=have out=visits (drop=_NAME_ rename=(COL1=visit));
by ID;
copy startdate;
run;

data want (keep=ID visit12 visit24 visit36);
set visits;
by ID;
retain start visit12 visit24 visit36;
if first.ID
then do
  start = startdate;
  visit12 = 0;
  visit24 = 0;
  visit36 = 0;
end;
if intnx('month',start,9) <= visit < intnx('month',start,16) then visit12 + 1;
if intnx('month',start,21) <= visit < intnx('month',start,28) then visit24 + 1;
if intnx('month',start,33) <= visit < intnx('month',start,40) then visit36 + 1;
if last.ID then output;
run;

To get total sums, just don't reset the counters with every group change, and provide only 1 output at EOF.

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers

View solution in original post


All Replies
Super User
Super User
Posts: 7,430

Re: Calculate date intervals using a date as a reference

As with most problems, it is your data structure which makes your problem difficult.  For programming purposes it is far easier to program with, and maintain if you use normalised data structures - fixed structure with data in the dataset.  This makes your problem very simple:

data have;
  id=123; startdate="12mar2009"d; visit1="13mar2010"d; visit2="02aug2010"d;
  format startdate visit: date9.;
run;

proc transpose data=have out=inter;
  by id startdate;
  var visit:;
run;

data want;
  set inter;
  length month9_15 month21_27 $1;
  retain month9_15 month21_27;
  by id;
  if intnx("month",startdate,9) <= col1 <= intnx("month",startdate,15) then month9_15="Y";
  if intnx("month",startdate,21) <= col1 <= intnx("month",startdate,27) then month21_27="Y";
  if last.id then output;
run;

For example gives you a dataset with one row per subject and a flag for each visit, its then a simple proc means (or you could total in that datastep).

Solution
‎09-07-2016 03:12 AM
Super User
Posts: 6,972

Re: Calculate date intervals using a date as a reference

First, transpose from wide to long format, then cumulate per ID:

proc transpose data=have out=visits (drop=_NAME_ rename=(COL1=visit));
by ID;
copy startdate;
run;

data want (keep=ID visit12 visit24 visit36);
set visits;
by ID;
retain start visit12 visit24 visit36;
if first.ID
then do
  start = startdate;
  visit12 = 0;
  visit24 = 0;
  visit36 = 0;
end;
if intnx('month',start,9) <= visit < intnx('month',start,16) then visit12 + 1;
if intnx('month',start,21) <= visit < intnx('month',start,28) then visit24 + 1;
if intnx('month',start,33) <= visit < intnx('month',start,40) then visit36 + 1;
if last.ID then output;
run;

To get total sums, just don't reset the counters with every group change, and provide only 1 output at EOF.

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
New Contributor
Posts: 3

Re: Calculate date intervals using a date as a reference

Hi - When I attempted to transpose the set, the date value was lost? The start date also only shows up on the first ID and should show up for each visit.
Super User
Posts: 6,972

Re: Calculate date intervals using a date as a reference


JKS1 wrote:
Hi - When I attempted to transpose the set, the date value was lost? The start date also only shows up on the first ID and should show up for each visit.

You probably meant that the FORMAT was lost. That's not a big issue, you can assign the format later when you need to display values.

Since you are looking for the sums of visits, anyway, you won't need it at all in intermediate datasets.

And the second issue is resolved by keeping the startdate value in RETAINed variable, as I did in my data step.

 

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
Super User
Posts: 5,099

Re: Calculate date intervals using a date as a reference

Believe it or not, the programming will be easy.  The hard part is defining the solution.  For example, with a start date of 12MAR2009, what time period should the 9-15 month interval include?  Do you want to count days rather than months to define the interval?  (Consider what should happen if the start date were 30MAY2009 and there is no such thing as 30FEB2010.)  If a person has multiple visits within that interval, should all the person's visits be counted or just 1?

 

Once those things are defined, running through an array in a DATA step will be easy.

New Contributor
Posts: 3

Re: Calculate date intervals using a date as a reference

Hi Astounding, I was able to program this in STATA using this code:
replace visit36mon=1 if (vs1-artstartdate)>=1005 & (vs1-artstartdate)<=1185 & (vs1~=.) |

--> This program counts any visit occurring with a date (by day) range as =1.
Super User
Posts: 5,099

Re: Calculate date intervals using a date as a reference

[ Edited ]

Perfect.  You can do the same in SAS.  It will make a difference, though, whether your DATE variables are actually numeric SAS dates, or whether they are character strings.  If they are numeric SAS dates (the preferred method for many reasons):

 

data want;

set have;

array visits {100} visit1-visit100;

_12_mo_visits=0;

_24_mo_visits=0;

_36_mo_visits=0;

do _n_=1 to 100;

   if startdate + 1005 <= visits{_n_} <= startdate + 1185 then _36_mo_visits + 1;

   * Add 2 more statements here, for the 12-month and 24-month counts;

end;

run;

 

That gives you 3 new variables in the data set, with counts per observation:  how many visits in each of the ranges.  You can add them up easily:

 

proc means data=want sum;

var _12_mo_visits _24_mo_visits _36_mo_visits;

run;

 

Or  you can count patients with each level of # of visits instead:

 

proc freq data=want;

   tables _12_mo_visits _24_mo_visits _36_mo_visits;

run;

 

Good luck.

 

*** EDITED:  If your date variables are actually character strings rather than numeric SAS dates, only one piece of the program needs to change:

 

 

   if startdate + 1005 <= input(visits{_n_}, date9.) <= startdate + 1185 then _36_mo_visits + 1;

 

The same would apply to STARTDATE, depending on whether that is a numeric SAS date vs. a character string.

Super User
Posts: 9,691

Re: Calculate date intervals using a date as a reference

I think no need proc transpose.
Change Kurt's code :


data want (keep=ID visit12 visit24 visit36);
set visits;
array x{*} visit: ;

  _visit12 = 0;
  _visit24 = 0;
  _visit36 = 0;

do i=1 to dim(x);
if intnx('month',start,9) <= x{i}< intnx('month',start,16) then _visit12 + 1;
if intnx('month',start,21) <= x{i} < intnx('month',start,28) then _visit24 + 1;
if intnx('month',start,33) <= x{i} < intnx('month',start,40) then _visit36 + 1;
end;
run;


☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 8 replies
  • 386 views
  • 3 likes
  • 5 in conversation