BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
JKS1
Fluorite | Level 6

I am trying to generate a count of how many total visits there were in a specific time period using the start date as a reference. There are 400+ observations and 100 visits. 

 

Study design: On the start date, participants were expected to return to the clinic at 12 months ± 3 months, 24 months ± 3 months, and 36 months ± 3 months. 

 

Data: (example)

ID       Startdate        visit1                 visit2                visit3     ...        visit100

123    12MAR2009    13MAR2010     2AUG2010      4JUN2012       5DEC2015 

456    14MAR2009    16MAR2010     4AUG2010      5JUN2012       10DEC2015 

789    16MAR2009    19MAR2010     6AUG2010      6JUN2012       15DEC2015 

 

What I need to know is if a patient has a visit within the range of:

  • 9-15 months from the start date for 12 month visit and
  • 21-27 months from the start date for 24 month visit and
  • 33-39 months from the start date for 36 month visit

 

Using the attached set, this is an example of what the reporting table would look like…

12 month visits

29*

24 month visits

18

36 month visits

10

*This means that there were 29 visits that occurred between 9-15 months from the start date.

1 ACCEPTED SOLUTION

Accepted Solutions
Kurt_Bremser
Super User

First, transpose from wide to long format, then cumulate per ID:

proc transpose data=have out=visits (drop=_NAME_ rename=(COL1=visit));
by ID;
copy startdate;
run;

data want (keep=ID visit12 visit24 visit36);
set visits;
by ID;
retain start visit12 visit24 visit36;
if first.ID
then do
  start = startdate;
  visit12 = 0;
  visit24 = 0;
  visit36 = 0;
end;
if intnx('month',start,9) <= visit < intnx('month',start,16) then visit12 + 1;
if intnx('month',start,21) <= visit < intnx('month',start,28) then visit24 + 1;
if intnx('month',start,33) <= visit < intnx('month',start,40) then visit36 + 1;
if last.ID then output;
run;

To get total sums, just don't reset the counters with every group change, and provide only 1 output at EOF.

View solution in original post

8 REPLIES 8
RW9
Diamond | Level 26 RW9
Diamond | Level 26

As with most problems, it is your data structure which makes your problem difficult.  For programming purposes it is far easier to program with, and maintain if you use normalised data structures - fixed structure with data in the dataset.  This makes your problem very simple:

data have;
  id=123; startdate="12mar2009"d; visit1="13mar2010"d; visit2="02aug2010"d;
  format startdate visit: date9.;
run;

proc transpose data=have out=inter;
  by id startdate;
  var visit:;
run;

data want;
  set inter;
  length month9_15 month21_27 $1;
  retain month9_15 month21_27;
  by id;
  if intnx("month",startdate,9) <= col1 <= intnx("month",startdate,15) then month9_15="Y";
  if intnx("month",startdate,21) <= col1 <= intnx("month",startdate,27) then month21_27="Y";
  if last.id then output;
run;

For example gives you a dataset with one row per subject and a flag for each visit, its then a simple proc means (or you could total in that datastep).

Kurt_Bremser
Super User

First, transpose from wide to long format, then cumulate per ID:

proc transpose data=have out=visits (drop=_NAME_ rename=(COL1=visit));
by ID;
copy startdate;
run;

data want (keep=ID visit12 visit24 visit36);
set visits;
by ID;
retain start visit12 visit24 visit36;
if first.ID
then do
  start = startdate;
  visit12 = 0;
  visit24 = 0;
  visit36 = 0;
end;
if intnx('month',start,9) <= visit < intnx('month',start,16) then visit12 + 1;
if intnx('month',start,21) <= visit < intnx('month',start,28) then visit24 + 1;
if intnx('month',start,33) <= visit < intnx('month',start,40) then visit36 + 1;
if last.ID then output;
run;

To get total sums, just don't reset the counters with every group change, and provide only 1 output at EOF.

JKS1
Fluorite | Level 6
Hi - When I attempted to transpose the set, the date value was lost? The start date also only shows up on the first ID and should show up for each visit.
Kurt_Bremser
Super User

@JKS1 wrote:
Hi - When I attempted to transpose the set, the date value was lost? The start date also only shows up on the first ID and should show up for each visit.

You probably meant that the FORMAT was lost. That's not a big issue, you can assign the format later when you need to display values.

Since you are looking for the sums of visits, anyway, you won't need it at all in intermediate datasets.

And the second issue is resolved by keeping the startdate value in RETAINed variable, as I did in my data step.

 

Astounding
PROC Star

Believe it or not, the programming will be easy.  The hard part is defining the solution.  For example, with a start date of 12MAR2009, what time period should the 9-15 month interval include?  Do you want to count days rather than months to define the interval?  (Consider what should happen if the start date were 30MAY2009 and there is no such thing as 30FEB2010.)  If a person has multiple visits within that interval, should all the person's visits be counted or just 1?

 

Once those things are defined, running through an array in a DATA step will be easy.

JKS1
Fluorite | Level 6
Hi Astounding, I was able to program this in STATA using this code:
replace visit36mon=1 if (vs1-artstartdate)>=1005 & (vs1-artstartdate)<=1185 & (vs1~=.) |

--> This program counts any visit occurring with a date (by day) range as =1.
Astounding
PROC Star

Perfect.  You can do the same in SAS.  It will make a difference, though, whether your DATE variables are actually numeric SAS dates, or whether they are character strings.  If they are numeric SAS dates (the preferred method for many reasons):

 

data want;

set have;

array visits {100} visit1-visit100;

_12_mo_visits=0;

_24_mo_visits=0;

_36_mo_visits=0;

do _n_=1 to 100;

   if startdate + 1005 <= visits{_n_} <= startdate + 1185 then _36_mo_visits + 1;

   * Add 2 more statements here, for the 12-month and 24-month counts;

end;

run;

 

That gives you 3 new variables in the data set, with counts per observation:  how many visits in each of the ranges.  You can add them up easily:

 

proc means data=want sum;

var _12_mo_visits _24_mo_visits _36_mo_visits;

run;

 

Or  you can count patients with each level of # of visits instead:

 

proc freq data=want;

   tables _12_mo_visits _24_mo_visits _36_mo_visits;

run;

 

Good luck.

 

*** EDITED:  If your date variables are actually character strings rather than numeric SAS dates, only one piece of the program needs to change:

 

 

   if startdate + 1005 <= input(visits{_n_}, date9.) <= startdate + 1185 then _36_mo_visits + 1;

 

The same would apply to STARTDATE, depending on whether that is a numeric SAS date vs. a character string.

Ksharp
Super User
I think no need proc transpose.
Change Kurt's code :


data want (keep=ID visit12 visit24 visit36);
set visits;
array x{*} visit: ;

  _visit12 = 0;
  _visit24 = 0;
  _visit36 = 0;

do i=1 to dim(x);
if intnx('month',start,9) <= x{i}< intnx('month',start,16) then _visit12 + 1;
if intnx('month',start,21) <= x{i} < intnx('month',start,28) then _visit24 + 1;
if intnx('month',start,33) <= x{i} < intnx('month',start,40) then _visit36 + 1;
end;
run;


sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 8 replies
  • 1512 views
  • 3 likes
  • 5 in conversation