BookmarkSubscribeRSS Feed
raajdesaii
Fluorite | Level 6

Hello all,

 

I have dataset for a patient population with a start and an end date. I have also have the date for their visits to the hospital. I want to create a new variable which counts their visits to the hospital including the months with zero visits. 

 

The dataset looks like this:

IDStartdateEnddateVisitdate
115JAN201609AUG201610MAR2016
115JAN201609AUG201613MAR2016
115JAN201609AUG201609JUN2016
208MAR201612JUL201615MAR2016
208MAR201612JUL201625MAR2016
208MAR201612JUL20166JUNE2016
208MAR201612JUL20163JUL2016

 

 

What I want is:

IDStartdateEnddateVisitdateNewdateCount
115JAN201609AUG201610MAR2016JAN20160
115JAN201609AUG201613MAR2016FEB20160
115JAN201609AUG201609JUN2016MAR20162
115JAN201609AUG2016 APR20160
115JAN201609AUG2016 MAY20160
115JAN201609AUG2016 JUN20161
115JAN201609AUG2016 JUL20160
115JAN201609AUG2016 AUG20160
208MAR201612JUL201615MAR2016MAR20162
208MAR201612JUL201625MAR2016APR20160
208MAR201612JUL20166JUNE2016MAY20160
208MAR201612JUL20163JUL2016JUN20161
208MAR201612JUL2016 JUL20161

 

I am looking to create these two variables in the dataset so I can get the counts for the visits by each month between the first and last date. Please help me find a solution for this. 

 

Thanks!

5 REPLIES 5
PhilC
Rhodochrosite | Level 12

what if the number of visits is greater than the number of months between Startdate and Enddate?

raajdesaii
Fluorite | Level 6
The dates for the visits lie between the start and end dates.
RichardDeVen
Barite | Level 11

The 'WANT' dataset does not make much sense on it's own.  You should not mix aggregate data (those counts) with detail data in the same row.

 

What if an ID had 30 detail records for visits over 3 months ? You would a lot more detail (30) than aggregates (3).

 

A more reasonable 'want' would look like

id month count
-- ----- -----

Perhaps reported in a layout such as

id startmonth visits_count_in_month_1 ...to... visits_count_in_month_<n>
-- --------- --- --- --- --- --- --- --- --- --- ---
1 15JAN2016 0 0 2 0 0 1 0 0

where <n> is the number of months start to end.

 

Example:

 

data have;
input ID Startdate	Enddate	Visitdate;
  format startdate enddate visitdate date9.;
informat startdate enddate visitdate date9.;
datalines;
1	15JAN2016	09AUG2016	10MAR2016
1	15JAN2016	09AUG2016	13MAR2016
1	15JAN2016	09AUG2016	09JUN2016
2	08MAR2016	12JUL2016	15MAR2016
2	08MAR2016	12JUL2016	25MAR2016
2	08MAR2016	12JUL2016	6JUN2016
2	08MAR2016	12JUL2016	3JUL2016
;

data want_way1;
  * array using direct index element for frequency counting;
  * index is the numeric sas date value for a month;

  array months(0:%sysfunc(today())) _temporary_;

  do until (last.id);
    set have;
    by id;

    months(intnx('month', visitdate, 0)) + 1;
  end;

  startmonth =intnx('month', startdate, 0);

  do index = 0 to intck('month', startdate, enddate);
    month = intnx('month', startdate, index);
    count = coalesce(months(month),0);
    output; 
  end;

  call missing (of months(*));

  keep id index month count startmonth;
  format month startmonth nldateymm.;
run;

ods html file='wide.html' style=plateau; 

proc report data=want_way1;
 columns id startmonth count,index;
 define id / group;
 define startmonth / min;
 define index / ' ' across;
 define count / 'Freq in month ';
run;

ods html close;

Produces

data

RichardADeVenezia_1-1609264615702.png

 

and report output

RichardADeVenezia_0-1609264565530.png

 

A second way to get the counts is to create a months data set covering each id.  Aggregate a join of the dates coverage with the details data grouping by id and month.

 

Example:

data months_per_id;
  set have;
  by id;
  if first.id;
  do index = 0 to intck('month', startdate, enddate);
    month = intnx('month', startdate, index);
    output; 
  end;
  keep id index month; format month nldateymm.;
run;

proc sql;
  create table want_way2 as
  select M.id, index, month, sum(intck('month', month, visitdate)=0) as count
  from have join months_per_id as M
  on have.id = M.id
  group by M.id, M.index, M.month
  ;
PGStats
Opal | Level 21

You don't need that awkward data structure to get your counts. Try this way:

 

/* Create the required months */
data months;
do until(last.id);
    set have; by id;
    firstMonth = min(intnx("month", startdate, 0), firstmonth);
    lastMonth = max(intnx("month", enddate, 0), lastmonth);
    end;
monthDate = firstMonth;
do until(monthDate > lastMonth);
    output;
    monthDate = intnx("month", monthDate, 1);
    end;
format monthDate monyy7.;
keep id monthDate;
run;

/* Count the visits per month */
proc sql;
create table monthlyVisits as
select
    a.id,
    a.monthDate,
    count(visitDate) as nbVisits
from 
    months as a left join
    have as b on a.id=b.id and intnx("month", b.visitDate, 0) = a.monthDate
group by a.id, a.monthDate;
quit;
PG
PhilC
Rhodochrosite | Level 12

First, @raajdesaii , I will leave it to you to add the Visitdate column as you would like it.  Aside from that, this class of problem is interesting to me because I face it in my work too.  I think this is the way I handle this kind of problem.  It needs these five steps, I say, if for anything, to be able understand what we had programmed when we look back six months later.

data IDs;
  set have (Keep=ID	Startdate	Enddate);
    by ID;
  if first.ID;
run;

data AllNewDates (sortedby=ID Newdate);
  set IDs;

  Newdate=intnx('month',Startdate,0);
  do while (Newdate<Enddate);
      Count=0;
      output;
    Newdate=intnx('month',Newdate,1);
  end;
run;

data Newdates0 (Keep=ID	Newdate);
  set have (Keep=ID	Visitdate);

  Newdate=intnx('month',Visitdate,0);
  format Newdate date9.;
run;

proc freq data=Newdates0;
  by ID;
  table Newdate / out=Newdates(drop=PERCENT) noprint;
  format Newdate date9.;
run;

data want;
  update AllNewDates Newdates;
    by ID Newdate;
  format Newdate monyy7.;
run;

 

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 5 replies
  • 6430 views
  • 0 likes
  • 4 in conversation