BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
PharmlyDoc
Quartz | Level 8

I find this difficult to articulate. 

I have a table of patients with dates that they had a visit.

For each patient I want to calculate their average number of visits per month. The total amount of months is 24 (1/1/2018  -  12/31/2019). 

The ultimate goal is so I can divide the patients by an outcome variable and then say that "patients in the outcome group had on average higher/fewer visits per month than the group without the outcome." Or "on average, there were __ visits per patient per month."

Each patient visit is accompanied by a unique CSN ID (Contact Serial Number). Some patients have duplicate CSN IDs on the same day.

 

The patient table has the following format:

PAT_ID CSN CONTACT_DATE
1 3004537 05Jan2018
1 3006970 01Feb2018
1 3004089 01Mar2018
1 3004089 01Mar2018
1 3003421 01Dec2018
1 3004775 05Jan2019
1 3002783 01May2018
2 3004235 01Feb2018
2 3004415 01Apr2018
2 3006463 01Aug2018
2 3005423 01Feb2019
3 3007317 01Jul2018
3 3005789 01Oct2018
3 3008867 01May2019
3 3008867 01May2019
4 3008268 01Apr2018
5 3001485 01Mar2018
5 3007350 01Nov2018
5 3001745 01Feb2019
5 3008913 01Dec2019
6 3002585 01Oct2018
6 3002586 01Nov2018
6 3002586 01Nov2018
6 3002588 05Jan2019
6 3002588 05Jan2019
6 3002590 01Nov2019
6 3002590 01Nov2019
6 3002593 01Dec2019
6 3002593 01Dec2019
7 3002594 01Aug2019
7 3002595 01Aug2019
8 3002596 01Jan2018
8 3002597 22Jan2018
8 3002598 12Feb2018
8 3002599 05Mar2018
8 3002599 05Mar2018
8 3002601 16Apr2018
8 3002602 07May2018
8 3002603 28May2018
8 3002604 18Jun2018
8 3002605 09Jul2018
8 3002605 09Jul2018
9 3002607 08May2018
9 3002608 05Jan2019
9 3002609 01Nov2019
10 3002610 05Jan2019
10 3002611 01Jun2019
10 3002612 30Dec2019
11 3002613 01Feb2018
11 3002614 15Mar2018
11 3002615 26Apr2018
11 3002615 26Apr2018
11 3002617 19Jul2018
11 3002618 30Aug2018
11 3002619 11Oct2018
11 3002619 11Oct2018
11 3002621 03Jan2019
11 3002622 14Feb2019
11 3002623 28Mar2019
12 3002624 01Jun2018
12 3002625 01Feb2019
12 3002626 01Sep2019
12 3002627 20Dec2019
13 3002628 01Mar2018
13 3002629 24May2018
13 3002630 16Aug2018
13 3002631 08Nov2018
13 3002632 31Jan2019
13 3002633 25Apr2019
13 3002634 18Jul2019
13 3002635 10Oct2019
13 3002635 10Oct2019
13 3002637 26Nov2019
13 3002638 20Dec2019
14 3002639 05Jan2018
14 3002640 30Mar2018
14 3002641 22Jun2018
14 3002642 14Sep2018
15 3002643 01Mar2018
15 3002644 03May2018
15 3002645 05Jul2018
15 3002646 06Sep2018
15 3002647 08Nov2018
15 3002648 10Jan2019
15 3002649 14Mar2019
15 3002650 16May2019
15 3002650 16May2019
15 3002652 19Sep2019
15 3002653 15Oct2019
15 3002654 26Nov2019
15 3002654 26Nov2019
16 3002656 01Feb2019
16 3002657 01May2019
16 3002658 01Aug2019
16 3002659 01Nov2019
17 3002660 05Jan2018
17 3002661 01Apr2018
18 3002662 01May2018
18 3002663 01Sep2018
18 3002664 01Dec2018
19 3002665 01Apr2018
19 3002666 31May2018
19 3002667 30Jul2018
19 3002668 28Sep2018
19 3002669 27Nov2018
19 3002670 26Jan2019
20 3002671 02Jan2018
20 3002672 13Feb2018
20 3002673 27Mar2018
20 3002674 08May2018
20 3002675 19Jun2018
20 3002676 31Jul2018
20 3002677 11Sep2018
20 3002678 23Oct2018
20 3002679 04Dec2018
20 3002680 15Jan2019
20 3002682 09Apr2019
20 3002682 09Apr2019

 

data patients;
infile datalines dlm=',';
input PAT_ID CSN CONTACT_DATE : DATE. ;
format CONTACT_DATE DATE9.;
datalines;
1,3004537,05Jan2018
1,3006970,01Feb2018
1,3004089,01Mar2018
1,3004089,01Mar2018
1,3003421,01Dec2018
1,3004775,05Jan2019
1,3002783,01May2018
2,3004235,01Feb2018
2,3004415,01Apr2018
2,3006463,01Aug2018
2,3005423,01Feb2019
3,3007317,01Jul2018
3,3005789,01Oct2018
3,3008867,01May2019
3,3008867,01May2019
4,3008268,01Apr2018
5,3001485,01Mar2018
5,3007350,01Nov2018
5,3001745,01Feb2019
5,3008913,01Dec2019
6,3002585,01Oct2018
6,3002586,01Nov2018
6,3002586,01Nov2018
6,3002588,05Jan2019
6,3002588,05Jan2019
6,3002590,01Nov2019
6,3002590,01Nov2019
6,3002593,01Dec2019
6,3002593,01Dec2019
7,3002594,01Aug2019
7,3002595,01Aug2019
8,3002596,01Jan2018
8,3002597,22Jan2018
8,3002598,12Feb2018
8,3002599,05Mar2018
8,3002599,05Mar2018
8,3002601,16Apr2018
8,3002602,07May2018
8,3002603,28May2018
8,3002604,18Jun2018
8,3002605,09Jul2018
8,3002605,09Jul2018
9,3002607,08May2018
9,3002608,05Jan2019
9,3002609,01Nov2019
10,3002610,05Jan2019
10,3002611,01Jun2019
10,3002612,30Dec2019
11,3002613,01Feb2018
11,3002614,15Mar2018
11,3002615,26Apr2018
11,3002615,26Apr2018
11,3002617,19Jul2018
11,3002618,30Aug2018
11,3002619,11Oct2018
11,3002619,11Oct2018
11,3002621,03Jan2019
11,3002622,14Feb2019
11,3002623,28Mar2019
12,3002624,01Jun2018
12,3002625,01Feb2019
12,3002626,01Sep2019
12,3002627,20Dec2019
13,3002628,01Mar2018
13,3002629,24May2018
13,3002630,16Aug2018
13,3002631,08Nov2018
13,3002632,31Jan2019
13,3002633,25Apr2019
13,3002634,18Jul2019
13,3002635,10Oct2019
13,3002635,10Oct2019
13,3002637,26Nov2019
13,3002638,20Dec2019
14,3002639,05Jan2018
14,3002640,30Mar2018
14,3002641,22Jun2018
14,3002642,14Sep2018
15,3002643,01Mar2018
15,3002644,03May2018
15,3002645,05Jul2018
15,3002646,06Sep2018
15,3002647,08Nov2018
15,3002648,10Jan2019
15,3002649,14Mar2019
15,3002650,16May2019
15,3002650,16May2019
15,3002652,19Sep2019
15,3002653,15Oct2019
15,3002654,26Nov2019
15,3002654,26Nov2019
16,3002656,01Feb2019
16,3002657,01May2019
16,3002658,01Aug2019
16,3002659,01Nov2019
17,3002660,05Jan2018
17,3002661,01Apr2018
18,3002662,01May2018
18,3002663,01Sep2018
18,3002664,01Dec2018
19,3002665,01Apr2018
19,3002666,31May2018
19,3002667,30Jul2018
19,3002668,28Sep2018
19,3002669,27Nov2018
19,3002670,26Jan2019
20,3002671,02Jan2018
20,3002672,13Feb2018
20,3002673,27Mar2018
20,3002674,08May2018
20,3002675,19Jun2018
20,3002676,31Jul2018
20,3002677,11Sep2018
20,3002678,23Oct2018
20,3002679,04Dec2018
20,3002680,15Jan2019
20,3002682,09Apr2019
20,3002682,09Apr2019
;

 

I am able to count the number of distinct CSNs per patient and count the number of visits per patient per month but I get lost when trying to calculate the average visit per patient per month.

proc sql;
create table PAT_Visits as
 select PAT_ID, CSN, CONTACT_DATE, count(distinct CSN) as Visits
  from Patients 
   group by PAT_ID;
quit;

proc summary data=patients nway;
   class PAT_ID CSN CONTACT_DATE ;
   output out=counted(drop=_type_ rename=(_freq_ = count));
run;

proc print data=counted;
run 

 

1 ACCEPTED SOLUTION

Accepted Solutions
Sajid01
Meteorite | Level 14

Hello @PharmlyDoc 

Taking the visit per patient per month during the 24 months period would be (the total number of visits in 24 months)/24. The following code would give this. The variable TOTAL_VISITS  gives the total number of visits during the 24 months period and .AV_VISITS gives the average number of visit per month during the 24 months period.
I assumed that the data set provided in this question represents 24 months of data.

proc sql;
create table AV_VISITS as
select PAT_ID,count(distinct CSN) as TOTAL_VISITS, count(distinct CSN)/24 as AV_VISITS format 5.3 from patients
group by pat_id;
quit;

The AV_VISITS data set will be like this

Sajid01_0-1633788100991.png

 

View solution in original post

3 REPLIES 3
ballardw
Super User

How is CSN actually involved in this? Since CSN is per date you really don't want that in calculating either a total or a mean.

 

If the month and patient counts per month are needed this would be my take:

proc freq data=patients noprint;
   tables pat_id *contact_date / out=patmonthcount;
   format contact_date yymon5.;
run;

proc summary data=patmonthcount nway;
   class pat_id;
   var count;
   output out=meanvisitsmonth (drop=_type_)  mean(count)=meanvisits;
run;

Or are we supposed to use 24 somewhere?

 

A format applied to a variable will create groups that are honored by most procedures, which I use in proc freq to get monthly counts. The average counts of the months with visits.

mkeintz
PROC Star

Are you saying that you do not include months with zero visits in generating a month average?   Otherwise, if the time span is 24 months, you merely need to divide the number of unique visits by 24.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
Sajid01
Meteorite | Level 14

Hello @PharmlyDoc 

Taking the visit per patient per month during the 24 months period would be (the total number of visits in 24 months)/24. The following code would give this. The variable TOTAL_VISITS  gives the total number of visits during the 24 months period and .AV_VISITS gives the average number of visit per month during the 24 months period.
I assumed that the data set provided in this question represents 24 months of data.

proc sql;
create table AV_VISITS as
select PAT_ID,count(distinct CSN) as TOTAL_VISITS, count(distinct CSN)/24 as AV_VISITS format 5.3 from patients
group by pat_id;
quit;

The AV_VISITS data set will be like this

Sajid01_0-1633788100991.png

 

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 3 replies
  • 3063 views
  • 2 likes
  • 4 in conversation