- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
I find this difficult to articulate.
I have a table of patients with dates that they had a visit.
For each patient I want to calculate their average number of visits per month. The total amount of months is 24 (1/1/2018 - 12/31/2019).
The ultimate goal is so I can divide the patients by an outcome variable and then say that "patients in the outcome group had on average higher/fewer visits per month than the group without the outcome." Or "on average, there were __ visits per patient per month."
Each patient visit is accompanied by a unique CSN ID (Contact Serial Number). Some patients have duplicate CSN IDs on the same day.
The patient table has the following format:
PAT_ID | CSN | CONTACT_DATE |
1 | 3004537 | 05Jan2018 |
1 | 3006970 | 01Feb2018 |
1 | 3004089 | 01Mar2018 |
1 | 3004089 | 01Mar2018 |
1 | 3003421 | 01Dec2018 |
1 | 3004775 | 05Jan2019 |
1 | 3002783 | 01May2018 |
2 | 3004235 | 01Feb2018 |
2 | 3004415 | 01Apr2018 |
2 | 3006463 | 01Aug2018 |
2 | 3005423 | 01Feb2019 |
3 | 3007317 | 01Jul2018 |
3 | 3005789 | 01Oct2018 |
3 | 3008867 | 01May2019 |
3 | 3008867 | 01May2019 |
4 | 3008268 | 01Apr2018 |
5 | 3001485 | 01Mar2018 |
5 | 3007350 | 01Nov2018 |
5 | 3001745 | 01Feb2019 |
5 | 3008913 | 01Dec2019 |
6 | 3002585 | 01Oct2018 |
6 | 3002586 | 01Nov2018 |
6 | 3002586 | 01Nov2018 |
6 | 3002588 | 05Jan2019 |
6 | 3002588 | 05Jan2019 |
6 | 3002590 | 01Nov2019 |
6 | 3002590 | 01Nov2019 |
6 | 3002593 | 01Dec2019 |
6 | 3002593 | 01Dec2019 |
7 | 3002594 | 01Aug2019 |
7 | 3002595 | 01Aug2019 |
8 | 3002596 | 01Jan2018 |
8 | 3002597 | 22Jan2018 |
8 | 3002598 | 12Feb2018 |
8 | 3002599 | 05Mar2018 |
8 | 3002599 | 05Mar2018 |
8 | 3002601 | 16Apr2018 |
8 | 3002602 | 07May2018 |
8 | 3002603 | 28May2018 |
8 | 3002604 | 18Jun2018 |
8 | 3002605 | 09Jul2018 |
8 | 3002605 | 09Jul2018 |
9 | 3002607 | 08May2018 |
9 | 3002608 | 05Jan2019 |
9 | 3002609 | 01Nov2019 |
10 | 3002610 | 05Jan2019 |
10 | 3002611 | 01Jun2019 |
10 | 3002612 | 30Dec2019 |
11 | 3002613 | 01Feb2018 |
11 | 3002614 | 15Mar2018 |
11 | 3002615 | 26Apr2018 |
11 | 3002615 | 26Apr2018 |
11 | 3002617 | 19Jul2018 |
11 | 3002618 | 30Aug2018 |
11 | 3002619 | 11Oct2018 |
11 | 3002619 | 11Oct2018 |
11 | 3002621 | 03Jan2019 |
11 | 3002622 | 14Feb2019 |
11 | 3002623 | 28Mar2019 |
12 | 3002624 | 01Jun2018 |
12 | 3002625 | 01Feb2019 |
12 | 3002626 | 01Sep2019 |
12 | 3002627 | 20Dec2019 |
13 | 3002628 | 01Mar2018 |
13 | 3002629 | 24May2018 |
13 | 3002630 | 16Aug2018 |
13 | 3002631 | 08Nov2018 |
13 | 3002632 | 31Jan2019 |
13 | 3002633 | 25Apr2019 |
13 | 3002634 | 18Jul2019 |
13 | 3002635 | 10Oct2019 |
13 | 3002635 | 10Oct2019 |
13 | 3002637 | 26Nov2019 |
13 | 3002638 | 20Dec2019 |
14 | 3002639 | 05Jan2018 |
14 | 3002640 | 30Mar2018 |
14 | 3002641 | 22Jun2018 |
14 | 3002642 | 14Sep2018 |
15 | 3002643 | 01Mar2018 |
15 | 3002644 | 03May2018 |
15 | 3002645 | 05Jul2018 |
15 | 3002646 | 06Sep2018 |
15 | 3002647 | 08Nov2018 |
15 | 3002648 | 10Jan2019 |
15 | 3002649 | 14Mar2019 |
15 | 3002650 | 16May2019 |
15 | 3002650 | 16May2019 |
15 | 3002652 | 19Sep2019 |
15 | 3002653 | 15Oct2019 |
15 | 3002654 | 26Nov2019 |
15 | 3002654 | 26Nov2019 |
16 | 3002656 | 01Feb2019 |
16 | 3002657 | 01May2019 |
16 | 3002658 | 01Aug2019 |
16 | 3002659 | 01Nov2019 |
17 | 3002660 | 05Jan2018 |
17 | 3002661 | 01Apr2018 |
18 | 3002662 | 01May2018 |
18 | 3002663 | 01Sep2018 |
18 | 3002664 | 01Dec2018 |
19 | 3002665 | 01Apr2018 |
19 | 3002666 | 31May2018 |
19 | 3002667 | 30Jul2018 |
19 | 3002668 | 28Sep2018 |
19 | 3002669 | 27Nov2018 |
19 | 3002670 | 26Jan2019 |
20 | 3002671 | 02Jan2018 |
20 | 3002672 | 13Feb2018 |
20 | 3002673 | 27Mar2018 |
20 | 3002674 | 08May2018 |
20 | 3002675 | 19Jun2018 |
20 | 3002676 | 31Jul2018 |
20 | 3002677 | 11Sep2018 |
20 | 3002678 | 23Oct2018 |
20 | 3002679 | 04Dec2018 |
20 | 3002680 | 15Jan2019 |
20 | 3002682 | 09Apr2019 |
20 | 3002682 | 09Apr2019 |
data patients;
infile datalines dlm=',';
input PAT_ID CSN CONTACT_DATE : DATE. ;
format CONTACT_DATE DATE9.;
datalines;
1,3004537,05Jan2018
1,3006970,01Feb2018
1,3004089,01Mar2018
1,3004089,01Mar2018
1,3003421,01Dec2018
1,3004775,05Jan2019
1,3002783,01May2018
2,3004235,01Feb2018
2,3004415,01Apr2018
2,3006463,01Aug2018
2,3005423,01Feb2019
3,3007317,01Jul2018
3,3005789,01Oct2018
3,3008867,01May2019
3,3008867,01May2019
4,3008268,01Apr2018
5,3001485,01Mar2018
5,3007350,01Nov2018
5,3001745,01Feb2019
5,3008913,01Dec2019
6,3002585,01Oct2018
6,3002586,01Nov2018
6,3002586,01Nov2018
6,3002588,05Jan2019
6,3002588,05Jan2019
6,3002590,01Nov2019
6,3002590,01Nov2019
6,3002593,01Dec2019
6,3002593,01Dec2019
7,3002594,01Aug2019
7,3002595,01Aug2019
8,3002596,01Jan2018
8,3002597,22Jan2018
8,3002598,12Feb2018
8,3002599,05Mar2018
8,3002599,05Mar2018
8,3002601,16Apr2018
8,3002602,07May2018
8,3002603,28May2018
8,3002604,18Jun2018
8,3002605,09Jul2018
8,3002605,09Jul2018
9,3002607,08May2018
9,3002608,05Jan2019
9,3002609,01Nov2019
10,3002610,05Jan2019
10,3002611,01Jun2019
10,3002612,30Dec2019
11,3002613,01Feb2018
11,3002614,15Mar2018
11,3002615,26Apr2018
11,3002615,26Apr2018
11,3002617,19Jul2018
11,3002618,30Aug2018
11,3002619,11Oct2018
11,3002619,11Oct2018
11,3002621,03Jan2019
11,3002622,14Feb2019
11,3002623,28Mar2019
12,3002624,01Jun2018
12,3002625,01Feb2019
12,3002626,01Sep2019
12,3002627,20Dec2019
13,3002628,01Mar2018
13,3002629,24May2018
13,3002630,16Aug2018
13,3002631,08Nov2018
13,3002632,31Jan2019
13,3002633,25Apr2019
13,3002634,18Jul2019
13,3002635,10Oct2019
13,3002635,10Oct2019
13,3002637,26Nov2019
13,3002638,20Dec2019
14,3002639,05Jan2018
14,3002640,30Mar2018
14,3002641,22Jun2018
14,3002642,14Sep2018
15,3002643,01Mar2018
15,3002644,03May2018
15,3002645,05Jul2018
15,3002646,06Sep2018
15,3002647,08Nov2018
15,3002648,10Jan2019
15,3002649,14Mar2019
15,3002650,16May2019
15,3002650,16May2019
15,3002652,19Sep2019
15,3002653,15Oct2019
15,3002654,26Nov2019
15,3002654,26Nov2019
16,3002656,01Feb2019
16,3002657,01May2019
16,3002658,01Aug2019
16,3002659,01Nov2019
17,3002660,05Jan2018
17,3002661,01Apr2018
18,3002662,01May2018
18,3002663,01Sep2018
18,3002664,01Dec2018
19,3002665,01Apr2018
19,3002666,31May2018
19,3002667,30Jul2018
19,3002668,28Sep2018
19,3002669,27Nov2018
19,3002670,26Jan2019
20,3002671,02Jan2018
20,3002672,13Feb2018
20,3002673,27Mar2018
20,3002674,08May2018
20,3002675,19Jun2018
20,3002676,31Jul2018
20,3002677,11Sep2018
20,3002678,23Oct2018
20,3002679,04Dec2018
20,3002680,15Jan2019
20,3002682,09Apr2019
20,3002682,09Apr2019
;
I am able to count the number of distinct CSNs per patient and count the number of visits per patient per month but I get lost when trying to calculate the average visit per patient per month.
proc sql;
create table PAT_Visits as
select PAT_ID, CSN, CONTACT_DATE, count(distinct CSN) as Visits
from Patients
group by PAT_ID;
quit;
proc summary data=patients nway;
class PAT_ID CSN CONTACT_DATE ;
output out=counted(drop=_type_ rename=(_freq_ = count));
run;
proc print data=counted;
run
Accepted Solutions
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Hello @PharmlyDoc
Taking the visit per patient per month during the 24 months period would be (the total number of visits in 24 months)/24. The following code would give this. The variable TOTAL_VISITS gives the total number of visits during the 24 months period and .AV_VISITS gives the average number of visit per month during the 24 months period.
I assumed that the data set provided in this question represents 24 months of data.
proc sql;
create table AV_VISITS as
select PAT_ID,count(distinct CSN) as TOTAL_VISITS, count(distinct CSN)/24 as AV_VISITS format 5.3 from patients
group by pat_id;
quit;
The AV_VISITS data set will be like this
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
How is CSN actually involved in this? Since CSN is per date you really don't want that in calculating either a total or a mean.
If the month and patient counts per month are needed this would be my take:
proc freq data=patients noprint; tables pat_id *contact_date / out=patmonthcount; format contact_date yymon5.; run; proc summary data=patmonthcount nway; class pat_id; var count; output out=meanvisitsmonth (drop=_type_) mean(count)=meanvisits; run;
Or are we supposed to use 24 somewhere?
A format applied to a variable will create groups that are honored by most procedures, which I use in proc freq to get monthly counts. The average counts of the months with visits.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Are you saying that you do not include months with zero visits in generating a month average? Otherwise, if the time span is 24 months, you merely need to divide the number of unique visits by 24.
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set
Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets
--------------------------
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Hello @PharmlyDoc
Taking the visit per patient per month during the 24 months period would be (the total number of visits in 24 months)/24. The following code would give this. The variable TOTAL_VISITS gives the total number of visits during the 24 months period and .AV_VISITS gives the average number of visit per month during the 24 months period.
I assumed that the data set provided in this question represents 24 months of data.
proc sql;
create table AV_VISITS as
select PAT_ID,count(distinct CSN) as TOTAL_VISITS, count(distinct CSN)/24 as AV_VISITS format 5.3 from patients
group by pat_id;
quit;
The AV_VISITS data set will be like this