Quartz | Level 8

## Means procedure

Hi folks,

I have a data set where each ID has multiple visits over time and I have 7 time points for which I need to calculate mean number of visits by ID. I have tried several codes and asked the question on this forum but couldn't get an answer. Can someone help please?

Ta

My data looks like;

ID      status             eventdate      age      sex

1       Eligible            2016-04-14   73        male

1       Eligible            2017-02-12   73        male

1       Eligible            2018-09-03   73        male

1       Eligible            2016-01-15   73        male

1       Eligible            2016-11-17   73        male

2      Intervention      2018-01-15   73        male

2      Intervention      2016-01-12   73        male

2      Intervention      2017-04-15   73        male

2      Intervention      2018-09-22   73        male

2      Intervention      2018-07-26   73        male

2      Intervention      2018-03-15   73        male

3     Intervention       2017-02-13   73        male

3      Intervention      2017-01-12   73        male

Based on eventdate, I have 7 time points; arbitrarily;

if EventDate<= '28FEB2016'd then delete;*for the sake of uniformity in number of pre- and post- time periods, deleting visits before March 2016;
else if '01MAR2016'd <=EventDate<= '31AUG2016'd then month2='-3';*from March 2016 to August 2016;
else if '01SEP2016'd <=EventDate<= '28FEB2017'd then month2='-2'; *From September 2016 to February 2017;
else if '01MAR2017'd <=EventDate<= '31AUG2017'd then month2='-1'; *From March 2017 to August 2017;
else if '01SEP2017'd <=EventDate<= '28FEB2018'd then month2='0'; *From September 2017 to February 2018- intervention period;
else if '01MAR2018'd <=EventDate<= '31AUG2018'd then month2='1'; *From March 2018 to August 2018- first six months after intervention;
else if '01SEP2018'd <=EventDate<= '28FEB2019'd then month2='2'; *From September 2018 to February 2019- second six months after intervention;
else if '01MAR2019'd <=EventDate<= '31MAY2019'd then month2='3'; *From March 2019 to May 2018- third time period after intervention but doesn't have the data till August;

Help much appreciated.

26 REPLIES 26
Diamond | Level 26

## Re: Means procedure

Show us the code you tried.

--
Paige Miller
Quartz | Level 8

## Re: Means procedure

So I have created a variable called 'month2' to get 7 time periods from the 'eventdate' variable and another variable called 'visit' to indicate one visit for each 'eventdate' then  wrote this code;

proc summary data=saltaire;
by status month2;
var visit;
output out=sum n=n;
run;

What I am not sure about is, how to take into account visits per number of IDs by status and month2.

Quartz | Level 8

## Re: Means procedure

forgot to mention, the code that I wrote only gives me a sum of visits during each time period but not a mean by number of ids in each time period.

Super User

## Re: Means procedure

If I interpret you correctly, you should run proc means once to get the sums per id and "month", and then a second time to get the averages of the sums per id.

Quartz | Level 8

## Re: Means procedure

Thanks Kurt, that's the problem, I don't know how to calculate mean by ID. Might be a simple step but it's because I've been struggling with it too much and my brain has given up probably.

Diamond | Level 26

## Re: Means procedure

Show us what you have tried.

--
Paige Miller
Quartz | Level 8

## Re: Means procedure

I have posted the code in my previous reply.

Thanks

S

Diamond | Level 26

## Re: Means procedure

@sks521 wrote:

I have posted the code in my previous reply.

Thanks

S

No, you posted code that computes the value of N for each value of STATUS and MONTH2.

We want to see the code that you are trying to use to compute the mean value of N.

--
Paige Miller
Quartz | Level 8

## Re: Means procedure

PROC MEANS DATA=saltaire
SUM MEAN MAXDEC=0 ;
VAR visit;
by pkid;
run;

the output looks like;

so this gives me sum of visits for each ID separately. How do I convert this into average fro each time period?

Quartz | Level 8

## Re: Means procedure

Mathematically speaking I am looking for an output like;

Average visits per ID for pre-intervention period= total visits/total number of ids (discrete ids)

Diamond | Level 26

## Re: Means procedure

@sks521 wrote:

PROC MEANS DATA=saltaire
SUM MEAN MAXDEC=0 ;
VAR visit;
by pkid;
run;

the output looks like;

so this gives me sum of visits for each ID separately. How do I convert this into average fro each time period?

Average visits per ID for pre-intervention period= total visits/total number of ids (discrete ids attending the hospital during that period)

For example;

average number of visits during pre-intervention;

Eligible= 4.44

Intervention= 5.66

Average number of visits during intervention;

eligible=5.33

Intervention= 6.66

Average number of visits post-intervention;

eligible=3.99

Intervention=6.22

Something like that.

If you want the results by time period, then you need to use a time period variable in PROC MEANS, which you haven't done. And if you want averages also by STATUS, you need to have STATUS in your PROC MEANS.

You also don't have a variable named VISIT, so you can't do any analysis on it. Perhaps you mean the variable N, which was computed in your first PROC MEANS?

Lastly, if you want results pre-intervention, during intervention or post-intervention, you need to create a variable with these values somehow.

So all of these elements are needed for in the code you to get an answer.

Something like this might work after your first PROC MEANS

``````proc format;
value period -1,-2,-3 = 'Pre-Intervention' 0='Intervention' 1,2,3='Post-Intervention';
run;

PROC MEANS DATA=saltaire;
class status month2;
VAR n;
format month2 period.;
output out=want mean=;
run;``````

--
Paige Miller
Quartz | Level 8

## Re: Means procedure

Sorry about confusion, I should've been more explicit.

I have created a variable to indicate my time periods in a separate step. I also did create a variable called 'visit' to indicate a visit. Have also tried the code you have suggested but it still doesn't solve the problem.

It doesn't give me average number of visits per discrete id for each time period by status

Super User

## Re: Means procedure

This is going nowhere.

1. input data for the means analysis (data step with datalines, see my example)
2. the expected result from that

Make sure you cover edge cases, so we don't have to do a lot of further back-and-forth.

Quartz | Level 8

## Re: Means procedure

pkid      visit     month2    status

1           1        -3             eligible

1           1        -3             eligible

1           1        -2             eligible

1           1         0             eligible

2           1        -1            intervention

2           1         2            intervention

2           1         3            intervention

2           1         0            intervention

3           1         2            intervention

3           1        -1           intervention

3           1        -2           intervention

3           1         0           intervention

3           1         0           intervention

3           1         3           intervention

Month2 indicates my time periods which are 7 in total.

I ran sql which you'd suggested previously;

proc sql;
create table average as
select pkid, avg(visitsum) as average
from (
select pkid, month2, count(*) as visitsum
from saltaire
group by pkid, month2
)
group by pkid;
quit;

it doesn't give me average number of visits per discrete id for each time period by 'status'.