# Procedure for comparing means

I have a data set that contains the number of sales ("count") by time period ("period": 1 = first 6 months of the year, 2 = last 6 months) and by vendor ("group": 4 vendors labeled 1, 2, 3, and 4).

What I'm trying to figure out is how much the variability between the first and last periods differs by group. Is there a procedure that would take my data, create multiple samples from each group, and record the mean of each sample by group and period? I could then compare those sample means and say something like "Group 1 has a standard deviation of 1% between periods 1 and 2, and Group 4 has an SD of 20%, which shows that Group 4's sales can fluctuate much more between the two periods than Group 1's".

Any help is greatly appreciated!

Here's a basic data set to give you an idea of how mine looks. I tried to design it so that Group 4 has a larger variation between periods 1 and 2:

```sas
data a;
  input period group count;
  datalines;
1 1 10
1 1 15
1 1 7
1 2 12
1 2 15
1 2 18
1 3 15
1 3 20
1 3 22
1 4 20
1 4 25
1 4 30
2 1 12
2 1 18
2 1 5
2 2 15
2 2 10
2 2 18
2 3 16
2 3 22
2 3 24
2 4 60
2 4 65
2 4 70
;
run;
```

## Re: Procedure for comparing means

I may be shooting gnats with a 40-inch cannon, but I see this as a perfect example of testing for homogeneity of variance in PROC GLIMMIX. The dependent variable is a count, so it most likely follows a Poisson distribution. The measures on each vendor are repeated, and so would likely be correlated.

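```sas
proc glimmix data=a;
  class period group;
  nloptions tech=nrridg;                       /* Newton-Raphson with ridging */
  model count = period / dist=poi ddfm=kr2;    /* Poisson response, period as fixed effect */
  random _residual_ / group=group cl solution; /* separate residual (scale) parameter per vendor */
  covtest homogeneity;                         /* test that the per-group parameters are equal */
run;
```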

The test of covariance parameters shows that at least one group's variance differs from those of the other groups. The covariance parameter estimates are the per-group variance estimates across periods 1 and 2. This can be generalized to more periods and groups if desired.
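As a rough descriptive cross-check, and only a sketch that assumes the data set `a` from the question, PROC MEANS can tabulate the raw spread of `count` per vendor. These are plain sample statistics, so they will not match the GLIMMIX covariance parameter estimates, which come from a Poisson model that adjusts for the period effect.

```sas
/* Mean, SD and variance of count per vendor, pooled over both periods */
proc means data=a n mean std var maxdec=2;
  class group;
  var count;
run;

/* Same summary split by period, to see where the spread comes from */
proc means data=a n mean std maxdec=2;
  class group period;
  var count;
run;
```

The pooled standard deviation in the first table mixes within-period scatter with the shift between periods, so the second table helps separate the two.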

Steve Denham

## Re: Procedure for comparing means

Thanks for the quick response! I'm new to the GLIMMIX procedure, and just want to make sure I am reading the output correctly.

When I read the table below, is it fair to say that Group 2 has the lowest 'count' variability over the 2 periods? And how would I express that using these numbers? For example, could I say that Group 4 is 40 times more variable than Group 3?

## Re: Procedure for comparing means

I assume you could say that. Ratios of variances are often reported. The big "BEWARE" here is that all of the estimates have fairly large standard errors, and if you constructed confidence intervals around the estimates (using chi-squared quantiles), you could end up with a really, really large range of possible ratio values. I would just say that for these data, Group 4 is much more variable than any of the other groups, and not try to quantify it, provided I reported the values in the table.
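To make the width of those intervals concrete, here is a minimal sketch of a chi-squared interval for a variance estimate. It assumes a Satterthwaite-type degrees of freedom, df = 2 * (estimate / standard error)^2. The data set name `var_ci` and the two input rows are hypothetical placeholders, not values from the output in this thread, and should be replaced with the Estimate and Standard Error columns of the covariance parameter table.

```sas
data var_ci;
  input group estimate stderr;       /* placeholder values only, see datalines */
  alpha = 0.05;
  z     = estimate / stderr;
  df    = 2 * z**2;                  /* Satterthwaite-type degrees of freedom */
  lower = df * estimate / cinv(1 - alpha/2, df);  /* uses the upper chi-squared quantile */
  upper = df * estimate / cinv(alpha/2, df);      /* uses the lower chi-squared quantile */
  datalines;
1 2.0 1.6
2 8.0 6.4
;
run;

proc print data=var_ci noobs;
  var group estimate stderr df lower upper;
run;
```

With a standard error close to the estimate itself, df comes out at only about 3, which is why the limits spread so far apart. PROC GLIMMIX can also report confidence limits for the covariance parameters directly through the CL option of the COVTEST statement.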

Steve Denham
