datalligence
Fluorite | Level 6
Hi,

I keep coming across cases where people talk about, or use, the t-test when comparing campaign response rates, membership renewal rates, etc.

My understanding is that the t-test is not appropriate for such cases, and that people are confusing it with the two-sample test for proportions (which uses the Z statistic). Personally, I think the chi-square test and its related tests (Fisher's exact, McNemar's) are more appropriate for testing differences in proportions/ratios.

In one case, people actually used PROC GLM to test the difference in renewals among customer groups based on different communication channels. So how can PROC GLM or the t-test, which are meant for continuous dependent variables, be used for comparing proportions/ratios?

Thanks
Accepted Solution
SteveDenham
Jade | Level 19
Linear models are amazingly robust to distributional assumptions, so long as the sample size is relatively large and you aren't near the points where the distributional assumption breaks down. For instance, try the code below. Even when the sample size is only ten per group, GLM does a pretty good job of giving a p-value close to that of the chi-squared-type statistics. Only the exact test for small proportions is grossly different in p-value, and under these conditions I wouldn't trust chi-squared either.

If it came down to detecting small changes, or if the proportions were near 0 or 1, then the distributional assumptions would make a difference in the final conclusion, and more powerful and appropriate methods would need to be applied. And personally, I hate making an assumption I know is wrong. But the Central Limit Theorem, the Law of Large Numbers and Chebyshev's theorem all seem to protect the unwary.

Code for comparing:

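/* Each row is one cell of the 2x2 grp-by-response table.
   weight1-weight4 hold the cell counts for four scenarios:
     weight1: n=100 per group, 60% vs 40% responders
     weight2: n=100 per group, 96% vs 92% responders
     weight3: n=10  per group, 60% vs 40% responders
     weight4: n=10  per group, 90% vs 80% responders */
data one;
  input grp response weight1 weight2 weight3 weight4;
  cards;
1 0 40 4 4 1
1 1 60 96 6 9
2 0 60 8 6 2
2 1 40 92 4 8
;

/* Chi-squared-type statistics and exact tests for each scenario */
proc freq data=one;
  table grp*response / all exact;
  weight weight1;
run;

proc freq data=one;
  table grp*response / all exact;
  weight weight2;
run;

proc freq data=one;
  table grp*response / all exact;
  weight weight3;
run;

proc freq data=one;
  table grp*response / all exact;
  weight weight4;
run;

/* The same comparisons with a linear model, treating the 0/1 response
   as continuous; FREQ expands each row into that many observations */
proc glm data=one;
  class grp;
  model response = grp;
  freq weight1;
run;

proc glm data=one;
  class grp;
  model response = grp;
  freq weight2;
run;

proc glm data=one;
  class grp;
  model response = grp;
  freq weight3;
run;

proc glm data=one;
  class grp;
  model response = grp;
  freq weight4;
run;
quit;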

Good luck.

Steve Denham

datalligence
Fluorite | Level 6
Thank you Steve.

So my guess is Proc GLM uses the Z test for proportions. I also did a chi-square test on renewal rates between 2 groups, and a 2-sample test for proportions (using Z stat). The p-values came out very similar.

My question or doubt is, why or when is the Z-test for proportions called the T-test? According to the SAS online doc, "PROC GLM handles models relating one or several continuous dependent variables to one or several independent variables." So how does it handle a categorical/binomial dependent variable? Which test or statistics does it use when the dependent variable is binomial (Yes/No, 0/1)? And under what assumptions does it equate the categorical dependent variable to a continuous dependent variable?

Thanks again.
SteveDenham
Jade | Level 19
GLM always treats the response variable as continuous and as coming from a normal distribution. It doesn't use a Z test. The Z test assumes that you have a known variance, whereas a t test, and linear models in general, use the sample variance as an estimator. In answer to your question, "how does it handle categorical/binomial dependent variables", the short answer is: it ignores the fact that the variable is categorical or binomial. All responses are treated as continuous. For binomial responses, we have seen that this isn't too bad in a lot of cases, because we can sort of rank a yes/no response (code it 0/1, and the group mean is simply the proportion of yeses). For true categorical variables, such as product brands or various politicians, this can't really be done, and GLM is likely to give bad results.
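To make the Z-versus-t distinction concrete, here is a minimal sketch that computes both statistics by hand in a DATA step. It assumes the counts from the weight1 scenario in the earlier code (60 of 100 versus 40 of 100 responders); the data set name z_vs_t and all variable names are illustrative, not anything a SAS procedure produces.

```sas
/* Sketch only: z and pooled t statistics for two proportions.
   Counts are assumed from the weight1 scenario (60/100 vs 40/100);
   all names here are hypothetical. */
data z_vs_t;
  n1 = 100; x1 = 60;
  n2 = 100; x2 = 40;
  p1 = x1 / n1;
  p2 = x2 / n2;

  /* Z test: the variance comes from the binomial formula,
     evaluated at the pooled proportion under H0 */
  p_pool = (x1 + x2) / (n1 + n2);
  z      = (p1 - p2) / sqrt(p_pool * (1 - p_pool) * (1/n1 + 1/n2));
  p_z    = 2 * (1 - probnorm(abs(z)));

  /* Pooled t test: the variance is the pooled sample variance of the
     0/1 values; within a group the sum of squares is n*p*(1-p) */
  s2_pool = (n1*p1*(1 - p1) + n2*p2*(1 - p2)) / (n1 + n2 - 2);
  t       = (p1 - p2) / sqrt(s2_pool * (1/n1 + 1/n2));
  p_t     = 2 * (1 - probt(abs(t), n1 + n2 - 2));

  put z= p_z= t= p_t=;
run;
```

With these counts z comes out at about 2.83 (its square is the Pearson chi-square of 8) and t at about 2.87 (its square is the one-way F of 8.25), which is why the two p-values land so close together at this sample size.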

SAS has other procedures that are more appropriate for these sorts of distributions--LOGISTIC, GENMOD, GLIMMIX--that use the tools of linear models and recognize the distribution of the outcome variables. But these are "newer", and utilize methods that usually are not covered in intro stat courses. Consequently, GLM or TTEST are the tools that people have seen. And to quote a famous proverb, "When the only tool you have is a hammer, every problem looks like a nail." People go around hammering--sometimes the results aren't so bad, sometimes you break the crockery.
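As a rough illustration of the alternative, the weight1 scenario from the earlier code could be fitted with PROC LOGISTIC along these lines. This is a sketch under assumptions rather than a recommended analysis: the data set name resp_cells and the variable name count are made up for the example, and reference-cell coding is just one reasonable choice.

```sas
/* Sketch only: logistic regression on the same 2x2 cell counts.
   Data set and variable names are hypothetical. */
data resp_cells;
  input grp response count;
  datalines;
1 0 40
1 1 60
2 0 60
2 1 40
;

proc logistic data=resp_cells;
  class grp / param=ref;            /* reference-cell coding for the group effect */
  model response(event='1') = grp;  /* model the probability that response = 1 */
  freq count;                       /* each row stands for COUNT observations */
run;
```

The test for grp in the output plays the role of the GLM F test, but it is based on the binomial likelihood instead of a normal-errors assumption.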
datalligence
Fluorite | Level 6
Thank you so much Steve!

I have also found out how the Z test for proportions is very very similar to the "equal variances not assumed" version of Student's t test for independent samples when you have a binomial dependent variable (0/1).
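One way to see this in SAS is to run PROC TTEST directly on the 0/1 response. The sketch below assumes the 60/100 versus 40/100 counts used earlier in the thread; the data set name cells and the variable name count are illustrative. For 0/1 data the sample variance in each group is p(1-p) times n/(n-1), so the unequal-variance t statistic differs from the unpooled Z statistic only by using n-1 in place of n.

```sas
/* Sketch only: t tests on a 0/1 response stored as cell counts.
   Data set and variable names are hypothetical. */
data cells;
  input grp response count;
  datalines;
1 0 40
1 1 60
2 0 60
2 1 40
;

proc ttest data=cells;
  class grp;       /* the two groups being compared */
  var response;    /* 0/1 outcome, so each group mean is a proportion */
  freq count;      /* each row stands for COUNT observations */
run;
```

The output lists a Pooled row and a Satterthwaite row; the Satterthwaite row is the "equal variances not assumed" version.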

Gonna write about this on my blog. Thanks again.

http://datalligence.blogspot.com/
jballach
Calcite | Level 5


Is PROC GLM or TTEST appropriate for testing the differences in proportions/ratios of ordinal variables, such as Likert scale/survey variables?

Thank you very much,

John Ballach

SteveDenham
Jade | Level 19

Not really. Of the linear modeling techniques, look at GENMOD or GLIMMIX to handle proportions. The question I have is: what meaning does a ratio of Likert variables have? Just for example, suppose in example 1, scale 1 returns a 1 and scale 2 returns a 2. The ratio is 0.5. In example 2, scale 1 returns a 2 and scale 2 returns a 4. The ratio is 0.5. Same ratio, but a very different meaning. So a bit more background on the data and your research question is needed before I can make an educated guess as to how I would analyze the data.

Steve Denham

jballach
Calcite | Level 5

Thanks for the reply and question, Steve!

I work on a Customer Satisfaction team which reports/analyzes survey results for a major bank. All of our reports contain Top2 Box, and Bottom4 Box, scores for survey questions using the 10 pt Likert scale. A Top 2 Box score is a proportion and calculated as follows: (the count of responses to a particular survey question of a 9 or 10) divided by (the count of all responses to the same survey question).

For instance, let's say we have test and control based upon an IVR option which results in a group of customers selecting YES or NO to some question. And let's say we survey these customers, but we are only concerned with one of the survey questions, Q9, which is as follows: "Would you recommend our Company to a friend or colleague? Please use a scale of 1 to 10, where 1 is 'Definitely Not' and 10 is 'Definitely'" And then let's say we have the following data from the survey:

GROUP   Q9 RESPONSE COUNT   Q9 RESPONSE OF 9 OR 10 COUNT   Q9 TOP 2 BOX
N       117                 61                             52%
Y       145                 81                             56%

What is the best way, using SAS, for me to determine whether the difference between the Q9 Top 2 Box scores for the two groups in the table above is significant?

Again, thank you very much!

John

SteveDenham
Jade | Level 19

Quick easy method is PROC FREQ.

data one;
  input group $ response weight;
  datalines;
N 0 56
N 1 61
Y 0 64
Y 1 81
;

proc freq data=one;
  tables group*response / all;
  weight weight;
run;

This gives me a p-value of 0.5474, so I would say that there is no evidence for a difference between the proportions marking the top 2 levels (.5214 for N, .5586 for Y).

This approach is different from calculating the average Likert score, or the ratio of two specific scores.  It just dichotomizes the response.

Now if your data came from a sample from a known population of a given size, you might want to consider PROC SURVEYFREQ, to account for finite sampling adjustments.

Steve Denham

jballach
Calcite | Level 5

How  did you calculate the weights, Steve? Thanks.

John

SteveDenham
Jade | Level 19

Weights are the number of observations in each category. For instance, for group N you have 117 responses, of which 61 were 'Top 2'. That leaves 56 as not Top 2. I coded Top 2 as 1, and not Top 2 as 0, for the variable response.
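If the counts arrive as one row per group (total responses and Top 2 responses), the weights can be derived in a DATA step instead of by hand. This is a minimal sketch; the input data set name summary and the variable names n and top2 are assumptions made for the example.

```sas
/* Sketch only: expand one summary row per group into the two
   cell-count rows that PROC FREQ expects. Names are hypothetical. */
data summary;
  input group $ n top2;
  datalines;
N 117 61
Y 145 81
;

data one;
  set summary;
  response = 1; weight = top2;     output;  /* Top 2 responses */
  response = 0; weight = n - top2; output;  /* everything else */
  keep group response weight;
run;
```

The resulting data set has the same four cells as the one typed in above (56 and 61 for N, 64 and 81 for Y), so the PROC FREQ step runs on it unchanged.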

Steve Denham

jballach
Calcite | Level 5

Thank you very much. This has been insightful.

John Ballach

ballardw
Super User
> My question or doubt is, why or when is the Z-test
> for proportions called the T-test?


I suspect laziness, or a vaguely remembered first-semester stats class from which the only thing retained was 'T-test'.
Yazdan
Fluorite | Level 6

I have the same problem. Please consider my example: I have a table of the number of incidents that occurred in two-hour blocks (e.g. 0-2, 2-4, etc.). The data for each block were recorded over 4 years and are heavily inflated with zeros. So I have a table like:

Day         [0-2]   [2-4]   [4-6]   ...   [10-0]
March 1       0       1       1     ...     13
March 2       1       2       0     ...      2
...
March 30      0      10       0     ...      2

How can I compare the proportions of attacks that occurred during the different time blocks in March? Please give me a hand with the coding as well. Thank you!

Yazdan
