Solved: Compare means of 3 different groups

Chi · Posted 08-09-2012 09:47 AM

Hello everyone,

I have two continuous variables: SBP (dependent) and LDL (independent), and SBP has been divided into tertiles. Using ANOVA in this case doesn't seem to be appropriate, does it?

How can I compare the means of LDL between these 3 tertiles?

Thanks.

mkeintz · Posted 08-09-2012 09:58 AM

You have an ordinal response variable and continuous predictors, so I would look at proc logistic.

SteveDenham · Posted 08-09-2012 11:43 AM

I am confused. If SBP has three classes after dividing into tertiles, then it should be the independent variable, with LDL the dependent variable. ANOVA is a good method for comparing the means, although I would worry about homogeneity of variance more here than I normally do (which in my case means a little worried, rather than hardly worried at all).
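A minimal sketch of the comparison described above, written in Python purely for illustration rather than in SAS. It splits one variable into tertiles and computes the one-way ANOVA F statistic for the other. The function names `tertile_groups` and `one_way_anova_f` and all data values are made up for this example; the p value would still have to come from an F distribution with the returned degrees of freedom.

```python
import statistics

def tertile_groups(split_values, response_values):
    """Group response values by the tertile of the paired split value."""
    # Two cut points that divide the split variable into three bins.
    low_cut, high_cut = statistics.quantiles(split_values, n=3)
    groups = ([], [], [])
    for s, r in zip(split_values, response_values):
        index = 0 if s <= low_cut else 1 if s <= high_cut else 2
        groups[index].append(r)
    return groups

def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    all_values = [v for g in groups for v in g]
    grand_mean = statistics.fmean(all_values)
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (statistics.fmean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the observations around their own group mean.
    ss_within = sum((v - statistics.fmean(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Made-up illustrative values, not data from this thread.
sbp = [110, 115, 118, 122, 125, 128, 131, 135, 140, 146, 152, 160]
ldl = [95, 102, 99, 110, 108, 115, 112, 121, 119, 130, 127, 138]

groups = tertile_groups(sbp, ldl)
f_stat, df_b, df_w = one_way_anova_f(groups)
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

Note that the cut points depend entirely on the sample at hand, which is the arbitrariness raised in the next paragraph.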

But why split into tertiles, and then check for differences in means? What does a plot of the raw SBP and LDL values show? Would a regression tell you more than simply comparing some means after a rather arbitrary split, entirely dependent on the make-up of your sample?

Steve Denham

Chi · Posted 08-09-2012 12:40 PM

Hi Steve,

Thanks for your reply. Actually, my study focuses on whether air pollutant A can cause increased intima-media thickness (IMT) of the carotid artery. My boss wants to know whether, if we separate the IMT of the carotid artery (our outcome variable) into tertiles, we will observe different levels of pollutant density (the independent variable). The problem is that both IMT and the pollutant are continuous variables, so it doesn't make sense to use ANOVA after IMT has been split into tertiles.

Chi

SteveDenham · Posted 08-09-2012 02:32 PM

Well, at least in some sense, it does, but I wouldn't do it unless I absolutely had to.

The problem here is that the causation direction is opposite the independent-dependent direction we ordinarily think of in designed experiments. Do you absolutely need a test and associated p value? If so, then consider the tertiles as groups and use analysis of variance to get the p values. There is a fair amount of literature out there, especially in the survey fields, where one variable is split into quintiles, and then used as a grouping variable to analyze other variates.

Another approach would be to construct simultaneous nonparametric confidence intervals on the medians, and make inferences from that. I would prefer this, as I really don't think the distributions of pollutant are going to be real similar from tertile to tertile. I expect the first tertile to be left skewed, the middle to be platykurtic, and the third to be right skewed.
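One way such intervals could be built, sketched in Python for illustration only. It assumes a specific method that is not spelled out above: a distribution-free interval for each median based on order statistics (the sign-test interval), with a Bonferroni split of the error rate across the three tertiles to make the set simultaneous. The function names and data are made up for this example.

```python
import math

def median_ci(values, alpha):
    """Distribution-free CI for the median from order statistics.

    Returns (lower, upper, coverage), or None if the sample is too
    small to reach the requested level.
    """
    data = sorted(values)
    n = len(data)
    tail = 0.0  # P(B <= k - 1) for B ~ Binomial(n, 1/2)
    k = 0       # rank of the lower order statistic (1-based)
    for j in range(n // 2):
        next_tail = tail + math.comb(n, j) / 2 ** n
        if 2 * next_tail > alpha:
            break
        tail = next_tail
        k = j + 1
    if k == 0:
        return None
    # Interval [x_(k), x_(n-k+1)] has exact coverage 1 - 2 * P(B <= k - 1).
    return data[k - 1], data[n - k], 1 - 2 * tail

def simultaneous_median_cis(groups, family_alpha=0.05):
    """Bonferroni-adjusted median intervals, one per group."""
    per_group_alpha = family_alpha / len(groups)
    return [median_ci(g, per_group_alpha) for g in groups]

# Made-up pollutant values for three tertiles, not data from this thread.
tertile_samples = [
    [4.1, 5.0, 5.3, 5.8, 6.0, 6.2, 6.9, 7.4, 7.7, 8.3],
    [5.2, 6.1, 6.8, 7.0, 7.5, 7.9, 8.4, 8.8, 9.6, 10.1],
    [6.7, 7.3, 8.2, 8.9, 9.4, 9.9, 10.5, 11.2, 12.0, 13.4],
]
for interval in simultaneous_median_cis(tertile_samples):
    print(interval)
```

Because the intervals use only ranks, they make no assumption about the shape of the distribution within each tertile, at the cost of being wide when groups are small.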

But I still think plotting the data, and then developing a regression relationship (linear, polynomial, nonlinear), would tell you a lot more than whether arbitrary groups have different means for the variable that may itself be causing the groups to differ.
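As a minimal sketch of the simplest case, a straight-line fit of IMT on pollutant level using the raw values, again in Python for illustration only. The function name `fit_line` and the data are made up for this example; a polynomial or nonlinear fit would need a proper statistics package, and the data should be plotted first to see which form is plausible.

```python
import statistics

def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x.

    Returns (slope, intercept, r_squared).
    """
    x_mean = statistics.fmean(x)
    y_mean = statistics.fmean(y)
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    syy = sum((yi - y_mean) ** 2 for yi in y)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    # Share of the variation in y explained by the fitted line.
    r_squared = sxy ** 2 / (sxx * syy)
    return slope, intercept, r_squared

# Made-up illustrative values, not data from this thread.
pollutant = [8.1, 9.4, 10.2, 11.5, 12.3, 13.8, 14.6, 15.9]
imt = [0.55, 0.58, 0.60, 0.63, 0.62, 0.68, 0.70, 0.73]

slope, intercept, r_squared = fit_line(pollutant, imt)
print(f"IMT = {intercept:.3f} + {slope:.4f} * pollutant (R^2 = {r_squared:.2f})")
```

Here the pollutant is the predictor and IMT the response, which matches the causal direction of the study instead of reversing it as the tertile comparison does.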

Steve Denham

Ksharp · Posted 08-09-2012 10:06 PM

Is it a balanced experimental design? That is, does every group (SBP) have the same number of observations? If so, you can use ANOVA, but remember that it assumes a normal distribution.

If instead the design is unbalanced (the groups do not have the same number of observations), maybe you should use proc glm, but then the analysis is harder to carry out and the results are harder to explain. As we all know, it comes down to SS1 versus SS3 (Type I versus Type III sums of squares).

Ksharp
