Re: Multiple groups comparison

Sujithpeta · Posted 01-25-2020 01:03 PM

Hey,

I need help figuring out which test or best practice suits my problem. I have data on cost and on rates of various events (e.g., admissions, adverse drug reactions) for 4 different drug groups; every outcome is continuous.

I want to say that drug A is cheaper than drugs B, C, and D. I'm not sure ANOVA is helpful here, as it only says whether any group differs from the others (correct me if I'm wrong). I also want to avoid multiple t-tests (A vs. B, A vs. C, and A vs. D): I feel that's not best practice, since the 5% error rate has to be adjusted across the comparisons, and I'd have to show them in the paper in a new table, which would take up space.

Appreciate your help. Thanks

PGStats · Posted 01-25-2020 03:00 PM

If a linear model doesn't violate any ANOVA assumptions, consider the MEANS statement in PROC GLM with the DUNNETTU option, which applies Dunnett's multiple-testing adjustment for one-sided comparisons against a control group.
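A minimal sketch of what that could look like, assuming a patient-level dataset named claims with a drug variable (values 'A' through 'D') and a cost variable; those names are hypothetical and not from the original post:

```sas
/* Hypothetical names: dataset claims, CLASS variable drug, response cost */
proc glm data=claims;
   class drug;
   model cost = drug;
   /* DUNNETTU with 'A' as the control level: one-sided Dunnett tests of  */
   /* whether each other drug's mean cost is greater than drug A's mean.  */
   means drug / dunnettu('A');
run;
quit;
```

DUNNETTU tests whether a treatment mean is greater than the control mean, so with drug A as the control it matches the "A is cheaper" direction. DUNNETTL is the corresponding lower-tailed option.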

PG

PaigeMiller · Posted 01-25-2020 04:35 PM

@Sujithpeta wrote:

I want to say that drug A is cheaper than drugs B, C, and D. I'm not sure ANOVA is helpful here

"Cheaper"? I don't think that's something you would show statistically.

--
Paige Miller

Sujithpeta · Posted 01-28-2020 09:55 AM

If drug A's mean cost is lower than that of every other group, and the differences from each of the other drug groups are statistically significant, shouldn't that mean drug A is cheaper with statistical significance? Correct me if I'm wrong.

PaigeMiller · Posted 01-28-2020 10:10 AM

@Sujithpeta wrote:

If drug A's mean cost is lower than that of every other group, and the differences from each of the other drug groups are statistically significant, shouldn't that mean drug A is cheaper with statistical significance? Correct me if I'm wrong.

Perhaps I'm not as familiar with drug testing as you are, but cost is normally not a random variable; it is fixed, so you wouldn't perform statistical testing on it. And if you perform the statistical testing on the rate of adverse events in the study, that says nothing about cost. So I don't see "cheaper with statistical significance" as a meaningful phrase.

--
Paige Miller

PGStats · Posted 01-28-2020 03:08 PM

What is your sampling frame for drug prices?

PG

Sujithpeta · Posted 01-29-2020 08:49 PM

Pancreatic cancer patients are identified from medical claims data and I'm analyzing the outcomes by pancreatic cancer drug type.

Did I answer your question?

PGStats · Posted 01-29-2020 11:14 PM

It is still not clear to me what kind of relationship you are analysing between cancer treatment outcome, drug type, and drug price.

PG

Sujithpeta · Posted 01-30-2020 08:32 AM

Pancreatic cancer patients are tracked for 1 year, and they are treated with different cancer drugs depending on either their clinical condition or the physician's or hospital's treatment pathway. We tracked each patient's 1-year cost (i.e., what insurance paid for the patient's treatment over 1 year), 1-year adverse event rates, and 1-year inpatient, ER, and other admissions.

When we compare these outcomes by drug type, we see, for example, that the average cost is lower for patients receiving drug A than for the others, or that patients receiving drug C have fewer ER admissions. So far we have only compared the aggregate values, but we want to say whether the differences we are seeing in the data are statistically significant.

I hope this gives you the picture.

StatDave · Posted 01-27-2020 11:37 AM

Regarding your outcomes: if an outcome "rate" is really a proportion (binary for each individual subject and therefore constrained to be between 0 and 1), then you need to use a binary response model. If the values are continuous because they are proportions accumulated over many subjects and you have the numerator and denominator counts of each proportion, then you can use the events/trials syntax in the MODEL statement of PROC LOGISTIC. See the discussion and examples in the PROC LOGISTIC documentation.

If the outcome is truly a "rate" in the sense that it is a count of events in some exposure size and could theoretically or actually exceed 1, then you need to fit a count response model in PROC GENMOD with an OFFSET= variable that is the log of the exposure size. See the discussion and examples in the GENMOD documentation.

In either procedure, you can specify the LSMEANS statement to make comparisons among the levels of the CLASS predictor (drugs). Use the DIFF option to produce the comparisons. You can also use the ADJUST= option to adjust the p-values for the multiple testing.
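As a rough sketch of the two cases above, assuming hypothetical dataset and variable names (drug_summary with events and trials counts per row; claims with er_visits and person_years per patient), none of which come from the original posts:

```sas
/* Case 1: proportions with numerator/denominator counts (events/trials syntax). */
/* PARAM=GLM is needed for the LSMEANS statement in PROC LOGISTIC.               */
proc logistic data=drug_summary;
   class drug / param=glm;
   model events/trials = drug;
   /* Compare each drug with drug A, with Dunnett-adjusted p-values */
   lsmeans drug / diff=control('A') adjust=dunnett;
run;

/* Case 2: event counts over an exposure; the offset is the log of the exposure. */
data claims_rate;
   set claims;
   log_exposure = log(person_years);
run;

proc genmod data=claims_rate;
   class drug;
   model er_visits = drug / dist=poisson link=log offset=log_exposure;
   lsmeans drug / diff=control('A') adjust=dunnett;
run;
```

A plain DIFF with, say, ADJUST=TUKEY would give all pairwise comparisons instead of comparisons against drug A. The Poisson distribution here is only an illustrative choice; if the counts are overdispersed, DIST=NEGBIN may be a better fit.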
