🔒 This topic is solved and locked.
GLO1
Fluorite | Level 6

Hi,

 

Could someone advise me on the following? I have done a lot of separate regression analyses using proc reg and proc logistic. I would like to correct for doing so many tests, but I am not sure how to do this in my situation. Is there an option to simply include a statement in the code for each regression analysis that corrects for multiple testing? Or should I work with proc multtest? In that case, should I collect all the p-values from each regression?

 

This is the format of the code I repeated quite a few times:

 

ods graphics on;
proc logistic data = dataset plots=roc;
model y=x/EXPB CL RSQ;
effectplot;
run;
ods graphics off;

proc reg data = dataset;
model y2=x2;
run;


Thank you in advance!


7 REPLIES
PaigeMiller
Diamond | Level 26

There's no reason to do a "multiple testing" correction in the example code you provided. You have only one x-variable in each model.

--
Paige Miller
GLO1
Fluorite | Level 6

@PaigeMiller Thank you for your response.

Yes, that is true; however, I repeat the tests many times (about 30), which increases the risk of results that appear significant but aren't. Maybe just set the p-value threshold to 0.01? What would you recommend?

SteveDenham
Jade | Level 19

You could output all of the p-values from the REG and LOGISTIC runs, and then use PROC MULTTEST to adjust. See Example 83.5, Inputting Raw p-Values, here: https://documentation.sas.com/?docsetId=statug&docsetTarget=statug_multtest_examples05.htm&docsetVer...

 

You can choose from many different adjustment methods (there are 15 options for control of family-wise error, but some cannot be used on a data set of raw p-values).

I prefer a step-down Sidak adjustment, but a case can be made for any of the methods.
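As a rough, untested sketch of what that looks like (the p-values below are made up purely for illustration; with INPVALUES= the p-value variable is expected to be named raw_p):

/* Hypothetical raw p-values collected from separate analyses (illustration only) */
data rawp;
   input test $ raw_p @@;
   datalines;
Test1 0.012  Test2 0.210  Test3 0.048  Test4 0.003  Test5 0.380
;

/* Compare several adjustments: Bonferroni, Holm, step-down Sidak, and FDR */
proc multtest inpvalues=rawp bonferroni holm stepsid fdr;
run;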

 

SteveDenham

GLO1
Fluorite | Level 6

@SteveDenham 

Thank you for your suggestion. I am trying to do it your way now; however, I cannot get the p-values from my REG and/or LOGISTIC runs together in one dataset. Any code suggestions on how to do so?

 

Thanks a lot!

SteveDenham
Jade | Level 19

So in each of your REG and LOGISTIC runs you should be getting some p-values (I don't know whether these regressions have multiple predictors or not, but for now let's assume they are univariate). Using ODS OUTPUT you can get these into datasets. Then it is just a matter of shaping the datasets, setting them together, and running PROC MULTTEST.

From REG, look at the tables named ANOVA (for a p-value for the whole model) and ParameterEstimates (for p-values for individual parameters). From LOGISTIC, look for ModelANOVA (for p-values related to the whole model) and ParameterEstimates (for p-values for individual parameters). After identifying the correct row, keep the p-value and some identifier. That would give you a dataset that looks like the one in the example, where the identifier is called Test1, Test2, and so on.
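Something along these lines might work as a starting point (an untested sketch based on your posted models; Probt and ProbChiSq are the p-value columns those ParameterEstimates tables usually carry, so verify the names against your own ODS output):

/* Capture the parameter-level p-values from each procedure */
ods output ParameterEstimates=pe_reg;
proc reg data=dataset;
   model y2=x2;
run;
quit;

ods output ParameterEstimates=pe_log;
proc logistic data=dataset;
   model y=x / expb cl rsq;
run;

/* Keep the slope rows, rename each p-value column to raw_p,
   and attach an identifier for each test */
data pvals;
   length test $ 20;
   set pe_reg(where=(Variable ne 'Intercept') rename=(Probt=raw_p))
       pe_log(where=(Variable ne 'Intercept') rename=(ProbChiSq=raw_p));
   test = cats('Test', _n_);
   keep test raw_p;
run;

/* Adjust across all of the collected tests at once */
proc multtest inpvalues=pvals stepsid;
run;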

Then comes the difficult part (at least for me), and here following @PGStats' advice is critical: selecting the appropriate adjustment method.

 

SteveDenham

PGStats
Opal | Level 21

You should be using proc multtest. The choice of which tests to include in the set is up to you. Some facetious statisticians have suggested that every test ever performed since the beginning of your career should be considered! I suggest including every test that was relevant to answering a specific research question.

PG


