🔒 This topic is solved and locked.
GLO1
Fluorite | Level 6

Hi,

 

Could someone advise me on the following? I have done a lot of separate regression analyses using proc reg and proc logistic. I would like to correct for doing so many tests, but I am not sure how to do this in my situation. Is there an option to simply include a statement in the code for each regression analysis that corrects for multiple testing? Or should I work with proc multtest? In that case, should I collect all the p-values from each regression?

 

This is the format of the code I repeated quite a few times:

 

ods graphics on;
proc logistic data = dataset plots=roc;
model y=x/EXPB CL RSQ;
effectplot;
run;
ods graphics off;

proc reg data = dataset;
model y2=x2;
run;


Thank you in advance!


7 REPLIES
PaigeMiller
Diamond | Level 26

There's no reason to do a "multiple testing" correction in the example code you provided. You have only one x-variable in each model.

--
Paige Miller
GLO1
Fluorite | Level 6

@PaigeMiller Thank you for your response.

Yes, that is true; however, I repeat the tests many times (about 30), which increases the risk of results that appear significant but aren't. Maybe just set the p-value threshold to 0.01? What would you recommend?

SteveDenham
Jade | Level 19

You could output all of the p-values from the REG and LOGISTIC runs, and then use PROC MULTTEST to adjust. See Example 83.5, Inputting Raw p-Values, here: https://documentation.sas.com/?docsetId=statug&docsetTarget=statug_multtest_examples05.htm&docsetVer...

 

You can choose from many different adjustment methods (there are 15 options for control of family-wise error, but some cannot be used on a data set of raw p-values).

I prefer a step-down Sidak adjustment, but a case can be made for any of the methods.
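As a rough, untested sketch of what that looks like (the p-values below are made up purely for illustration; with INPVALUES= the p-value variable is expected to be named raw_p):

/* Hypothetical raw p-values collected from separate analyses (illustration only) */
data rawp;
   input test $ raw_p @@;
   datalines;
Test1 0.012  Test2 0.210  Test3 0.048  Test4 0.003  Test5 0.380
;

/* Compare several adjustments: Bonferroni, Holm, step-down Sidak, and FDR */
proc multtest inpvalues=rawp bonferroni holm stepsid fdr;
run;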

 

SteveDenham

GLO1
Fluorite | Level 6

@SteveDenham 

Thank you for your suggestion. I am trying to do it your way now; however, I cannot get the p-values from my REG and/or LOGISTIC runs together in one dataset. Any code suggestions on how to do so?

 

Thanks a lot!

SteveDenham
Jade | Level 19

So in each of your REG and LOGISTIC runs you should be getting some p-values (I don't know whether these regressions have multiple predictors or not, but for now let's assume they are univariate). Using ODS OUTPUT you can get these into datasets. Then it is just a matter of shaping the datasets, setting them together, and running PROC MULTTEST.

From REG, look at the tables named ANOVA (for a p-value for the whole model) and ParameterEstimates (for p-values for individual parameters). From LOGISTIC, look for ModelANOVA (for p-values related to the whole model) and ParameterEstimates (for p-values for individual parameters). After identifying the correct row, keep the p-value and some identifier. That would give you a dataset that looks like the one in the example, where the identifier is called Test1, Test2, and so on.
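Something along these lines might work as a starting point (an untested sketch based on your posted models; Probt and ProbChiSq are the p-value columns those ParameterEstimates tables usually carry, so verify the names against your own ODS output):

/* Capture the parameter-level p-values from each procedure */
ods output ParameterEstimates=pe_reg;
proc reg data=dataset;
   model y2=x2;
run;
quit;

ods output ParameterEstimates=pe_log;
proc logistic data=dataset;
   model y=x / expb cl rsq;
run;

/* Keep the slope rows, rename each p-value column to raw_p,
   and attach an identifier for each test */
data pvals;
   length test $ 20;
   set pe_reg(where=(Variable ne 'Intercept') rename=(Probt=raw_p))
       pe_log(where=(Variable ne 'Intercept') rename=(ProbChiSq=raw_p));
   test = cats('Test', _n_);
   keep test raw_p;
run;

/* Adjust across all of the collected tests at once */
proc multtest inpvalues=pvals stepsid;
run;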

Then comes the difficult part (at least for me), and here following @PGStats' advice is critical: selecting the appropriate adjustment method.

 

SteveDenham

PGStats
Opal | Level 21

You should be using proc multtest. The choice of which tests to include in the set is up to you. Some facetious statisticians have suggested that every test ever performed since the beginning of your career should be considered! I suggest including every test that was relevant to answering a specific research question.

PG


