Csands
Calcite | Level 5

Hi.

I am running a logistic regression with a binary dependent variable and 5 class independent variables.

The used code is:

proc logistic data=train;

class var1 var2 var3 var4 var5 / param=GLM;

model pred12 (event='2')= var1 var2 var3 var4 var5 / RSQ;

run;

And the partial output is:

Effect   DF   Wald Chi-Square   Pr > ChiSq
Var1      1          150,7266       <.0001
Var2      3          119,5550       <.0001
Var3      8          157,9586       <.0001
Var4      6         1553,0700       <.0001
Var5      4        15975,6288       <.0001

Analysis of Maximum Likelihood Estimates
Parameter        DF   Estimate   Standard Error   Wald Chi-Square   Pr > ChiSq
Intercept         1     5,0054           0,2974          283,3322       <,0001
...
Var4       1      1    -1,9443           0,2969           42,8854       <,0001
Var4       2      1    -1,6971           0,296            32,8692       <,0001
Var4       3      1    -0,9009           0,2951            9,3197       0,0023
Var4       4      1    -1,0116           0,2957           11,7065       0,0006
Var4       5      1    -0,4524           0,2963            2,3319       0,1267
Var4       6      1     0,0255           0,3039            0,0071       0,933
Var4     999      0     0                .                 .            .
Var5       1      1    -1,2442           0,0445          782,4054       <,0001
Var5       2      1    -1,5483           0,0364         1811,5691       <,0001
Var5       3      1    -2,108            0,0304         4793,1049       <,0001
Var5       4      1    -3,1394           0,0259        14693,0821       <,0001
Var5     999      0     0                .                 .            .

My question concerns the Wald chi-square. How should I interpret the fact that the Wald chi-square for var5 is so much higher than the values for the other variables? And what are the consequences for the quality of the regression?

Thanks in advance for the help.

ACCEPTED SOLUTION
SteveDenham
Jade | Level 19

Think of how a Wald chi-square is calculated = (estimate/stderr)**2.  In a way, it is an effect size squared.  Look at the estimates--large for variable 5 compared to the reference category, with relatively small standard errors.  It looks like they have a large effect.  You could add the type3 option to your model statement to get overall tests of marginal differences in the levels of each of the variables.  On the other hand, do not give undue weight to the chi-squared values for each level of a variable.  As far as regression "quality", check the INFLUENCE and LACKFIT options for the model statement.
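
To see the formula at work, here is a minimal sketch that recomputes the Wald chi-square for four rows of the output posted in the question. The data set name wald_check and its variable names are made up for illustration, and the decimal commas of the posted output are rewritten as decimal points. The results should land close to the printed Wald values, with small differences because the displayed estimates and standard errors are rounded.

```sas
/* Recompute Wald chi-square = (estimate / stderr)**2 for a few rows */
/* of the posted output. wald_check is a hypothetical data set name. */
data wald_check;
   input parameter $ level estimate stderr;
   wald = (estimate / stderr)**2;     /* Wald chi-square with 1 DF      */
   p    = 1 - probchi(wald, 1);       /* upper-tail chi-square p-value  */
   datalines;
Var4 1 -1.9443 0.2969
Var4 6  0.0255 0.3039
Var5 1 -1.2442 0.0445
Var5 4 -3.1394 0.0259
;
run;

proc print data=wald_check noobs;
run;
```

The Var4 and Var5 estimates are of similar magnitude, but the Var5 standard errors are roughly ten times smaller, which is what inflates the Var5 chi-squares.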

Steve Denham


Csands
Calcite | Level 5

Thank you for your very helpful answer.

Since your answer I have studied these issues further and tried some of the options you suggested.

One of my main tasks is to explain to the users the impact of each variable (crucial in this study because of the large effect of variable 5). I have seen this impact expressed as a rate in a document (for example, if I am predicting vardep {0,1} with two independent variables, var1 is said to represent 20% and var2 80% of the probability of achieving vardep=1), but it did not show how the rates were calculated. Do you have any idea how this is done?

About the options you suggested: I couldn't find the type3 option for the MODEL statement in the SAS support documentation

http://support.sas.com/documentation/cdl/en/statug/65328/HTML/default/viewer.htm#statug_logistic_syn... , can you help me with that?

The LACKFIT option gave me no problem, but I couldn't run the INFLUENCE option because my data set is very large. Is there any way to save this information to a SAS data set instead of the SAS output?


Thanks again.

C.

SteveDenham
Jade | Level 19

I apologize.  The type3 option is available in GENMOD, which is where I do most of my fixed effect logistic modeling.  I am sorry that I pointed you at the wrong option.  Instead, if you are on SAS/STAT 12.1, you should look at the EFFECT statement.

effect variable2=multimember (var2);

However, I can't seem to pin down the documentation--in the LOGISTIC Procedure documentation it does say "is a multimember classification effect whose levels are determined by one or more variables that appear in a CLASS statement," so I believe it should give an overall test.  You may need more than one EFFECT statement, or perhaps you can add the others into a single statement.

I think the only way to get influence option to a dataset would be to use ODS.  You would probably have to close the output listing destination, and output everything of interest, or maybe use ODS select to display only the tables of interest.  This assumes that the cause of not running is overflow of the output destination.  If it is a memory problem, I really don't know what to offer--perhaps others can help.
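
A minimal sketch of the ODS SELECT idea, reusing the model from the question. It assumes that ParameterEstimates and Type3 are the ODS table names PROC LOGISTIC uses for the two tables shown in the question (running ods trace on; first would confirm the names on your installation), and work.infl is a made-up data set name.

```sas
/* Display only the two tables of interest; names are assumed and  */
/* can be checked with ODS TRACE ON.                               */
ods select ParameterEstimates Type3;

proc logistic data=train;
   class var1 var2 var3 var4 var5 / param=GLM;
   model pred12 (event='2') = var1 var2 var3 var4 var5 / influence;
   ods output Influence=work.infl;   /* work.infl is a hypothetical name */
run;

ods select all;                      /* restore the default selection */
```

If the failure comes from memory rather than from the size of the displayed output, this sketch will not help.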

Steve Denham

Csands
Calcite | Level 5

Don't apologize, you were a big help.

Concerning the INFLUENCE option, you are right: with the statement ' ods output influence=<table name>; ' SAS produces a table. The problem is that it still writes the table to the SAS output (in this case 200.000 observations), and if I use the NOPRINT option in the PROC LOGISTIC statement, the ODS statement is ignored.

Thanks.

C.

SteveDenham
Jade | Level 19

NOPRINT turns off ODS output as well as the listing.

I suggest, if you want to suppress all output:

ods listing close;

<insert PROC LOGISTIC code here, including the ODS output statement>

ods listing;

If you only want to suppress the influence output, you could try:

ods exclude influence;

ods output influence=<table name goes here>;

I hope these work for you.

Steve Denham


Csands
Calcite | Level 5

Thanks Steve.

The code below totally solves my problem.

ods exclude influence;

proc logistic data=train;

class var1 var2 var3 var4 var5 / param=GLM;

model pred12 (event='2')= var1 var2 var3 var4 var5 / RSQ influence lackfit;

ods output influence=estatinfluence;

run;

To suppress all output I think the statements are:

ods html close;

....

ods html;
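
A destination-independent variant, as a sketch only: ODS EXCLUDE ALL suppresses displayed output on whichever destinations happen to be open (listing, HTML, ...) while ODS OUTPUT still writes the data set, so nothing has to be closed and reopened. It reuses the names from the code above.

```sas
ods exclude all;      /* suppress every displayed table on all open destinations */

proc logistic data=train;
   class var1 var2 var3 var4 var5 / param=GLM;
   model pred12 (event='2') = var1 var2 var3 var4 var5 / RSQ influence lackfit;
   ods output influence=estatinfluence;   /* the data set is still created */
run;

ods exclude none;     /* restore normal output for later steps */
```

Unlike NOPRINT, this leaves the ODS OUTPUT statement in effect.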

Thanks for all your help.

C.
