D_Scott
Calcite | Level 5

Hi all

I am new to SAS and to advanced statistics, and I am having trouble understanding the result of a logistic regression.

I am doing this work for a job interview, so I need advice urgently.

I am analysing NFL trends in qualifying for the playoffs.

I would expect a stat called Combined EPA to be heavily correlated with qualifying for the NFL Playoffs.

When I put Qualifying on the Y axis and Combined EPA on the X axis, I get the following result:

Fit Y by X.jpg

This appears to imply that as Combined EPA increases, the probability of not qualifying increases, which is the opposite of my hypothesis.

However, I then plotted the graph with the axes switched and got the following result:

Fit Y by X b.jpg

This graph clearly shows that teams that qualified have, on average, a higher Combined EPA than teams that didn't qualify.

Given these two graphs, could someone please explain the apparent correlation in the first graph between not qualifying and a higher Combined EPA?

Thank you in advance for any assistance.

D Scott

4 REPLIES
Rick_SAS
SAS Super FREQ

You are correct to be confused. I understand the second graph. However, the first graph has some unknown continuous variable plotted on the vertical axis. Notice that there are points with vertical coordinates 0.3 and 0.25, so the vertical axis is not showing a dichotomous variable.

Is this output from JMP? Perhaps the JMP user's guide would shed some light on what is being plotted. There is also a dedicated JMP discussion forum at https://communities.sas.com/community/support-communities/jmp_software

lvm
Rhodochrosite | Level 12

If you were using the LOGISTIC procedure in SAS, I would ask to see your code. But it looks like you are using JMP (?). In LOGISTIC, the default is to model the probability of "0" (or the lowest code), not the probability of "1". This confuses many people, because every estimate can appear reversed. There are ways to reverse this in LOGISTIC (including using DESCENDING in the procedure statement). I don't know about JMP. Your first graph is likely a plot of the estimated probability (a continuous variable from 0 to 1) versus the predictor.
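The reversal described above can be illustrated with a minimal sketch. This is Python on simulated data, not SAS or JMP code, and it is not the original poster's analysis: the data, the true coefficients, and the helper `fit_logistic` are all assumptions made for illustration. It fits the same one-predictor logistic model twice, once modeling the probability of "1" and once modeling the probability of "0", and shows that the two sets of estimates are mirror images.

```python
import numpy as np

def fit_logistic(x, y, n_iter=25):
    """Fit P(y=1 | x) = 1 / (1 + exp(-(b0 + b1*x))) by Newton-Raphson.

    Hypothetical helper for illustration; returns [b0, b1].
    """
    X = np.column_stack([np.ones_like(x), x])    # intercept + predictor
    beta = np.zeros(2)
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(X @ beta)))    # current fitted probabilities
        w = p * (1.0 - p)                        # weights for the Hessian
        grad = X.T @ (y - p)                     # score vector
        hess = X.T @ (X * w[:, None])            # observed information
        beta = beta + np.linalg.solve(hess, grad)
    return beta

# Simulated data (assumption): larger x makes the event y = 1 more likely.
rng = np.random.default_rng(0)
x = rng.normal(size=200)
true_p = 1.0 / (1.0 + np.exp(-(-0.5 + 1.5 * x)))
y = (rng.uniform(size=200) < true_p).astype(float)

beta_event1 = fit_logistic(x, y)         # models P(y = 1)
beta_event0 = fit_logistic(x, 1.0 - y)   # models P(y = 0), the "lowest code"

print("modeling P(y=1):", beta_event1)
print("modeling P(y=0):", beta_event0)
```

Both fits describe the same relationship. The slope is positive when the modeled event is "1" and negative by the same amount when the modeled event is "0", so a curve that seems to contradict the hypothesis may only reflect which level is being modeled.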

Rick_SAS
SAS Super FREQ

The Parameter Estimates table only shows one regressor. Are you showing us the whole table?

LVM: if there were multiple covariates, I would say your guess is correct, but for a one-variable regression, isn't the predicted probability equal to the sigmoidal curve that is shown? How can several observations that have the same value of X get mapped to distinct predicted probabilities? I'm not doubting your explanation about the interpretation of the scatter plot, but it seems like there is more going on here than has been revealed.
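The point about a one-variable model can be sketched in a few lines of Python. The coefficients `b0` and `b1` and the X values below are hypothetical, chosen only for illustration; they are not estimates from the poster's data. With a single predictor, the predicted probability is a function of X alone, so observations that share an X value must share a predicted probability.

```python
import numpy as np

def predicted_probability(x, b0, b1):
    """Predicted probability from a one-predictor logistic model."""
    return 1.0 / (1.0 + np.exp(-(b0 + b1 * x)))

# Hypothetical coefficients (assumption, not fitted to any real data).
b0, b1 = -0.5, 1.5

# Three observations share x = 0.2; a fourth has x = 1.0.
x = np.array([0.2, 0.2, 0.2, 1.0])
p = predicted_probability(x, b0, b1)
print(p)
```

The first three values are identical, so points stacked at the same X with different heights cannot all be predicted probabilities from a single-regressor fit.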

lvm
Rhodochrosite | Level 12

Agreed. Without seeing the code or the data, I really can't tell what is being plotted.
