PROC LOGISTIC: Positive effect in logistic model where a negative one should occur

halkyos · Posted 09-27-2019 03:45 PM

I am working with risk and protective factor data for outcomes regarding substance use. My data is arranged so that I have an outcome as a binary variable (0 = no use, 1 = use), the total number of risk factors, and the total number of protective factors. Risk factors are known to increase the likelihood of an outcome occurring, and protective factors are known to have the opposite effect. Examination in PROC FREQ shows that the proportion of observations using a substance increases with the number of risk factors and decreases with the number of protective factors. When I use PROC LOGISTIC to fit a model, though, I am getting a positive effect from my protective factors. Here is my code:

PROC LOGISTIC DATA=survey DESCENDING;
  MODEL sub1= rfs pfs;
RUN;

sub1: binary variable where 1= using the substance and 0=not using the substance.

rfs: total number of risk factors.

pfs: total number of protective factors.

My results for one substance are giving me a model of logit(p(1)) = -3.1860 + 0.3033(rfs) + 0.1181(pfs). As a researcher I know that this is wrong: I don't have anomalous data where the population is more likely to use substances if they have more protective factors. I am having trouble figuring out how to correct this.

Reeza · Posted 09-27-2019 03:48 PM

Can you post your log?
And are rfs and pfs continuous or categorical?

halkyos · Posted 09-27-2019 04:00 PM

Here is my log:

306 PROC LOGISTIC DATA=survey DESCENDING;
307 MODEL sub1=rfs pfs;
308 RUN;

NOTE: PROC LOGISTIC is modeling the probability that sub1=1.
NOTE: Convergence criterion (GCONV=1E-8) satisfied.
NOTE: There were 14445 observations read from the data set WORK.SURVEY.
NOTE: PROCEDURE LOGISTIC used (Total process time):
real time 0.25 seconds
cpu time 0.15 seconds

rfs and pfs are non-negative whole numbers representing the number of risk or protective factors present. rfs ranges from 0 to 21 and pfs ranges from 0 to 12.

PaigeMiller · Posted 09-27-2019 03:49 PM

One reason this can occur is if your two x-variables rfs and pfs are highly correlated with each other. Another reason this can occur is if you have outliers or clusters in rfs and/or pfs.
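A minimal sketch of one way to check for this, assuming the dataset survey and the variables sub1, rfs, and pfs from the original post: fit each predictor alone, then both together. If pfs has a negative coefficient on its own but a positive one alongside rfs, the two predictors are sharing information. The PROC REG step is only a common workaround for obtaining variance inflation factors; it is not meant as a model for the binary outcome.

```sas
/* Sketch only: dataset and variable names are taken from the original post. */

/* Each predictor on its own */
PROC LOGISTIC DATA=survey;
  MODEL sub1(EVENT='1') = rfs;
RUN;

PROC LOGISTIC DATA=survey;
  MODEL sub1(EVENT='1') = pfs;
RUN;

/* Both predictors together, as in the original model */
PROC LOGISTIC DATA=survey;
  MODEL sub1(EVENT='1') = rfs pfs;
RUN;

/* Collinearity diagnostics: PROC REG is used here only for the VIF and
   tolerance values, which depend on the predictors and not on the outcome. */
PROC REG DATA=survey;
  MODEL sub1 = rfs pfs / VIF TOL;
RUN;
QUIT;
```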

--
Paige Miller

halkyos · Posted 09-27-2019 04:41 PM

I just tested for these possibilities: A chi-square test for independence indicates that rfs and pfs are independent. When entered into PROC AUTOREG for rfs=pfs and pfs=rfs the values are negatively correlated to each other:

chi-sq: 4876.6114, p<0.0001.

rfs=12.9714-0.7614(pfs), p<0.0001.

pfs=8.7403-0.2989(rfs), p<0.0001.

The overall distributions are almost textbook normal, and when stratified to whether or not the observation reported substance use the distribution of rfs for non substance users takes on a right-tail skew. All other distributions remain normal.

Reeza · Posted 09-27-2019 04:59 PM

P<0.0001 means related, not independent, doesn't it?

PaigeMiller · Posted 09-27-2019 05:14 PM

What is the correlation (not the auto-correlation from PROC AUTOREG but the correlation from PROC CORR) between rfs and pfs? Distribution of your x-variables is irrelevant here. Are there outliers or clusters among your x-variables?

--
Paige Miller

halkyos · Posted 10-01-2019 11:54 AM

My PROC CORR results are as follows:

There are no high or low outliers for either variable.

Ksharp · Posted 09-28-2019 07:57 AM

Change which response value the model treats as the event, and you get a different result:

PROC LOGISTIC DATA=survey;
  MODEL sub1(EVENT='0') = rfs pfs;
RUN;

versus

PROC LOGISTIC DATA=survey;
  MODEL sub1(EVENT='1') = rfs pfs;
RUN;
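For a binary response, the two calls describe the same fit with every sign reversed. A short derivation, writing p for P(sub1 = 1) and using beta for the coefficients (this notation is assumed here, it does not come from the thread):

```latex
\begin{aligned}
\operatorname{logit} P(\text{sub1}=1) &= \log\frac{p}{1-p}
  = \beta_0 + \beta_1\,\text{rfs} + \beta_2\,\text{pfs} \\
\operatorname{logit} P(\text{sub1}=0) &= \log\frac{1-p}{p}
  = -\log\frac{p}{1-p}
  = -\beta_0 - \beta_1\,\text{rfs} - \beta_2\,\text{pfs}
\end{aligned}
```

So switching the event level changes the sign of every coefficient, including the intercept, but it cannot change the sign of one predictor relative to the other.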

halkyos · Posted 09-30-2019 10:01 AM

I tried this before posting here. What it does is switch which of the two has the larger positive coefficient, but both remain positive. My office is renewing my license today, so I can't currently give you the exact coefficients, but what happens is it becomes sub1=y+pfs+rfs, where the coefficient of pfs > the coefficient of rfs and both coefficients are between 0 and 1.

Ksharp · Posted 10-01-2019 12:34 AM

Did you check the standard errors of these two coefficients?

halkyos · Posted 10-01-2019 11:41 AM

The standard errors are as follows:

rfs: 0.0417

pfs: 0.0261
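As a worked check of what these standard errors imply, assuming they belong to the coefficients quoted in the first post (0.3033 for rfs and 0.1181 for pfs), which the thread does not confirm:

```latex
\begin{aligned}
z_{\text{rfs}} &= \frac{0.3033}{0.0417} \approx 7.27, \qquad
z_{\text{pfs}} = \frac{0.1181}{0.0261} \approx 4.52 \\
\text{95\% CI for } \beta_{\text{pfs}} &: \; 0.1181 \pm 1.96 \times 0.0261
  \approx (0.067,\; 0.169)
\end{aligned}
```

Under that assumption both estimates are several standard errors away from zero, so the positive sign on pfs would not be explained by sampling noise alone.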

PaigeMiller · Posted 10-01-2019 07:04 AM

What is the correlation (not the auto-correlation from PROC AUTOREG but the correlation from PROC CORR) between rfs and pfs?

Are there outliers or clusters among your x-variables?

--
Paige Miller

Reeza · Posted 10-01-2019 10:50 AM

Can you show a PROC FREQ of rfs*pfs? I suspect you have some massive imbalances.
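A minimal sketch of that cross-tabulation, assuming the dataset and variable names from the original post. The options after the slash only suppress percentages so that the cell counts are easier to scan:

```sas
/* Sketch only: dataset and variable names are taken from the original post. */
PROC FREQ DATA=survey;
  /* Counts only; with rfs in 0-21 and pfs in 0-12 the table can have
     up to 22 x 13 cells, so row, column, and overall percentages are dropped. */
  TABLES rfs*pfs / NOROW NOCOL NOPERCENT;
RUN;
```

Empty or very sparse regions of the table would show which combinations of rfs and pfs the model has little data to separate.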

halkyos · Posted 10-01-2019 11:47 AM

PROC CORR is new to me, but looking at the guide on that one it seems pretty straightforward. I ran:

PROC CORR DATA=survey;
	VAR rfs pfs;
RUN;

My results are:
