Solved: Re: OR in proc logistic

HansP · Posted 03-05-2019 03:49 AM

Is it possible to calculate the OR for the risk of doubling the value of X?

I know that I can use UNITS = 1 or UNITS = SD, but I don't understand how people can report ORs for doubling X. Doubling might mean going from X = 1 to X = 2 (a one-unit change), but also, e.g., from X = 2 to X = 4 (a two-unit change), and these would give different ORs. Therefore, I wonder whether it is possible to calculate just one OR for doubling a specific covariate.

FreelanceReinh · Posted 03-05-2019 06:06 AM

Hello @HansP and welcome to the SAS Support Communities!

You could derive a new covariate L=log(X)/log(2) and then compute the odds ratio per one-unit increase in L.
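
To see why a single odds ratio exists in this case, here is a short derivation. It assumes the model is linear in the logit for L = log2(X); the symbols beta_0 and beta_1 are notation introduced here for the intercept and slope, not something taken from the post.

```latex
\operatorname{logit} p(x) = \beta_0 + \beta_1 \log_2 x
\quad\Longrightarrow\quad
\frac{\operatorname{odds}(2x)}{\operatorname{odds}(x)}
  = \exp\bigl(\beta_1 [\log_2(2x) - \log_2 x]\bigr)
  = \exp(\beta_1)
```

Since log2(2x) - log2(x) = 1 for every x > 0, the odds ratio for doubling X equals the odds ratio per one-unit increase in L, whatever the starting value of X.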

Example (simulated data with a true odds ratio of 2.34 per doubling of X):

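/* Create test data for demonstration */

data test;
  call streaminit(3141592);          /* fix the seed for reproducibility */
  do id=1 to 1000000;
    X=rand('uniform',0.5,5);         /* predictor, uniform between 0.5 and 5 */
    L=log(X)/log(2);                 /* log2(X): increases by 1 when X doubles */
    p=logistic(-3.45+log(2.34)*L);   /* true model: OR of 2.34 per unit of L */
    event=rand('bern',p);            /* binary outcome with probability p */
    output;
  end;
run;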

/* Estimate odds ratio for doubling X */

proc logistic data=test desc;   /* DESC: model the probability of event=1 */
  model event=L;
run;

/* Check two empirical odds ratios */

proc freq data=test;
where round(x,.1) in (1,2);
format x 1.;
tables x*event / or nopercent nocol norow;
run;

proc freq data=test;
where round(x,.1) in (2,4);
format x 1.;
tables x*event / or nopercent nocol norow;
run;

PROC LOGISTIC output (excerpt):

           Odds Ratio Estimates

             Point          95% Wald
Effect    Estimate      Confidence Limits

L            2.335       2.310       2.360

PROC FREQ output (excerpts):

X         event

Frequency|       0|       1|  Total
---------+--------+--------+
       1 |  21706 |    698 |  22404
---------+--------+--------+
       2 |  20721 |   1557 |  22278
---------+--------+--------+
Total       42427     2255    44682

Statistic                        Value       95% Confidence Limits
------------------------------------------------------------------
Odds Ratio                      2.3367        2.1328        2.5600

Sample Size = 44682


X         event

Frequency|       0|       1|  Total
---------+--------+--------+
       2 |  20721 |   1557 |  22278
---------+--------+--------+
       4 |  18811 |   3285 |  22096
---------+--------+--------+
Total       39532     4842    44374

Statistic                        Value       95% Confidence Limits
------------------------------------------------------------------
Odds Ratio                      2.3241        2.1812        2.4763

Sample Size = 44374
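
If the predictor is already stored on the natural-log scale, the same odds ratio can presumably be obtained without creating L, by asking the UNITS statement for a change of log(2). The following is only a sketch under that assumption: it reuses the TEST data set from the example above, and the names test_ln and lnX are made up here for illustration.

```sas
/* Assumes the TEST data set created in the example above.    */
/* test_ln and lnX are hypothetical names for illustration.   */
data test_ln;
  set test;
  lnX=log(X);                      /* natural logarithm of X */
run;

proc logistic data=test_ln desc;
  model event=lnX / clodds=wald;   /* request Wald limits for the odds ratios */
  units lnX=0.693147;              /* log(2): this change in lnX doubles X */
run;
```

Because L = lnX/log(2), the slope for L equals the slope for lnX times log(2), so the odds ratio for a change of 0.693147 in lnX should agree with the odds ratio per unit of L up to the rounding of log(2).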

HansP · Posted 03-05-2019 07:20 AM

Thank you very much for your answer.

What about using this in a multivariable model? Should I transform all continuous variables using log(X)/log(2)?

In a univariate model = X compared to model = Log(X)/Log(2), the p-value also changes.

FreelanceReinh · Posted 03-05-2019 10:54 AM

If the model were linear in the logit for the log-transformed predictor, I would apply the transformation. If, however, the linearity assumption is better satisfied by the untransformed predictor, I wouldn't transform it. In that case the "OR for the risk of doubling the value of" this predictor would "naturally" depend on the starting value (doubling from 1 to 2 gives a different OR than doubling from 2 to 4, etc.).

@HansP wrote:
In a univariate model = X compared to model = Log(X)/Log(2), the p-value also changes.

Yes, I would expect this behavior because it's a nonlinear transformation. The question is: Is there a good reason for the logarithmic transformation, e.g., in terms of model assumptions (see above), goodness-of-fit, theoretical considerations?
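
One informal way to look at the goodness-of-fit part of that question is to fit both versions of the model and compare their fit statistics. The sketch below assumes the TEST data set from the earlier example (with variables X, L and event); it is meant as an illustration, not as a formal test of the linearity assumption. With untransformed X the odds ratio for doubling from x to 2x is exp(slope*x), so the UNITS statement is used to request the 1-unit and 2-unit odds ratios, which correspond to doubling from 1 to 2 and from 2 to 4.

```sas
/* Model 1: untransformed predictor.                           */
/* UNITS X=1 2 gives the ORs for a 1-unit and a 2-unit change, */
/* i.e. for doubling X from 1 to 2 and from 2 to 4.            */
proc logistic data=test desc;
  model event=X / clodds=wald;
  units X=1 2;
run;

/* Model 2: log2-transformed predictor.                        */
/* A single OR per unit of L applies to every doubling of X.   */
proc logistic data=test desc;
  model event=L / clodds=wald;
run;
```

Comparing the AIC and SC values in the "Model Fit Statistics" tables of the two runs gives a rough indication of which scale describes the data better; smaller values indicate the better fit. The p-values of the two models differ for the same reason: X and L are not linear functions of each other, so these are two different models.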
