BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
Haris
Lapis Lazuli | Level 10

I am trying to program correct marginal Predicted Probabilities (PP) from a multi-level logistic regression output.  Pavlou (2015)  has a formula attributed to Skrondal  (2009) where row-level PPs need to be integrated over the distribution of random effects:

Haris_0-1694700282220.png

Expression in the square brackets -- except for 'u' -- is marginal linear predictor.  Random effect (RI) gets added to that expression and the whole thing gets translated into PP.  That entity, then, is integrated over the distribution of random effects with the mean of zero and standard deviation being the variance of the random effecs from the model inquestion.

I am a novice IML programmer and, for the life of me, can't figure out how to program this in SAS.  My variances of RIs range from 0.1 to 0.5 and I don't  even know where to put that in the QUAD() funcion.  The best shot is below.  Any advice is appreciated. 

 

Would averaging PP over all sample RIs be close enough to the integration approach?  I am working with anywhere from 150 to over 1,000 random units.

 

proc IML;
	start IntRI(RI);
	  PP = 0.1;
	  lgt = log(PP/(1-PP));
	  return( (1 + exp(-(lgt+RI)))**-1 );
	finish;
 
	call quad(iPP, "IntRI", {.M,.P});
	print iPP;
quit;
1 ACCEPTED SOLUTION

Accepted Solutions
Rick_SAS
SAS Super FREQ

I don't understand the formula in terms of the random effects model. It looks like it is for a single random effect, but you keep referring to random effects (plural).

 

However, I can offer assistance with evaluating the integral. Everything except the u and f(u) is constant, so specify the PDF of u in the integrand. For example, if f is the normal density function with mean 0 and the standard deviation of the random effect is sigma=0.73, then you would use the following code to evaluate the integral at a specific point in the space of the explanatory variables:

 

proc IML;
beta = {2.4, -2, 3};  /* coeffs of logistic regression (Intercept, beta_x, beta_y)*/
X =    {1  0.5 0.7};  /* evaluate model at (x,y)=(0.5 0.7) [add intercept to design] */
eta = X*beta;         /* linear predictor at (x,y) */

start Integrand(u) global(eta);
  L = logistic(u + eta);
  fu = pdf("Normal", u, 0, 0.73);  /* specify f(u) here. eg, PDF for N(0,0.73) */
  return( L * fu );
finish;

call quad(iPP, "Integrand", {.M,.P});
print iPP;

 

 

View solution in original post

11 REPLIES 11
Rick_SAS
SAS Super FREQ

I don't understand the formula in terms of the random effects model. It looks like it is for a single random effect, but you keep referring to random effects (plural).

 

However, I can offer assistance with evaluating the integral. Everything except the u and f(u) is constant, so specify the PDF of u in the integrand. For example, if f is the normal density function with mean 0 and the standard deviation of the random effect is sigma=0.73, then you would use the following code to evaluate the integral at a specific point in the space of the explanatory variables:

 

proc IML;
beta = {2.4, -2, 3};  /* coeffs of logistic regression (Intercept, beta_x, beta_y)*/
X =    {1  0.5 0.7};  /* evaluate model at (x,y)=(0.5 0.7) [add intercept to design] */
eta = X*beta;         /* linear predictor at (x,y) */

start Integrand(u) global(eta);
  L = logistic(u + eta);
  fu = pdf("Normal", u, 0, 0.73);  /* specify f(u) here. eg, PDF for N(0,0.73) */
  return( L * fu );
finish;

call quad(iPP, "Integrand", {.M,.P});
print iPP;

 

 

Haris
Lapis Lazuli | Level 10

Thanks so much, Rick.

 

Extra background detail.  There is a problem with marginal observation-level Predicted Probabilities from logistic regression when random effects are involved.  Straight up linear predictor expressed as probability is systematically different from the 'correct' value.  See this for more detail (https://bmcmedresmethodol.biomedcentral.com/articles/10.1186/s12874-015-0046-6).

 

The correct way to calculated marginal probabilities according to the reference and the formula I pasted, is to integrate Predicted Probability for each row over the distribution of all random intervals in the model.  I wish SAS had this correction built in into all PROCs that do multi-level regression.  I requested that feature.  In the absence of it, my starting point is a marginal NOBLUP Predicted Probability value for each row in the model from GLIMMIX, for example.  From there, I back-transform it to linear predictor so that the LP could be integrated over the distribution of random effects or Random Intervals (the u term).  For each row in the model (patients), the Predicted Probability needs to be corrected using the distribution of random intercepts (hospitals).

Normal integral QUAD() gives me AUC for the transformation without any density weights.  What you shared seems to do the trick.  I tested a couple of values and it seems to correctly increase the input for values below 0.5 and decreases them for values above .5.  I modified the code slightly and now just need to figure out how to apply it to an entire column of Predicted Probability values.  I think I can simplify this a little bit by outputing Linear Predictors from the model rather than Predicted Probabilities -- skip the `iLink` option.

proc IML;
PredProb = .73;              /* Marginal Predicted Probability from RE GLMM */
Eta 	 = log(PP / (1-PP)); /* Eta = linear predictor                      */

start Integrand(u) global(eta);
  L  = logistic(u + eta);
  fu = pdf("Normal", u, 0, 0.2);  /* specify f(u) here. eg, PDF for N(0,0.20) */
  return( L * fu );
finish;

call quad(iPP, "Integrand", {.M,.P});
print PP Eta iPP;
quit;

Thanks again.  Happy to discuss further.  Hope you can put in a good word with SAS development teams to add this correction to the SAS/STAT core PROCS.   Would love to hear if you disagree with the diagnosis for marginal PPs.

Haris
Lapis Lazuli | Level 10

Oh, and one more thing, I think the notation in the formula is what's throwing you off.  "RE" subscripts for alpha and beta signify that it's a Random Effects model rather than that alpha and beta are random effects coefficients.  Alpha and Beta are fixed.  U is the random intercept.

 

Haris
Lapis Lazuli | Level 10

And I am a total IML idiot.  Why is this code not working?

proc iml;
	use SAShelp.class;
	read all var {Age};

	PP	= Age/100;
	Eta	= log(PP / (1-PP));

	start Integrand(u) global(eta);
	  L  = logistic(u + eta);
	  fu = pdf("Normal", u, 0, 0.2);
	  return( L * fu );
	finish;

	call quad(iPP, "Integrand", {.M,.P}) scale=.0001;
	print iPP;

Haris_0-1694714117977.png

Rick_SAS
SAS Super FREQ

It's okay to have questions. Especially when the software or technique is new to you. We were all beginners once.

 

In your example, Age is a vector. Therefore, ETA is a vector. The integrand requires a scalar value. If you want to evaluate the integral for each student's age, you can do the following:


proc iml;
	use SAShelp.class;
	read all var {Age};

	PP	= Age/100;
	EtaVec	= log(PP / (1-PP));

	start Integrand(u) global(eta);
	  L  = logistic(u + eta);
	  fu = pdf("Normal", u, 0, 0.2);
	  return( L # fu );
	finish;

   Integral = j(nrow(EtaVec), 1);
   do i = 1 to nrow(EtaVec);
      Eta = EtaVec[i,];   /* get i_th row */
   	call quad(iPP, "Integrand", {.M,.P});
      Integral[i] = iPP;
	end;
   print Age EtaVec Integral;

Haris
Lapis Lazuli | Level 10

You are too kind.  Much obliged!  Thanks! 

 

It's a challenge to transition from decades of just columns, rows and cells to vectors and arrays.

Rick_SAS
SAS Super FREQ

Yes, it certainly is. It can be frustrating. It takes time, patience, and hard work. But hopefully we can help help by answering your questions.

If you decide that you want to learn more about programming in the SAS IML language, here are some helpful general tips: Tips for learning the SAS/IML language - The DO Loop

 

Haris
Lapis Lazuli | Level 10

Thanks.  Even better: I bought YOUR book 🙂

Season
Pyrite | Level 9

Hello, I am interested in this field. But the URL you pasted seems not working and does not link to any material. Could you paste the URL again or report the details (title, author, etc.) of the paper? Thank you!

Haris
Lapis Lazuli | Level 10
Closing parenthesis got included in the link. Remove it and the link works. Just checked.

<>
[12874.jpg]
A note on obtaining correct marginal predictions from a random intercepts model for binary outcomes - BMC Medical Research Methodology<>
bmcmedresmethodol.biomedcentral.com<>

Season
Pyrite | Level 9
Thank you very much!

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

Multiple Linear Regression in SAS

Learn how to run multiple linear regression models with and without interactions, presented by SAS user Alex Chaplin.

Find more tutorials on the SAS Users YouTube channel.

From The DO Loop
Want more? Visit our blog for more articles like these.
Discussion stats
  • 11 replies
  • 2165 views
  • 9 likes
  • 3 in conversation