RyanSimmons
Pyrite | Level 9

I ran a binomial GLMM in PROC NLMIXED, and calculated the odds ratios using ESTIMATE statements, as follows:

 

PROC NLMIXED data=AMA_transition method=gauss maxiter=1000 gconv=0 qpoints=100;
	/* Starting values for the fixed effects and the random-effect variance */
	parms B0 = -3.5730
	      B1 = 0.1174
	      B2 = -0.1248
	      B3 = 1.0668
	      B4 = 1.2854
	      B5 = 1.5513
	      B6 = -0.6355
	      B7 = 2.8348
	      sigma = 1;

	/* Linear predictor with a subject-level random intercept u */
	mu = B0 + u + B1*agecat1 + B2*agecat2 + B3*malaria + B4*last_malaria + B5*msp_pos + B6*last_msp_pos + B7*last_ama_pos;
	/* Inverse-logit link */
	p = exp(mu)/(1+exp(mu));

	model ama_pos ~ binary(p);
	random u ~ normal(0,sigma) subject=id_num;

	/* Odds ratios as exponentiated coefficients */
	estimate 'Age (1) OR:' exp(B1);
	estimate 'Age (2) OR:' exp(B2);
	estimate 'Malaria OR:' exp(B3);
	estimate 'Last Malaria OR:' exp(B4);
	estimate 'MSP Positive OR:' exp(B5);
	estimate 'Last MSP Positive OR:' exp(B6);
	estimate 'Last AMA Positive OR:' exp(B7);
run;

I've attached a screenshot of the output for the Parameter Estimates table and the "Additional Estimates" calculated by the ESTIMATE statements.

 

What is confusing me is the confidence intervals for the exponentiated parameter estimates in the "Additional Estimates" table. For parameters B1, B2, B5, B6, and B7, the estimates in the "Additional Estimates" table are close to what I get if I manually exponentiate the figures given in the Parameter Estimates table. For parameters B3 and B4, however, although the point estimate is precisely equal to the exponentiated point estimate of the parameter, the confidence intervals are not! If I manually exponentiate the confidence limits for B3 and B4, I get (1.5031, 8.0736) and (1.6555, 9.9016), respectively. But the confidence intervals calculated by the ESTIMATE statements are (0.5555, 6.4116) and (0.4278, 7.6696)! It makes no sense for these exponentiated lower limits to be less than 1, given that the lower limits for B3 and B4 themselves are positive.

 

Why are the upper and lower limits being miscalculated for these parameters specifically but not the others? 

 

(All of the above is using SAS 9.4, maintenance release 3)


estimates.png
ACCEPTED SOLUTION
lvm
Rhodochrosite | Level 12

Good question. Although the results are different for some parameter estimates, in fact both answers are "correct"; a different approach is being taken in each case. Although you did not say it explicitly, it appears that you are exponentiating the confidence limits (CLs) for the estimated B1, B2, ..., B7: that is, you get the CLs for B1, etc., and then apply the exponential function to each limit. This is often a good practice, and it is what GLIMMIX does with the EXP option (on the ESTIMATE and LSMESTIMATE statements). It is not what NLMIXED is doing. NLMIXED uses large-sample theory to estimate the SE of exp(B), and then simply calculates the CLs as estimate +/- t*SE using that SE estimate. It does not exponentiate the CLs of B1, etc. Both methods are reasonable, although the NLMIXED approach is justified mainly by large-sample theory; its small-sample properties are not well established.
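A side note: if you want the "exponentiate the limits" version without hand calculation, one option is to capture the Parameter Estimates table with ODS OUTPUT and transform it in a DATA step. This is only a sketch built around the model in the original post. ParameterEstimates, Parameter, Estimate, Lower, and Upper are the standard NLMIXED ODS table and column names; the dataset names pe and or_ci and the variables or_est, or_lower, or_upper are made-up names for this example.

ods output ParameterEstimates=pe;   /* capture the Parameter Estimates table */

PROC NLMIXED data=AMA_transition method=gauss maxiter=1000 gconv=0 qpoints=100;
	parms B0 = -3.5730  B1 = 0.1174  B2 = -0.1248  B3 = 1.0668
	      B4 = 1.2854   B5 = 1.5513  B6 = -0.6355  B7 = 2.8348
	      sigma = 1;
	mu = B0 + u + B1*agecat1 + B2*agecat2 + B3*malaria + B4*last_malaria
	   + B5*msp_pos + B6*last_msp_pos + B7*last_ama_pos;
	p = exp(mu)/(1+exp(mu));
	model ama_pos ~ binary(p);
	random u ~ normal(0,sigma) subject=id_num;
run;

/* Odds ratios by exponentiating the estimate and its Wald limits */
data or_ci;
	set pe;
	where Parameter in ('B1','B2','B3','B4','B5','B6','B7');
	or_est   = exp(Estimate);
	or_lower = exp(Lower);
	or_upper = exp(Upper);
run;

proc print data=or_ci noobs;
	var Parameter or_est or_lower or_upper;
run;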

 

The two approaches can give very similar results or fairly different ones. This depends on the actual point estimates, the SEs of the estimates, and the sampling distribution of the estimated B's (the latter not really known for small samples). In general, the smaller SE(B) is, the more similar the two approaches will be.

 

Here is why. Note that the variance of a function of a random variable (such as an estimated parameter) is approximately

              var(f(x)) = [f'(x)]^2 * var(x),

where x is the parameter estimate, f(x) = exp(x) in your case, and f'(x) is the first derivative. SE(f(x)) is then just the square root of this: SE(f(x)) = |f'(x)|*SE(x). In the special case f(x) = exp(x), note that f'(x) = f(x) = exp(x), which is positive. Thus,

             SE(exp(x)) = exp(x)*SE(x).

This is known as the delta method; it is an approximation derived from a first-order Taylor series expansion.

 

Consider B2. SE(exp(B2)) = exp(B2)*SE(B2) = 0.8777 * 0.05123 = 0.04496. This is displayed in the Additional Estimates table and used to get the CLs (0.8777 +/- t*0.04496). That is close to the limits you get by exponentiating the limits in the Parameter Estimates table. Now consider B3. SE(exp(B3)) = exp(B3)*SE(B3) = 3.4834*0.4281 = 1.4912. This is displayed in your Additional Estimates table. In this case, 3.4834 +/- t*1.4912 (displayed as the Lower and Upper limits) is not close to your other method.
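To make the arithmetic concrete, here is a small DATA step that computes both interval styles for B3. The inputs exp(B3) = 3.4834 and SE(B3) = 0.4281 are the values quoted above; the critical value 1.96 is an assumed stand-in, since NLMIXED actually uses a t quantile based on the model's degrees of freedom, so the results will only approximately match the displayed limits.

data delta_check;
	est = 3.4834;   /* exp(B3), from the Additional Estimates table */
	se  = 0.4281;   /* SE(B3), from the Parameter Estimates table   */
	t   = 1.96;     /* assumed critical value (NLMIXED uses a t quantile) */

	/* Delta-method interval on the OR scale: est +/- t*(est*se) */
	se_delta = est * se;
	lo_delta = est - t*se_delta;
	hi_delta = est + t*se_delta;

	/* Exponentiated limits of the interval for B3 itself */
	lo_exp = exp(log(est) - t*se);
	hi_exp = exp(log(est) + t*se);

	put se_delta= lo_delta= hi_delta=;
	put lo_exp= hi_exp=;
run;

The first pair comes out near NLMIXED's (0.5555, 6.4116) and the second near the hand-exponentiated (1.5031, 8.0736); the small discrepancies are due to the 1.96 stand-in.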

 

As I said, both approaches can be justified.
