QUANTREG estimated output


08-23-2015 07:01 PM

Could anybody kindly offer some advice regarding the following issue I'm having with the Quantreg procedure?

Basically, the estimated response (using the PREDICTED keyword in the OUTPUT statement) is giving me one set of estimates. But manually calculating the estimated response using the regression parameters given by the same Quantreg procedure gives a different set of estimates. I would have expected the two to be exactly the same. They are close, but still significantly different. Is there a reason they should be different?

Long story below...

I have a large collection of birthweight data and am trying to establish centile growth curves for these data using Quantreg (i.e., estimate the weight of the infant at different gestational ages). I am fitting the centile curves to a 4th order polynomial with gestational age (**gacorr**) being the independent variable, and birthweight (**bweight**) the dependent variable.

Here is a sample of the raw data:

Obs | bweight | bsex | labouronset | gacorr |
---|---|---|---|---|
1 | 160 | M | spontaneous | 17.3593 |
2 | 230 | M | spontaneous | 18.3720 |
3 | 340 | M | spontaneous | 19.2857 |
4 | 270 | M | spontaneous | 19.2857 |
5 | 360 | M | spontaneous | 19.5714 |

Here is the quantreg procedure that I'm using to create a 10th percentile growth curve:

    proc quantreg data = AllMaleSP
          algorithm = interior (kappa = 0.9) ci = resampling plots (maxpoints = none);
       where gacorr >= 26 and gacorr <= 42;
       model bweight = gacorr gacorr*gacorr gacorr*gacorr*gacorr gacorr*gacorr*gacorr*gacorr
             / quantile = 0.1;
       output out=AllMaleSPPred predicted = pred;
    run;

Here are the regression parameters estimated by quantreg for the 0.1 centile:

Parameter | Estimate |
---|---|
Intercept | -67485.3 |
gacorr | 8458.845 |
gacorr*gacorr | -397.300 |
gacorr*gacorr*gacorr | 8.3216 |
gacorr*gacorr*gacorr*gacorr | -0.0644 |

And here is a sample of the output of quantreg, including the estimated response (**pred**):

Obs | bweight | bsex | labouronset | gacorr | pred | QUANTILE |
---|---|---|---|---|---|---|
1 | 987 | M | spontaneous | 26 | 723.147 | 0.1 |
2 | 746 | M | spontaneous | 26 | 723.147 | 0.1 |
3 | 995 | M | spontaneous | 26 | 723.147 | 0.1 |
4 | 840 | M | spontaneous | 26 | 723.147 | 0.1 |
5 | 760 | M | spontaneous | 26 | 723.147 | 0.1 |

So the PREDICTED output of the quantreg procedure estimates a 10th centile birthweight of **723g** at 26 weeks.

But if I use the actual parameter estimates, and plug 26 weeks into the regression formula -67485.3 + 8458.845*gacorr -397.300*gacorr^2 +8.3216*gacorr^3 -0.0644*gacorr^4, I get **701g**.

I’ve checked for other gestations, as well as other subsets of the data and other centiles, and get similar discrepancies. I’m reluctant to use the regression formula without knowing the reason it gives a different estimate to the PREDICTED output.

Any ideas? I would greatly appreciate any guidance. Thank you.

Accepted Solutions

Solution


Posted in reply to farmeister

08-24-2015 08:58 AM

The parameter estimates that appear in the ODS table are formatted, so you are seeing rounded values. So, for example, the coefficient of the quartic term could be anywhere between -0.06435 and -0.064449 and be formatted as -0.0644. Because the data are not centered, small changes in the coefficient of 26**4 will make a big difference in the predicted values:

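    data a;
       gacorr = 26;
       /* one end of the range that displays as -0.0644 */
       a4 = -0.06435;
       pred = -67485.3 + 8458.845*gacorr - 397.300*gacorr**2 + 8.3216*gacorr**3 + a4*gacorr**4;
       output;
       /* a value near the other end of that range */
       a4 = -0.064449;
       pred = -67485.3 + 8458.845*gacorr - 397.300*gacorr**2 + 8.3216*gacorr**3 + a4*gacorr**4;
       output;
    run;

    /* print both rows to compare the two predictions */
    proc print;
    run;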
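
As a rough hand check of the size of this effect (a worked sketch that uses only the rounded coefficients quoted in the question and treats the other four coefficients as exact, which they are not), the rounding error in the quartic coefficient alone can move the prediction at 26 weeks by about 23 g:

$$
26^4 = 456976, \qquad |\Delta a_4| \le 0.00005 \;\Rightarrow\; |\Delta\,\mathrm{pred}| \le 0.00005 \times 456976 \approx 22.8\ \mathrm{g}
$$

$$
a_4 = -0.0644:\ \mathrm{pred} \approx 701.1\ \mathrm{g}, \qquad a_4 = -0.06435:\ \mathrm{pred} \approx 723.9\ \mathrm{g}
$$

The reported value of 723.147 g lies inside that band, so rounding of the displayed estimates is enough to account for the gap between 701 g and 723 g.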

To get non-formatted estimates, use ODS OUTPUT to create a SAS data set from the ParameterEstimates table. If you use the values in the data set, the predicted values should agree with the results of the procedure.
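
A minimal sketch of that suggestion, reusing the data set and model from the question. The data set name `PE` is a hypothetical choice, and the sketch assumes the ParameterEstimates table stores the coefficients in a column named `Estimate`; check the contents of `PE` if your output differs.

```sas
/* Capture the ParameterEstimates table as a data set (PE is a hypothetical name) */
ods output ParameterEstimates = PE;

proc quantreg data = AllMaleSP
      algorithm = interior (kappa = 0.9) ci = resampling;
   where gacorr >= 26 and gacorr <= 42;
   model bweight = gacorr gacorr*gacorr gacorr*gacorr*gacorr gacorr*gacorr*gacorr*gacorr
         / quantile = 0.1;
run;

/* Print the stored values with a wide format instead of the table's display format */
/* Assumption: the estimate column is named Estimate */
proc print data = PE;
   format Estimate best16.;
run;
```

The values printed this way keep the digits that the displayed table rounds away, so a formula built from them should track the PREDICTED output much more closely.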

All Replies


Posted in reply to Rick_SAS

08-24-2015 06:40 PM

Thank you!

I've taken the parameter estimates directly from OUTEST and they match almost perfectly now.


Posted in reply to farmeister

08-28-2015 07:28 AM

Glad you were able to do so, but I have a different question. Why fit a quartic polynomial? Unless you have a good biological reason, wouldn't some other model, perhaps using the EFFECT statement to fit a spline, have resulted in superior performance? I've been digging through my mathematical biology references and I don't see much evidence for any biological processes that give rise to a fourth order response.

Steve Denham
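
For illustration only, the spline alternative mentioned above might look something like the following. This is a sketch, not a recommendation from the thread: the effect name `spl`, the output names, and the choice of a natural cubic basis with 5 percentile-placed knots are all assumptions made for the example.

```sas
proc quantreg data = AllMaleSP algorithm = interior (kappa = 0.9);
   where gacorr >= 26 and gacorr <= 42;
   /* spl is a hypothetical effect name; the knot count and basis are illustrative choices */
   effect spl = spline(gacorr / naturalcubic basis = tpf(noint) knotmethod = percentiles(5));
   model bweight = spl / quantile = 0.1;
   /* AllMaleSPSpline and pred_spline are hypothetical names */
   output out = AllMaleSPSpline predicted = pred_spline;
run;
```

With a spline effect the fitted curve is most easily read from the OUTPUT data set rather than rebuilt by hand from the coefficients.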