Hello,
I am trying to extrapolate a regression line for a stability test-retesting. I need to show at what point the 95% confidence interval, when extended, will hit the acceptance criteria. I have data up to 18 months and want to extrapolate the line to 24 months. The guidelines require that the 95% confidence limit be extended to see where the limit hits the acceptance criteria. If I include the extra timepoints (i.e. 21 and 24 months) in the input data with missing results, PROC REG gives me the predicted values for those timepoints as well. It also gives me confidence limits for those timepoints. So I actually have my answer. My question is: how are these confidence intervals calculated?
Example code looks as follows:
data a;
input time result;
cards;
0 13
3 23
6 26
9 32
12 30
15 33
18 34
21 .
24 .
;
run;
proc reg data=a;
model result=time;
output out=reg p=pred uclm=upper lclm=lower;
run;
quit;
Accepted Solutions
Others have already raised questions about whether PROC REG is the appropriate tool, but the answer to your question is found in the SAS documentation for predicted values. See the equations for LowerM and UpperM.
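As a rough illustration of what those equations amount to for a one-predictor model, here is a minimal Python sketch (not SAS code) that recomputes the limits by hand from the example data in the question. It assumes the standard simple-linear-regression form of the confidence limit for the mean, pred ± t × s × sqrt(1/n + (x0 − x̄)²/Sxx), and a two-sided 95% level; the t quantile is hardcoded from a t table, and all variable names are illustrative.

```python
import math

# Example data from the question (months, result); the rows with missing
# results (21 and 24 months) do not contribute to the fit.
time = [0, 3, 6, 9, 12, 15, 18]
result = [13, 23, 26, 32, 30, 33, 34]

# Two-sided 95% Student t quantile for n - 2 = 5 degrees of freedom,
# taken from a t table (assumption: alpha = 0.05).
T_CRIT = 2.570582

n = len(time)
x_bar = sum(time) / n
y_bar = sum(result) / n
sxx = sum((x - x_bar) ** 2 for x in time)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(time, result))

# Least-squares estimates
slope = sxy / sxx
intercept = y_bar - slope * x_bar

# Root mean squared error with n - 2 degrees of freedom
sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(time, result))
s = math.sqrt(sse / (n - 2))


def mean_ci(x0):
    """Predicted value and 95% confidence limits for the mean at time x0."""
    pred = intercept + slope * x0
    se_mean = s * math.sqrt(1 / n + (x0 - x_bar) ** 2 / sxx)
    return pred, pred - T_CRIT * se_mean, pred + T_CRIT * se_mean


for x0 in (21, 24):
    pred, lower, upper = mean_ci(x0)
    print(f"{x0:>2}  pred={pred:.4f}  lower={lower:.4f}  upper={upper:.4f}")
```

The extrapolated timepoints are handled exactly like the observed ones: the same fitted line and the same formula are used, and the interval widens only because (x0 − x̄)² grows as x0 moves away from the mean of the observed times.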
Is this an academic question, or are you trying to implement this in your corporation?
Linear regression is not appropriate for time series data, because the assumption of independence is violated.
https://documentation.sas.com/doc/en/pgmsascdc/9.4_3.3/statug/statug_reg_details33.htm
Prediction intervals for future observations are a different calculation from confidence intervals for the mean at known data points.
https://stats.stackexchange.com/questions/16493/difference-between-confidence-intervals-and-predicti...
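For a simple linear regression the two intervals can be written side by side. This is the standard textbook form, given here as a sketch; the symbols (n observations, mean time x̄, S_xx, root mean squared error s) are notation introduced for illustration. The first line corresponds to the mean limits (the LCLM= and UCLM= keywords of the OUTPUT statement), the second to the limits for an individual prediction (LCL= and UCL=).

```latex
% Confidence limits for the mean response at x_0
\hat{y}_0 \;\pm\; t_{1-\alpha/2,\,n-2}\; s\,\sqrt{\frac{1}{n} + \frac{(x_0-\bar{x})^2}{S_{xx}}}

% Prediction limits for a single new observation at x_0
\hat{y}_0 \;\pm\; t_{1-\alpha/2,\,n-2}\; s\,\sqrt{1 + \frac{1}{n} + \frac{(x_0-\bar{x})^2}{S_{xx}}},
\qquad S_{xx} = \sum_{i=1}^{n} (x_i-\bar{x})^2
```

The only difference is the extra 1 under the square root, which accounts for the variability of a single new measurement around the mean.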
EDIT:
The technique you're using is outlined here:
Other options for scoring:
https://blogs.sas.com/content/iml/2014/02/19/scoring-a-regression-model-in-sas.html
Thank you for your response. I will check out the references you sent. In the meanwhile I wanted to answer your question.
I am analyzing stability data and following the FDA guideline Q1E, Evaluation of Stability Data.
(The data I sent was example data to illustrate the question.)
Thanks.
Drug stability over time? Doesn't that usually require survival analysis?
It's been a while since I've done a clinical trial though (a decade).
The guidelines say specifically that "Regression analysis is considered an appropriate approach to evaluating the stability data for a quantitative attribute and establishing a retest period or shelf life." They also say "An appropriate approach to retest period or shelf life estimation is to analyze a quantitative attribute (e.g., assay, degradation products) by determining the earliest time at which the 95 percent confidence limit for the mean intersects the proposed acceptance criterion."
The figure in the guidelines shows data collected up to 12 months and then an extrapolation showing that the degradation would be within the acceptance range at 24 months (the guidelines have rules for how far you may extrapolate, and the estimate must later be updated with longer-term data, but this is acceptable).
So I think that I am ok with using the regression procedure.
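The quoted rule, finding the earliest time at which the confidence limit for the mean meets the acceptance criterion, could be sketched in Python as follows (an illustration, not SAS code). Everything specific here is an assumption: the acceptance limit of 50 is hypothetical and does not come from the thread, the upper limit is used because the example results increase over time, the two-sided 95% t quantile is hardcoded from a t table, and the search is restricted to 18 through 24 months.

```python
import math

# Example data from the question (months, result)
time = [0, 3, 6, 9, 12, 15, 18]
result = [13, 23, 26, 32, 30, 33, 34]

T_CRIT = 2.570582          # two-sided 95% t quantile, 5 df (from a t table)
ACCEPTANCE_LIMIT = 50.0    # hypothetical upper acceptance criterion

n = len(time)
x_bar = sum(time) / n
y_bar = sum(result) / n
sxx = sum((x - x_bar) ** 2 for x in time)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(time, result))
slope = sxy / sxx
intercept = y_bar - slope * x_bar
sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(time, result))
s = math.sqrt(sse / (n - 2))


def upper_limit(x0):
    """Upper 95% confidence limit for the mean at time x0."""
    se_mean = s * math.sqrt(1 / n + (x0 - x_bar) ** 2 / sxx)
    return intercept + slope * x0 + T_CRIT * se_mean


def crossing_time(lo, hi, tol=1e-6):
    """Bisection for the time where upper_limit reaches ACCEPTANCE_LIMIT.

    Requires upper_limit(lo) < ACCEPTANCE_LIMIT <= upper_limit(hi); with a
    positive slope the upper limit is increasing for times beyond x_bar,
    so the crossing inside [lo, hi] is unique.
    """
    if not (upper_limit(lo) < ACCEPTANCE_LIMIT <= upper_limit(hi)):
        raise ValueError("acceptance limit is not bracketed by [lo, hi]")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if upper_limit(mid) < ACCEPTANCE_LIMIT:
            lo = mid
        else:
            hi = mid
    return hi


t_cross = crossing_time(18.0, 24.0)
print(f"upper limit reaches {ACCEPTANCE_LIMIT} at about {t_cross:.2f} months")
```

For an attribute that decreases over time, the same search would be run on the lower limit against a lower acceptance criterion.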
The output dataset that is generated using the sample data above includes the missing points. This is what it generates:
time | result | Predicted value of result | Lower bound of 95% C.I. for mean | Upper bound of 95% C.I. for mean |
0 | 13 | 17.9643 | 11.841 | 24.0876 |
3 | 23 | 21.0714 | 16.2679 | 25.8749 |
6 | 26 | 24.1786 | 20.3811 | 27.9761 |
9 | 32 | 27.2857 | 23.8891 | 30.6823 |
12 | 30 | 30.3929 | 26.5954 | 34.1904 |
15 | 33 | 33.5 | 28.6965 | 38.3035 |
18 | 34 | 36.6071 | 30.4838 | 42.7304 |
21 | . | 39.7143 | 32.1193 | 47.3093 |
24 | . | 42.8214 | 33.6758 | 51.967 |
What I want to understand is how the 95% confidence interval in the proc reg procedure is calculated for the extended period - in my example above, that is for months 21 and 24.
(I am still working through your links).
Thanks for your help.
I suggest posting your question in the Forecasting forum.
Check PROC ARIMA, PROC UCM, PROC ESM, PROC FORECAST, and so on.
It helps to have some experience with regulators on this. Linear extrapolation of stability data has been done for years. While that may not make it "right", there is an impressive track record. The problem with shifting to any of the time series methods is that the number of time points on these studies is often too small to be able to estimate the parameters, the time points are not equally spaced, and the measurements are not on the same sample (i.e., several samples are taken at the initiation of the stability testing period, and then destructively analyzed at pre-determined time points). Consequently, a lot of time series methods aren't robust enough to deal with this.
As far as survival analysis, I think you would need multiple samples to analyze at each pre-determined time point to be able to estimate the hazard ratio (failure rate). Unless the synthesis has been scaled up to a near production level, there may not be sufficient test article to carry out a decent survival analysis.
So for once, I can understand the use of a not-quite-right method.
SteveDenham