Statistical Procedures

nsns · Posted 10-26-2021 02:51 PM

Hello,

I am trying to extrapolate a regression line for a stability (retest) analysis. I need to show at what point the 95% confidence limit, when extended, will hit the acceptance criterion. I have data up to 18 months and want to extrapolate the line to 24 months. The guidelines require that the 95% confidence limit be extended to see where it hits the acceptance criterion. If I include the extra time points (i.e., 21 and 24 months) in the input data with missing results, PROC REG gives me the predicted values for those time points as well. It also gives me confidence limits for those time points. So I actually have my answer. My question is: how are these confidence intervals calculated?

Example code looks as follows:

data a;
input time result;
cards;
0 13
3 23
6 26
9 32
12 30
15 33
18 34
21 .
24 .
;
run;

proc reg data=a;
model result=time;
output out=reg p=pred uclm=upper lclm=lower;
run;
quit;

Rick_SAS · Posted 10-27-2021 10:02 AM

Others have already raised questions about whether PROC REG is the appropriate tool, but the answer to your question is found in the SAS documentation for predicted values. See the equations for LowerM and UpperM.
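One way to see what those equations do is to ask PROC REG for the standard error of the mean prediction and rebuild the limits by hand. The following is a minimal sketch, assuming data set A from the first post is in the WORK library; the data set name chk and the variables se_mean, t_crit, lower_by_hand and upper_by_hand are names made up for this illustration. The error degrees of freedom are hard-coded as 5 because the example has 7 observed points and 2 estimated parameters.

```sas
/* Assumes data set A from the original post already exists */
proc reg data=a noprint;
   model result = time;
   /* STDP= requests the standard error of the mean predicted value */
   output out=chk p=pred stdp=se_mean lclm=lower uclm=upper;
run;
quit;

data chk;
   set chk;
   df = 5;                       /* 7 observed points minus 2 parameters */
   t_crit = tinv(0.975, df);     /* two-sided 95% critical value */
   lower_by_hand = pred - t_crit * se_mean;
   upper_by_hand = pred + t_crit * se_mean;
run;

proc print data=chk noobs;
   var time pred se_mean lower lower_by_hand upper upper_by_hand;
run;
```

If the hand-built columns agree with LCLM and UCLM on every row, including months 21 and 24, that suggests the extrapolated limits use the same formula as the in-range ones, just evaluated at time points farther from the mean of the observed times.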


Reeza · Posted 10-26-2021 03:04 PM

Is this an academic question, or are you trying to implement this at your company?

Linear regression is not appropriate for time series data because the assumption of independent errors is violated.
https://documentation.sas.com/doc/en/pgmsascdc/9.4_3.3/statug/statug_reg_details33.htm

Prediction intervals for future observations are a different calculation from confidence intervals for the mean at known data points.
https://stats.stackexchange.com/questions/16493/difference-between-confidence-intervals-and-predicti...
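To make the distinction concrete, PROC REG can write both kinds of limits to the same output data set. This is a small sketch, assuming data set A from the first post; the data set name both and the four limit variable names are chosen here for illustration only. LCLM/UCLM are limits for the mean response, while LCL/UCL are limits for a single new observation, whose standard error has an extra "1 +" term under the square root and is therefore always wider.

```sas
/* Assumes data set A from the original post already exists */
proc reg data=a noprint;
   model result = time;
   output out=both p=pred
          lclm=mean_lower uclm=mean_upper   /* confidence limits for the mean */
          lcl=pred_lower  ucl=pred_upper;   /* prediction limits for one new value */
run;
quit;

proc print data=both noobs;
run;
```

Both pairs of limits are produced for the rows with a missing response as well, so the comparison also covers months 21 and 24.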

EDIT:

The technique you're using is outlined here:

https://blogs.sas.com/content/iml/2014/02/17/the-missing-value-trick-for-scoring-a-regression-model....

Other options for scoring:

https://blogs.sas.com/content/iml/2014/02/19/scoring-a-regression-model-in-sas.html

nsns · Posted 10-26-2021 03:23 PM

Dear Reeza,
Thank you for your response. I will check out the references you sent. In the meantime, I wanted to answer your question.
I am analyzing stability data and following the FDA guidelines, Q1E Evaluation of Stability Data.
(The data I sent was example data to illustrate the question.)
Thanks.

Reeza · Posted 10-26-2021 03:38 PM

Drug stability over time? Doesn't that usually require survival analysis?

It's been a while since I've done a clinical trial though (a decade).

nsns · Posted 10-27-2021 03:58 AM

The guidelines say specifically that "Regression analysis is considered an appropriate approach to evaluating the stability data for a quantitative attribute and establishing a retest period or shelf life." They also say "An appropriate approach to retest period or shelf life estimation is to analyze a quantitative attribute (e.g., assay, degradation products) by determining the earliest time at which the 95 percent confidence limit for the mean intersects the proposed acceptance criterion".
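The "earliest time" in that quote can be approximated numerically by scoring a fine grid of time points with the same missing-value approach and picking the first one where the limit reaches the criterion. The sketch below is only an illustration under stated assumptions: the acceptance criterion of 45 is invented (the thread does not give one), the example attribute increases over time so the upper limit is the one checked, and the names limit, grid, a_grid, scored and crossing_time are hypothetical.

```sas
%let limit = 45;   /* hypothetical upper acceptance criterion, for illustration only */

/* Fine grid of unobserved time points: 0 to 36 months in steps of 0.1 */
data grid;
   do i = 0 to 360;
      time = i / 10;
      result = .;      /* missing response: scored but not used in the fit */
      output;
   end;
   drop i;
run;

/* Observed rows from data set A followed by the grid rows */
data a_grid;
   set a(where=(result ne .)) grid;
run;

proc reg data=a_grid noprint;
   model result = time;
   output out=scored p=pred uclm=upper;
run;
quit;

/* Earliest grid time at which the upper 95% limit for the mean reaches the criterion */
proc sql;
   select min(time) as crossing_time
   from scored
   where result is missing and upper >= &limit;
quit;
```

With the example data and this made-up criterion, the crossing should fall between months 18 and 21, since the upper limit in the output table in this post is about 42.7 at month 18 and 47.3 at month 21. Whether a one-sided or a two-sided limit applies should be checked against the guideline; the ALPHA= option on the MODEL statement changes the confidence level that PROC REG uses.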

The figure in the guidelines shows data collected up to 12 months and then an extrapolation showing that the degradation would stay within the acceptance range at 24 months (the guidelines limit how far you may extrapolate and require an update with longer-term data, but extrapolation itself is acceptable).

So I think that I am ok with using the regression procedure.

The output dataset that is generated using the sample data above includes the missing points. This is what it generates:

time	result	Predicted Value of result	Lower Bound of 95% C.I. for Mean	Upper Bound of 95% C.I. for Mean
0	13	17.9643	11.841	24.0876
3	23	21.0714	16.2679	25.8749
6	26	24.1786	20.3811	27.9761
9	32	27.2857	23.8891	30.6823
12	30	30.3929	26.5954	34.1904
15	33	33.5	28.6965	38.3035
18	34	36.6071	30.4838	42.7304
21	.	39.7143	32.1193	47.3093
24	.	42.8214	33.6758	51.967

What I want to understand is how the 95% confidence interval in the proc reg procedure is calculated for the extended period - in my example above, that is for months 21 and 24.

(I am still working through your links).

Thanks for your help.
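For a one-predictor model like this one, the limits for the mean reduce to the standard textbook form below. This is a sketch from general regression theory, not a quotation from the SAS documentation, and the symbols are notation introduced here: n counts only the rows with a non-missing response, x-bar is the mean of the observed times, and s is the root mean squared error of the fit.

```latex
\[
\hat{y}(x_0) \;\pm\; t_{0.975,\,n-2}\; s\,
\sqrt{\frac{1}{n} + \frac{(x_0-\bar{x})^2}{S_{xx}}},
\qquad S_{xx} = \sum_{i=1}^{n}(x_i-\bar{x})^2,
\quad s = \sqrt{\mathrm{MSE}}
\]
% Example data: n = 7, \bar{x} = 9, S_{xx} = 252, s \approx 3.4959, t_{0.975,5} \approx 2.5706
\[
x_0 = 24:\quad
42.8214 \;\pm\; 2.5706 \times 3.4959 \times
\sqrt{\tfrac{1}{7} + \tfrac{(24-9)^2}{252}}
\;\approx\; 42.8214 \pm 9.1456
\;\approx\; (33.6758,\ 51.9670)
\]
```

This matches the month 24 row in the table above. The same expression applies at every time point; the interval widens under extrapolation only because the squared distance of x_0 from the mean observed time grows.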

nsns · Posted 10-31-2021 04:16 AM

The guidelines specifically say regression analysis. Since this is what is requested, I would rather stick with the regression. Thanks for the links that you sent. They are interesting and helpful and give insight - although I am not sure that they resolve my question. Will continue to review them. Thanks.

Ksharp · Posted 10-27-2021 08:59 AM

If you need "extrapolation of regression", then you need SAS/ETS.
I suggest posting your question in the Forecasting forum.
Check PROC ARIMA, PROC UCM, PROC ESM, PROC FORECAST, and so on.

nsns · Posted 10-31-2021 04:17 AM

It's been a very long time since I've done time series and forecasting. I will look into this; however, since the FDA guidelines specifically say regression, I feel I should work with that. Thanks for your input and your suggestion.

SteveDenham · Posted 10-27-2021 10:27 AM

It helps to have some experience with regulators on this. Linear extrapolation of stability data has been done for years. While that may not make it "right", there is an impressive track record. The problem with shifting to any of the time series methods is that the number of time points on these studies is often too small to be able to estimate the parameters, the time points are not equally spaced, and the measurements are not on the same sample (i.e., several samples are taken at the initiation of the stability testing period, and then destructively analyzed at pre-determined time points). Consequently, a lot of time series methods aren't robust enough to deal with this.

As far as survival analysis goes, I think you would need multiple samples to analyze at each pre-determined time point to be able to estimate the hazard rate (failure rate). Unless the synthesis has been scaled up to a near-production level, there may not be sufficient test article to carry out a decent survival analysis.

So for once, I can understand the use of a not-quite-right method.

SteveDenham

nsns · Posted 10-31-2021 04:21 AM

Thanks for your reply, Steve. I agree with you. Additionally, since the FDA has included specific conditions for the stability assessments (i.e., how far out you may go, etc.), and since long-term testing has to follow the extrapolated values (for confirmation, I believe), I think that using this method is OK. Thanks for your input.

nsns · Posted 10-31-2021 04:13 AM

Thanks for the link

Proc Reg - extrapolation of regression line with confidence intervals
