Interrupted time series-Segmented regression

am12scorp · Posted 11-22-2020 05:49 PM

I would like to run an interrupted time series analysis on 2007-2015 data. The policy, implemented in May 2012, aimed to reduce the use of a certain ‘low-value’ medical procedure after disease diagnosis. I am very new to interrupted time series analysis and would appreciate guidance from the experts here.

My outcome, use of the low-value medical procedure, is a binary variable coded ‘0’ for patients with no use and ‘1’ for patients with use after their disease diagnosis. I created the variables time in months (month_dx; 1 through 108 for 2007 to 2015), ‘policy’ (0 for the pre-policy period, 1 for the post-policy period), and the interaction of time and policy (time_after_policy). I ran the following unadjusted program (no covariates) and got the output below.

proc autoreg data = imaging outest=parapst covout;
   model imaging = month_dx policy time_after_policy
         / method=ml nlag=12 dwprob loglikl covb;
   output out=pred p=predict r=resid;
run;

Autoregressive parameters assumed given
Variable	DF	Estimate	Standard Error	t Value	Approx Pr > |t|	Variable Label
Intercept	1	0.5558	0.005242	106.04	<.0001
month_dx	1	-0.000457	0.000127	-3.58	0.0003	month of cancer dx
policy	1	-0.003365	0.008428	-0.40	0.6897	policy yes/no
time_after_policy	1	-0.000625	0.000297	-2.10	0.0357	time since policy

Durbin-Watson value of 2.0016

How do I interpret the values for policy (-0.003365) and time_after_policy (-0.000625)?
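
For reference, this is the segmented regression I believe the unadjusted program fits. It is written under the assumption that time_after_policy is 0 before the policy and counts months since the policy afterwards; if it is instead the raw product month_dx × policy, the meaning of the policy coefficient changes. Y_t is the 0/1 outcome (imaging), T_0 is my own shorthand for the reference month of the policy, and ε_t is the autoregressive error term:

```latex
\begin{aligned}
Y_t &= \beta_0 + \beta_1\,\text{month\_dx}_t + \beta_2\,\text{policy}_t
      + \beta_3\,\text{time\_after\_policy}_t + \varepsilon_t, \\
\text{time\_after\_policy}_t &= \text{policy}_t \,(\text{month\_dx}_t - T_0) \quad \text{(assumed coding)}
\end{aligned}
```

If that coding is right, my tentative reading is that β_1 is the pre-policy monthly trend, β_2 is the immediate change in level at the policy date, and β_3 is the change in slope, so the post-policy slope would be β_1 + β_3 = -0.000457 + (-0.000625) = -0.001082 per month. Because the outcome is 0/1, I take these to be changes in the probability of use.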

In the attached document with diagnostic figures, the residual plot is bimodal, since my outcome is binary. Is a bimodal residual distribution expected with a binary outcome? Or should I change the outcome to the proportion of patients receiving the low-value medical procedure?
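
If switching to proportions is the way to go, I assume the aggregation would look roughly like the untested sketch below: one row per month, with the monthly proportion of patients receiving the procedure as the outcome. The names monthly, prop_imaging, and n_patients are placeholders I made up for illustration, and the sketch assumes policy and time_after_policy are constant within each month.

```sas
/* Sketch: collapse patient-level rows to one row per diagnosis month. */
/* monthly, prop_imaging, and n_patients are hypothetical names.       */
proc sql;
   create table monthly as
   select month_dx,
          max(policy)            as policy,            /* constant within a month (assumed) */
          max(time_after_policy) as time_after_policy, /* constant within a month (assumed) */
          mean(imaging)          as prop_imaging,      /* share of patients with imaging = 1 */
          count(*)               as n_patients         /* patients diagnosed in that month */
   from imaging
   group by month_dx
   order by month_dx;
quit;

/* Same segmented model as before, now on up to 108 monthly observations. */
proc autoreg data = monthly;
   model prop_imaging = month_dx policy time_after_policy
         / method=ml nlag=12 dwprob;
run;
```

With this layout each observation would be a calendar month, so the lags in nlag=12 would refer to months rather than to neighboring patient records.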

After I adjust for the covariates (age, race/ethnicity, socioeconomic characteristics, disease severity, etc.), I get the following output:

proc autoreg data = imaging outest=parapst covout;
   model imaging = month_dx policy time_after_policy agegrp region raceeth income educ grade cci gleason psa_level tstage
         / method=ml nlag=12 dwprob loglikl covb;
   output out=pred p=predict r=resid;
run;

Autoregressive parameters assumed given
Variable	DF	Estimate	Standard Error	t Value	Approx Pr > |t|	Variable Label
Intercept	1	-0.0551	0.0176	-3.13	0.0018
month_dx	1	0.001058	0.000176	6.00	<.0001	month of cancer dx
policy	1	-0.0434	0.008090	-5.36	<.0001	policy yes/no
time_after_policy	1	0.001558	0.000344	4.52	<.0001	time since policy
agegrp	1	0.0176	0.001864	9.42	<.0001	Age groups 4 categories
region	1	-0.0569	0.002227	-25.56	<.0001	US region
raceeth	1	0.003842	0.002160	1.78	0.0753	1 wh 2 aa 3 hisp 4 asian 5 oth
income	1	0.0312	0.004939	6.31	<.0001	income
educ	1	-0.0282	0.003466	-8.14	<.0001	education
grade	1	0.1567	0.003447	45.45	<.0001	grade
cci	1	0.0528	0.002139	24.70	<.0001	cci index
gleason	1	0.0561	0.003938	14.25	<.0001	gleason
psa_level	1	0.0302	0.001399	21.59	<.0001	psa_level
tstage	1	0.008183	0.002040	4.01	<.0001	T stage

Durbin-Watson value of 2.0008

I see that the intercept estimate becomes negative in the adjusted regression. Also, the estimate for ‘time after policy’ becomes positive (p<0.0001), which seems to indicate that the policy change increased the use of the ‘low-value’ medical procedure.

Please let me know if I followed the correct steps, and whether all of the covariates need to be included in the adjusted regression for an interrupted time series. Any guidance will be greatly appreciated.
