Restricted Cubic Spline Cut Point

PeterBr · Posted 12-18-2021 12:52 PM

Hi All,

Using the dataset below, I am trying to find inflection points for surgeon volume in relation to complications. I have read that restricted cubic splines can do this, but I am not sure how to approach it in SAS. Given how my data are structured, I assume I will need a logistic regression (complication = age insurance surg_volume) in which surg_volume enters as a spline. Once I have established these inflection points, I plan to run a separate analysis in which I classify surgeons as low volume or high volume based on the spline knots, and then run a survival analysis comparing low- vs. high-volume surgeons.

The dataset below is randomly generated but is the same format as my current dataset.

data have;
   infile datalines expandtabs;   /* the data lines are tab-separated */
   input complication age insurance $ surg_volume;
   datalines;
0	84	cigna	11
1	84	aetna	45
0	16	blue	138
1	75	cigna	116
0	46	aetna	134
1	50	blue	118
1	49	cigna	129
1	13	aetna	101
0	43	blue	65
1	32	cigna	14
1	21	aetna	87
0	29	blue	10
0	82	cigna	127
0	16	aetna	61
1	15	blue	21
0	40	cigna	81
0	63	aetna	80
1	69	blue	72
1	21	cigna	27
0	13	aetna	84
0	26	blue	7
0	46	cigna	64
0	35	aetna	10
0	18	blue	75
0	18	cigna	19
0	15	aetna	111
1	36	blue	16
1	16	cigna	130
1	86	aetna	56
0	44	blue	19
0	79	cigna	120
1	29	aetna	70
0	52	blue	94
1	37	cigna	26
0	67	aetna	33
1	49	blue	61
1	31	cigna	54
1	20	aetna	81
0	31	blue	79
1	63	cigna	91
0	50	aetna	131
0	55	blue	18
0	66	cigna	3
0	62	aetna	17
0	79	blue	124
0	82	cigna	21
1	81	aetna	48
1	59	blue	103
0	70	cigna	138
1	19	aetna	64
1	63	blue	147
1	36	cigna	17
1	87	aetna	102
0	63	blue	60
0	18	cigna	114
0	31	aetna	124
1	37	blue	67
0	12	cigna	149
0	42	aetna	95
1	74	blue	118
0	75	cigna	58
0	19	aetna	111
1	31	blue	113
1	26	cigna	53
0	20	aetna	140
0	66	blue	38
1	54	cigna	60
1	47	aetna	135
1	79	blue	121
0	31	cigna	82
1	80	aetna	40
1	59	blue	79
1	18	cigna	87
0	34	aetna	111
1	77	blue	65
1	14	cigna	54
1	59	aetna	39
1	61	blue	48
1	32	cigna	137
1	28	aetna	144
0	21	blue	120
1	17	cigna	58
0	55	aetna	145
1	56	blue	75
0	69	cigna	119
1	15	aetna	105
1	17	blue	130
0	84	cigna	17
0	20	aetna	75
0	75	blue	8
0	84	cigna	101
0	44	aetna	100
1	26	blue	133
0	13	cigna	108
0	61	aetna	85
1	21	blue	119
0	15	cigna	7
0	42	aetna	48
1	75	blue	108
0	70	cigna	13
0	51	aetna	150
1	72	blue	145
0	19	cigna	132
1	81	aetna	32
0	36	blue	134
1	36	cigna	110
0	48	aetna	102
0	87	blue	121
0	65	cigna	64
1	34	aetna	96
1	52	blue	119
1	40	cigna	75
0	32	aetna	92
0	19	blue	56
0	13	cigna	128
0	43	aetna	70
1	75	blue	102
1	81	cigna	134
0	17	aetna	20
1	68	blue	3
1	85	cigna	139
0	87	aetna	32
1	72	blue	135
1	30	cigna	113
1	43	aetna	113
1	57	blue	118
0	74	cigna	2
0	79	aetna	23
0	81	blue	42
1	49	cigna	125
0	65	aetna	76
1	59	blue	145
0	26	cigna	112
0	84	aetna	33
1	23	blue	100
1	40	cigna	124
0	35	aetna	98
1	60	blue	102
0	12	cigna	137
1	70	aetna	78
1	83	blue	11
1	60	cigna	120
1	87	aetna	97
0	87	blue	34
1	48	cigna	2
0	14	aetna	23
0	74	blue	41
0	48	cigna	119
1	51	aetna	6
0	78	blue	37
1	42	cigna	63
1	13	aetna	141
1	57	blue	96
0	76	cigna	107
0	66	aetna	30
0	43	blue	94
1	61	cigna	148
0	16	aetna	7
1	41	blue	90
1	56	cigna	117
1	73	aetna	15
1	66	blue	31
0	37	cigna	34
0	22	aetna	16
1	59	blue	68
;

Rick_SAS · Posted 12-19-2021 06:15 AM

You can use the EFFECT statement to create the spline and use the EFFECTPLOT statement to visualize it. Try this:

proc logistic data=Have;
   class insurance;
   effect spl = spline(surg_volume / details naturalcubic basis=tpf(noint)
                       knotmethod=percentiles(5));
                       /* or in SAS 9.4M6: knotmethod=percentilelist(5 27.5 50 72.5 95) ); */
   model complication(event='1') = age insurance spl;
   effectplot slicefit(x=surg_volume sliceby=insurance) / obs;
run;

Some articles that explain the various techniques:

PeterBr · Posted 12-19-2021 09:09 AM

Thank you for your reply; it is extremely helpful. Your suggested code is similar to what I was previously using. What has been confusing me is how people use restricted cubic splines to identify inflection points. As I understand it, KNOTMETHOD=PERCENTILES(5) places the knots at equally spaced percentiles of surg_volume, which does not necessarily identify the inflection points. Do you have any advice on how to identify the inflection points?

I could also convert to a complication rate for each surg_volume to make a continuous dependent variable if that makes the analysis easier.
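For reference, a minimal sketch of the aggregation I have in mind, assuming the `have` dataset above. The names `rates`, `n_cases`, and `comp_rate` are just placeholders I made up for illustration:

```sas
/* Collapse to one row per surg_volume value.                    */
/* comp_rate is the observed proportion of cases with a          */
/* complication; n_cases records how many cases went into it.    */
proc sql;
   create table rates as
   select surg_volume,
          count(*)           as n_cases,
          mean(complication) as comp_rate
   from have
   group by surg_volume
   order by surg_volume;
quit;
```

Aggregating this way would drop the case-level covariates (age, insurance), so it may not actually be easier than keeping the binary outcome.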

Rick_SAS · Posted 12-19-2021 04:33 PM

> Do you have any advice on how to identify the inflection points?

I do not have any good advice. In the sample code I posted, you can estimate the "elbow" from the graph of the effect plot. Unfortunately, when you have other covariates such as age and insurance, there is not likely to be "THE" inflection point. The location of the elbow could depend on other factors. I would also guess that some doctors (due to temperament and experience) are better at handling high volumes than others.
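If you want something more numeric than eyeballing the plot, one rough option is to score the fitted spline on a grid and look at its second differences. The following is only a sketch under assumptions: the names `splmodel`, `grid`, `pred`, `curv`, and `logit_hat` are made up for illustration, and the reference values age=50 and insurance='cigna' are arbitrary choices.

```sas
/* Refit the spline model and save it for scoring */
proc logistic data=have;
   class insurance;
   effect spl = spline(surg_volume / naturalcubic basis=tpf(noint)
                       knotmethod=percentiles(5));
   model complication(event='1') = age insurance spl;
   store work.splmodel;
run;

/* Grid over the observed range of surg_volume (2 to 150 in the  */
/* sample data), holding the other covariates at fixed values    */
data grid;
   length insurance $8;
   age = 50;
   insurance = 'cigna';
   do surg_volume = 2 to 150;
      output;
   end;
run;

/* Score the grid; without ILINK the prediction is on the logit scale */
proc plm restore=work.splmodel;
   score data=grid out=pred predicted=logit_hat;
run;

/* Second difference of the fitted curve on a unit-spaced grid.  */
/* It approximates the second derivative at surg_volume - 1.     */
data curv;
   set pred;
   prev1 = lag1(logit_hat);
   prev2 = lag2(logit_hat);
   if _n_ > 2 then d2 = logit_hat - 2*prev1 + prev2;
run;

proc sgplot data=curv;
   series x=surg_volume y=d2;
   refline 0 / axis=y;
run;
```

Places where d2 crosses zero are candidate inflection points on the logit scale, and places where d2 is large in absolute value are where the curve bends most sharply. Because this model has no interactions, the shape on the logit scale does not depend on the reference values chosen, but the shape on the probability scale does. The result also depends on the number and placement of knots, so treat it as descriptive.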

If you want the location of the elbow to be estimated by the model, you need to include it as a parameter in the model, which leads to piecewise regression models (link in my previous reply).
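As a minimal sketch of that idea (not a recommendation for your real data), a linear-linear piecewise logistic model can be written in PROC NLMIXED with the breakpoint as a parameter. The parameter names, starting values, and bounds below are assumptions chosen for the sample data, and insurance is left out to keep the example short:

```sas
proc nlmixed data=have;
   /* starting values; knot starts near the middle of the range */
   parms b0=0 b_age=0 b1=0 b2=0 knot=75;
   /* keep the breakpoint inside the observed surg_volume range */
   bounds 10 <= knot <= 140;
   /* slope is b1 below the knot and b1 + b2 above it */
   eta = b0 + b_age*age + b1*surg_volume
            + b2*max(surg_volume - knot, 0);
   p = 1 / (1 + exp(-eta));
   model complication ~ binary(p);
run;
```

The likelihood is not smooth in knot, so convergence can be sensitive to the starting value. It is worth trying several starting values and checking that the estimate is stable before trusting it.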
