BookmarkSubscribeRSS Feed
gbrussell
Calcite | Level 5

I have data on surgery number (1-25) and surgery minutes by the same physician; he wants to know when his data plateaus. I've seen the example (https://documentation.sas.com/?cdcId=pgmsascdc&cdcVersion=9.4_3.4&docsetId=statug&docsetTarget=statu...) but am not sure of the parameters I need to specify for this. Any help would be welcomed.  I'd like to plot the data; here's what the data look like (surgery #, date, length of procedure):

13/20/2019182
25/16/2019150
35/30/2019223
46/6/2019142
56/11/2019111
67/11/2019164
77/26/201983
88/22/2019144
98/29/2019162
109/19/201983
1110/10/201970
1210/17/2019114
1310/31/2019113
1411/7/201997
1511/21/201983
1612/5/2019111
1712/5/201973
1812/12/201987
1912/19/201986
201/9/2020102
211/16/2020124
221/23/202095
231/30/2020134
243/5/2020121
256/4/202060
10 REPLIES 10
ballardw
Super User

Here's an example of 1) how to provide example data and 2) a simple plot of the data

data have;
  input SurgeryNo date :mmddyy10. duration;
  format date mmddyy10.;
datalines;
1	3/20/2019	182
2	5/16/2019	150
3	5/30/2019	223
4	6/6/2019	142
5	6/11/2019	111
6	7/11/2019	164
7	7/26/2019	83
8	8/22/2019	144
9	8/29/2019	162
10	9/19/2019	83
11	10/10/2019	70
12	10/17/2019	114
13	10/31/2019	113
14	11/7/2019	97
15	11/21/2019	83
16	12/5/2019	111
17	12/5/2019	73
18	12/12/2019	87
19	12/19/2019	86
20	1/9/2020	102
21	1/16/2020	124
22	1/23/2020	95
23	1/30/2020	134
24	3/5/2020	121
25	6/4/2020	60
;

proc sgplot data=have;
  scatter x=date y=duration;
run;

You might add a regression of degree 2,

proc sgplot data=have;
  scatter x=date y=duration;
  reg x=date y=duration/degree=2 ;
run;

If you want to fit a model then implement the document you site. Maybe.

gbrussell
Calcite | Level 5

The model doesn't work for these data, at least not using the alpha, beta, gamma, X0 in the link.  That's where I need a bit of help.

Thanks!

gbrussell
Calcite | Level 5

Thank you for the plot/model -- it works very well and will help provide what I need to report.  I appreciate your time.

 

mkeintz
PROC Star

If you drop the minimum and maximum durations in this data, you no longer see a "plateau".  The plot and regression curve changes from

 

SGPlot5.png

to 

 

SGPlot6.png

 

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
PGStats
Opal | Level 21

A robust regression (fitting at the median) doesn't show much evidence of a time effect:

 

proc quantreg data=have ci=resampling;
effect dPoly=poly(date / degree=2 standardize=center details);
model duration = dPoly;
run;

PGStats_0-1606780796273.png

most of the effect that we see in the scatter plot is due to 2 or 3 data points.

PG
SteveDenham
Jade | Level 19

I tried a variety of splines (penalized B, thin-plate and loess) to see if there was any indication of something that might look like a plateau, without using a polynomial fit.  The penalized B gave something that looked like this:

SteveDenham_0-1606834072413.png

The other methods showed a smooth decrease.  Removing the last date from the penalized B gives something that turns back up (no plateau), while the other two methods give;

Loess:

SteveDenham_1-1606834435868.png

Thin plate spline:

SteveDenham_2-1606834528886.png

 

These give some support to a plateau being reached some time in November 2019.  That last point has a tremendous amount of leverage, and really shifts the resulting plots.  Code for these follows:

 

proc sgplot data=have;
  scatter x=date y=duration;
  pbspline x=date y=duration;
run;

ods graphics on;
proc tpspline data=have plots(only)=(criterionplot fitplot(clm));
   model duration = (date) /alpha = 0.1;
   output out = result pred uclm lclm;
run;

proc loess data=have;
   model duration=date;
run;

/* Delete the last point */
data have2;
set have;
if surgeryno=25 then delete;
run;

proc sgplot data=have2;
  scatter x=date y=duration;
  pbspline x=date y=duration;
run;

proc tpspline data=have2 plots(only)=(criterionplot fitplot(clm));
   model duration = (date) /alpha = 0.1;
   output out = result pred uclm lclm;
run;

proc loess data=have2;
   model duration=date;
run;

SteveDenham

 

 

 

gbrussell
Calcite | Level 5

Thank you very much for your time.  This is quite helpful and thorough.

Rick_SAS
SAS Super FREQ

> The model doesn't work for these data, at least not using the alpha, beta, gamma, X0 in the link.

  

It is not surprising that the initial guesses for the documentation example do not work for your data, which has a completely different scale. The initial values of the parameters should be somewhat close to the final (optimal) values. For your data, I would use a simple quadratic regression model to estimate the parameters. Then the NLIN Procedure fits the data without difficulty and reports a plateau of 98 minutes, reached on 01JAN2020.


/* rescale by using x="days since start" as the variable */
%let RefDate = '20MAR2019'd;
data New;
set Have;
rename duration=y;
x = date - &RefDate;  /* days since first record */
run;

proc glm data=New;
 model y = x x*x;
run;

title 'Quadratic Model with Plateau';
proc nlin data=New plots=fit;
   parms alpha=184 beta= -0.5 gamma= 0.001;

   x0 = -.5*beta / gamma;
   if (x < x0) then
        mean = alpha + beta*x  + gamma*x*x;
   else mean = alpha + beta*x0 + gamma*x0*x0;
   model y = mean;

   estimate 'plateau' alpha + beta*x0 + gamma*x0*x0;
   estimate 'x0' -beta / (2*gamma);
   output out=b predicted=yp L95M=lower U95M=upper;
run;

/* convert x back to original scale (date) */
data _NULL_;
plateauDate = 287 + &RefDate;
call symput('x0', plateauDate);
put plateauDate DATE10.;
run;
%put &=x0;

proc sgplot data=b noautolegend;
   band x=date lower=lower upper=upper;
   refline 97.97  / axis=y label="Plateau"    labelpos=max;
   refline &x0 / axis=x label="01JAN2020" labelpos=max;
   scatter x=date y=y;
   series  x=date y=yp;
   yaxis label='Duration';
run;

SGPlot14.png

gbrussell
Calcite | Level 5

Thank you for the post.  I could not find an example or documentation on selecting a better value.  I appreciate your time.

SAS Innovate 2025: Call for Content

Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!

Submit your idea!

What is ANOVA?

ANOVA, or Analysis Of Variance, is used to compare the averages or means of two or more populations to better understand how they differ. Watch this tutorial for more.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 10 replies
  • 1936 views
  • 5 likes
  • 6 in conversation