The example given at R seems show that EST/STDERR for each slope is not a good tell (slope1 abs-t-value <1.0).
The visual tells a segmented line there.
My quest is how to tell whether 2-step/segment is better than 1-step (or 3-step is better than 2-step).
> slope(o)
$x
Est. St.Err. t value CI(95%).l CI(95%).u
slope1 -0.058993 0.062105 -0.94991 -0.18230 0.06431700
slope2 1.414600 0.046151 30.65100 1.32290 1.50620000
slope3 -0.142800 0.071994 -1.98350 -0.28575 0.00014734
# http://127.0.0.1:16212/library/segmented/html/segmented.html
set.seed(12)
xx<-1:100
zz<-runif(100)
yy<-2+1.5*pmax(xx-35,0)-1.5*pmax(xx-70,0)+15*pmax(zz-.5,0)+rnorm(100,0,2)
dati<-data.frame(x=xx,y=yy,z=zz)
out.lm<-lm(y~x,data=dati)
#the simplest example: the starting model includes just 1 covariate
#.. and 1 breakpoint has to be estimated for that
o<-segmented(out.lm) #1 breakpoint for x
#the single segmented variable is not in the starting model and 1 breakpoint for that:
#... you need to specify the variable via seg.Z, but no starting value for psi
o<-segmented(out.lm, seg.Z=~z)
#note the leftmost slope is constrained to be zero (since out.lm does not include z)
#2 segmented variables, 1 breakpoint each (again no need to specify npsi or psi)
o<-segmented(out.lm,seg.Z=~z+x)
#1 segmented variable, 2 breakpoints: you have to specify starting values (vector) for psi:
o<-segmented(out.lm,seg.Z=~x,psi=c(30,60), control=seg.control(display=FALSE))
#or by specifying just the *number* of breakpoints
#o<-segmented(out.lm,seg.Z=~x, npsi=2, control=seg.control(display=FALSE))
slope(o) #the slopes of the segmented relationship
dat2 = data.frame(x = xx, y = broken.line(o)$fit)
ggplot(dati, aes(x = x, y = y)) +
geom_point() +
geom_line(data = dat2, color = 'blue')
Well, EST/STDERR tells slopex=0.0 or not. But still any criteria to tell whether take n-segment rather than (n-1) segment?!
The usual way to compare two different regression models is to compare some criteria such as the AIC, BIC, or SBC. These criteria are not produced automatically by PROC NLIN, but you can use PROC NLMIXED (which has a similar syntax) to get them. Look at the "Fit Statistics" table.
Would you give out an example with dataset or a link?! Dear Paige gives out a great link on hypo test between
models, https://www.sfu.ca/~lockhart/richard/350/08_2/lectures/FTests/web.pdf
Somehow, when I run my toy dataset, the F-test shows up confusing.
%let slope1=-2; %let slope2=5; %let turner=95;
data indata;
do ind=1 to 100;
if ind<&turner. then y=10*ranuni(ind)+ind*&slope1.;
else y=20*ranuni(ind)+ind*&slope2.-&turner.*&slope2.+&turner.*&slope1.;
output;
end;
run;quit;
ods listing gpath="%sysfunc(getoption(work))";
proc sgplot data=indata;
scatter x=ind y=y;
run;quit;
%let inds=indata;
%let xvar=ind; %let yvar=y;
proc nlin data=&inds.;
parms intcpt0=0 slope1=0 slope2=0 x0=&st_x0;
if (ind < x0) then
mean = intcpt0+ind*slope1;
else mean = ind*slope2-x0*slope2+x0*slope1+intcpt0;
model &yvar. = mean;
output out=&inds._sel_out predicted=&yvar._p;
run;
proc reg data=&inds.;
model y=ind;run;quit;
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.