I've seen this problem a couple of times when I try to make a spaghetti plot (series for every individual). When I run the code below, the graph ends up with back-and-forth trajectories as if people are traveling backwards in time. I also tried SORT, but it does not seem to help. Any help or advice would be most appreciated!
data long;
  call streaminit(1234);
  do p = 1 to 50; * persons;
    * personal intercept;
    int = 10 + 2 * rand("Normal");
    * personal slope;
    slo = 1 + rand("Normal");
    * treatment dummy;
    if p LT 25 then trt = 0;
    else trt = 1;
    * time loop;
    do t = 1 to 6;
      * regression model;
      Y = int + (t-1) * slo          /* basic growth */
          + .2 * trt * (t-1)**2      /* treatment effect */
          + 1.5 * rand("Normal");    /* time-specific error */
      output;
    end;
  end;
run;

proc sort data=long;
  by p t;
run;

* spaghetti plot;
proc sgplot data=long noautolegend;
  series Y = Y X = t / group = p lineattrs=(color=ligr pattern=1);
  reg Y = Y X = t / group = trt nomarkers;
run;
The result shows trajectories going back and forth in time (X).
proc sort data=long;
by TRT p t; /* this seems to be doing the trick */
run;
Looks buggish to me. You might want to submit to tech support.
That sometimes happens if you have too many groups, but in that case you get a warning in the log. That doesn't seem to be the problem here.
If you comment out the REG plot, you don't get the connected zigzag:
proc sgplot data=long noautolegend;
series Y = Y X = t / group = p lineattrs=(color=ligr pattern=1);
* reg Y = Y X = t / group = trt nomarkers;
run;
If you leave the reg plot but change it to group by P, you don't get the zigzag:
proc sgplot data=long noautolegend;
series Y = Y X = t / group = p lineattrs=(color=ligr pattern=1);
reg Y = Y X = t / group = p nomarkers;
run;
If you replace the reg plot with a scatter plot, you don't get the zigzag:
proc sgplot data=long noautolegend;
series Y = Y X = t / group = p lineattrs=(color=ligr pattern=1);
scatter Y = Y X = t /group=trt ;
run;
So it seems there is an odd (bug-ish) interaction: overlaying a REG plot that uses a different GROUP variable causes a problem for the SERIES plot grouping.
As a workaround, create a separate Y variable for each treatment and use a separate REG statement for each, without any GROUP= variable.
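A minimal sketch of that workaround, assuming the data set LONG from the question. The data set name LONG2 and the variable names Y0 and Y1 are made up for illustration. Each REG statement sees only one treatment's values (the other treatment's rows are missing for that variable), so no GROUP= option is needed on the fits.

```sas
/* Split Y into one column per treatment arm.           */
/* LONG2, Y0 and Y1 are hypothetical names.             */
data long2;
  set long;
  if trt = 0 then Y0 = Y;   /* Y1 stays missing for trt = 0 */
  else Y1 = Y;              /* Y0 stays missing for trt = 1 */
run;

proc sgplot data=long2 noautolegend;
  series Y = Y  X = t / group = p lineattrs=(color=ligr pattern=1);
  reg    Y = Y0 X = t / nomarkers;   /* fit for trt = 0, no GROUP= */
  reg    Y = Y1 X = t / nomarkers;   /* fit for trt = 1, no GROUP= */
run;
```

This keeps the original BY p t order for the SERIES plot, since neither fit is grouped.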
You left trt out of your sort. Try sorting by "trt p t", and see if you get what you expect.
Interestingly, dataset LONG is already physically in trt p t order, yet sorting by trt p t makes a difference to the plot. So, apparently PROC SGPLOT uses the sort information stored in the data set's metadata.
Correct, @FreelanceReinh 🙂
Here is the full explanation of what you saw:
The data for grouped fits (REG, LOESS, and PBSPLINE) must be sorted by the group values to be processed correctly. If the data is not already in the correct order, the procedure will sort it internally. For scatter plots (which are what is typically overlaid with a fit), this has no impact. However, a SERIES plot connects points in data order, so the internal re-sort can cause the line to go backwards to follow the data. The best practice for this special case is to sort your data yourself, starting with the group variable from the fit.
proc sort data=long;
by TRT p t; /* this seems to be doing the trick */
run;
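For completeness, here is a sketch of how one might check the result after running the sort above. PROC CONTENTS reports the sort information recorded for a data set, which is the metadata mentioned earlier in the thread. The plot step is the original one from the question, unchanged.

```sas
/* Optional check: the PROC CONTENTS output includes a        */
/* "Sort Information" section listing the BY variables that   */
/* PROC SORT recorded (expected here: trt p t).               */
proc contents data=long;
run;

/* Original plot from the question; with the data sorted by   */
/* trt p t the grouped REG should not need to reorder it.     */
proc sgplot data=long noautolegend;
  series Y = Y X = t / group = p lineattrs=(color=ligr pattern=1);
  reg Y = Y X = t / group = trt nomarkers;
run;
```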