I received the SAS text "Longitudinal Data Analysis with Discrete and Continuous Responses" a couple of days ago. The first sections of the book show how to draw spaghetti plots and add a fitted spline with sgplot. My problem arises when I try to draw the spaghetti plot by patient ID and then overlay 2 (or more) splines by a different grouping variable.
The book shows the following code, but when I run it I get the warning shown below. That is, it seems that the group variable cannot be different between the "series" statement and the "pbspline" statement. Can anyone help?
goptions reset=all;
proc sgplot data=sasuser.aids nocycleattrs noautolegend;
  series y=cd4 x=time / group=id transparency=0.5
         lineattrs=(color=cyan pattern=1);
  pbspline y=cd4 x=time / group=drug nomarkers smooth=50 nknots=5
           lineattrs=(thickness=3) name="drug";
  xaxis values=(-3 to 5.5 by 0.5) label='Years since Seroconversion';
  yaxis values=(0 to 3500 by 500) label='CD4 Cell Counts';
  keylegend "drug";
  format drug druggrp.;
  title 'Individual Profiles with Drug Usage Subgroups';
run;
WARNING: Since the group variable on the PBSPLINE statement is different from the first
specified group variable, the group on the PBSPLINE statement will be ignored.
Just realized something. Despite the code that the book shows, the figures that they include are clearly derived from keeping the group the same between the "series" statement and the "pbspline" statement. That's really frustrating. That is, they must have created their figure using:
series y=cd4 x=time / group=drug;
pbspline y=cd4 x=time / group=drug;
even though plotting cd4 x time by drug doesn't really make any sense.
So, I guess it's probably NOT possible to have different groups in the 2 statements?
Yes, SGPLOT requires the group variable to be the same across these plot statements. You can use the generated GTL code for the graph as follows:
- Add the tmplout="filename" option on the SGPLOT procedure statement.
- The generated GTL template code will be written to this file.
- Copy the GTL template code from this file to the program editor window.
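- Add the GROUP= option (here, group=drug) to the PBSPLINEPLOT statement in the template, which is the GTL counterpart of the SGPLOT PBSPLINE statement.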
- The name of this template on the DEFINE statement is "sgplot".
- Run the template with your data using PROC SGRENDER.
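The steps above might look roughly like the following sketch. The output path aids_template.sas is a placeholder, not something from the original post, and the generated template will differ in detail from run to run, so the manual edit is described in a comment rather than reproduced.

```sas
/* Step 1: rerun the plot and write the generated GTL template to a file. */
/* The file name is a hypothetical placeholder; use any writable path.    */
proc sgplot data=sasuser.aids nocycleattrs noautolegend
            tmplout="aids_template.sas";
  series y=cd4 x=time / group=id transparency=0.5
         lineattrs=(color=cyan pattern=1);
  pbspline y=cd4 x=time / group=drug nomarkers smooth=50 nknots=5
           lineattrs=(thickness=3) name="drug";
  keylegend "drug";
  format drug druggrp.;
run;

/* Steps 2-4: open the file, copy the PROC TEMPLATE code into the editor, */
/* add group=drug to the pbsplineplot statement, and submit it so the     */
/* template named "sgplot" is compiled.                                   */

/* Final step: render the edited template with the same data. */
proc sgrender data=sasuser.aids template=sgplot;
  format drug druggrp.;
run;
```

The FORMAT statement is repeated on PROC SGRENDER so the legend keeps the druggrp. labels, assuming that format is defined as in the original program.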
The restriction on multiple group variables arises because of the fit statement (pbspline). Had all the statements been basic plots (scatter, series, needle, step, vector), different group variables would have been allowed. The restriction on combining a single grouped fit statement with different group variables on other statements will be lifted in SAS 9.3.
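For comparison, here is an untested sketch of the case described as allowed: two basic plots with different group variables and no fit statement. It reuses the data set and format from the original post; replacing the spline with a scatter overlay is only for illustration.

```sas
/* Two basic plot statements, each with its own group variable. */
/* No fit statement is involved, so per the explanation above   */
/* both GROUP= options should be honored.                       */
proc sgplot data=sasuser.aids noautolegend;
  series  y=cd4 x=time / group=id transparency=0.5;
  scatter y=cd4 x=time / group=drug name="drug";
  keylegend "drug";
  format drug druggrp.;
run;
```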