pcl
Calcite | Level 5

I'm trying to create a spaghetti plot to compare a treatment group to a control group, with a color change or other style change once an individual reaches a certain period of time. 

Each patient has up to six interviews, with the assessment given at every interview. At some point in each participant's progression, Covid begins. We want to see the line format change when they attend interviews during Covid. Since each person started on a different date, some participants' interviews are mostly pre-Covid and others' are mostly during Covid. The interviews are numbered 0, 1, 3, 6, 9, 12 based on the anticipated month into the study.

 

Data are in long format, with an indicator variable (covid_era) for whether the interview took place during Covid. SIQTotal is the assessment of interest. In this snippet Covid happens to begin at each participant's last assessment, but the full data are more varied.

 

data WORK.SPAGPLOT;
  infile datalines dsd truncover;
  input participant_id:BEST12. intervention:BEST12. interview:BEST12. covid_era:BEST12. SIQTotal:BEST12.;
  format participant_id BEST12. intervention BEST12. interview BEST12. covid_era BEST12. SIQTotal BEST12.;
  label participant_id="participant_id";
datalines;
1 0 0 0 165
1 0 1 0 172
1 0 3 0 163
1 0 6 0 172
1 0 9 0 133
1 0 12 1 180
2 0 0 0 70
2 0 1 0 59
2 0 3 0 58
2 0 6 0 60
2 0 9 0 59
2 0 12 1 58
;;;;

I'm pretty new to graphing in SAS and I'm using sgplot for the first time. I used the code below just to compare the intervention group with the control group, but now I want to add a format change to indicate the Covid-period interviews (maybe solid red to dashed red and solid blue to dashed blue, or red to some other color and blue to some other color?).

proc sgplot data=work.spagplot;
	series x=interview y=siqtotal/group=participant_id grouplc=intervention name='grouping';
	xaxis values=(0 to 12 by 3);
	keylegend 'grouping'/type=linecolor;
title 'SIQTotal individual scores over interview';
format intervention intervention.;
run;

 Any advice would be appreciated. 

 

3 REPLIES
PaigeMiller
Diamond | Level 26

Formats only change the appearance of text on the plot; you want something that changes the appearance of the plot itself. How about this:

 

data WORK.SPAGPLOT;
  infile datalines truncover;
  input participant_id intervention interview  covid_era $ SIQTotal ;
  label participant_id="participant_id";
datalines;
1 0 0 dot 165
1 0 1 dot 172
1 0 3 dot 163
1 0 6 dot 172
1 0 9 dot 133
1 0 12 plus 180
2 0 0 dot 70
2 0 1 dot 59
2 0 3 dot 58
2 0 6 dot 60
2 0 9 dot 59
2 0 12 plus 58
;

proc sgplot data=work.spagplot;
	series x=interview y=siqtotal/group=participant_id groupms=covid_era markers;
	xaxis values=(0 to 12 by 3);
title 'SIQTotal individual scores over interview';
/*format intervention intervention.;*/
run;

That's my guess as to what you are asking for. If that's not it, please provide more details.

--
Paige Miller
ballardw
Super User

Where would you want the color of the line to change? With a single covid_era observation, the segment leading into that observation is the only choice that makes sense; otherwise there is no line segment to color.

 

 

It might be much easier to use MARKERS to indicate a two-level value like your covid_era.

 

proc sgplot data=work.spagplot;
	series x=interview y=siqtotal
   /group=participant_id grouplc=intervention name='grouping'
    markers groupms=covid_era
   ;
	xaxis values=(0 to 12 by 3);
	keylegend 'grouping'/type=linecolor;
title 'SIQTotal individual scores over interview';
format intervention intervention.;
run;

With many participants, it will be up to you to add different group variables and overlay plots in a way that changes the line color or pattern meaningfully.
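One way the extra group variable could be sketched: break each participant's line into one segment per pair of consecutive interviews, and style each segment by the covid_era value of the interview it ends at. This is only a sketch under assumptions. The names work.sorted, work.segments, segment_id, seg_covid, x and y are made up for illustration, the choice to style a segment by its ending interview is arbitrary, and the intervention. format is left out because it is not defined anywhere in this thread.

```sas
/* Sketch only: dataset and variable names below are hypothetical. */
proc sort data=work.spagplot out=work.sorted;
  by participant_id interview;
run;

data work.segments;
  set work.sorted;
  by participant_id;
  /* LAG is called on every row so its queue stays in step with the data */
  prev_interview = lag(interview);
  prev_siq       = lag(SIQTotal);
  if not first.participant_id then do;
    segment_id + 1;            /* one group value per line segment */
    seg_covid = covid_era;     /* assumption: style by the ending interview */
    x = prev_interview; y = prev_siq; output;   /* segment start */
    x = interview;      y = SIQTotal; output;   /* segment end   */
  end;
  keep participant_id intervention segment_id seg_covid x y;
run;

proc sgplot data=work.segments;
  /* patterns are taken in the order seg_covid values first appear */
  styleattrs datalinepatterns=(solid shortdash);
  series x=x y=y / group=segment_id grouplc=intervention
                   grouplp=seg_covid name='grouping';
  xaxis values=(0 to 12 by 3) label='interview';
  yaxis label='SIQTotal';
  keylegend 'grouping' / type=linecolor;
  title 'SIQTotal individual scores over interview';
run;
```

Which pattern goes with which seg_covid value depends on the order the values appear in the data, so a discrete attribute map (DATTRMAP=) may be the safer way to pin solid to pre-Covid and dashed to Covid.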

 

Also, when I run your data step as written I get this:

10   data WORK.SPAGPLOT;
11     infile datalines dsd truncover;
12     input participant_id:BEST12. intervention:BEST12. interview:BEST12. covid_era:BEST12.
12 ! SIQTotal:BEST12.;
13     format participant_id BEST12. intervention BEST12. interview BEST12. covid_era BEST12.
13 ! SIQTotal BEST12.;
14     label participant_id="participant_id";
15   datalines;

NOTE: Invalid data for participant_id in line 16 1-11.
RULE:      ----+----1----+----2----+----3----+----4----+----5----+----6----+----7----+----8----+--
16         1 0 0 0 165
participant_id=. intervention=. interview=. covid_era=. SIQTotal=. _ERROR_=1 _N_=1
NOTE: Invalid data for participant_id in line 17 1-11.
17         1 0 1 0 172
participant_id=. intervention=. interview=. covid_era=. SIQTotal=. _ERROR_=1 _N_=2
NOTE: Invalid data for participant_id in line 18 1-11.
18         1 0 3 0 163
participant_id=. intervention=. interview=. covid_era=. SIQTotal=. _ERROR_=1 _N_=3
NOTE: Invalid data for participant_id in line 19 1-11.
19         1 0 6 0 172
participant_id=. intervention=. interview=. covid_era=. SIQTotal=. _ERROR_=1 _N_=4
NOTE: Invalid data for participant_id in line 20 1-11.
20         1 0 9 0 133
participant_id=. intervention=. interview=. covid_era=. SIQTotal=. _ERROR_=1 _N_=5
NOTE: Invalid data for participant_id in line 21 1-12.
21         1 0 12 1 180
participant_id=. intervention=. interview=. covid_era=. SIQTotal=. _ERROR_=1 _N_=6
NOTE: Invalid data for participant_id in line 22 1-10.
22         2 0 0 0 70
participant_id=. intervention=. interview=. covid_era=. SIQTotal=. _ERROR_=1 _N_=7
NOTE: Invalid data for participant_id in line 23 1-10.
23         2 0 1 0 59
participant_id=. intervention=. interview=. covid_era=. SIQTotal=. _ERROR_=1 _N_=8
NOTE: Invalid data for participant_id in line 24 1-10.
24         2 0 3 0 58
participant_id=. intervention=. interview=. covid_era=. SIQTotal=. _ERROR_=1 _N_=9
NOTE: Invalid data for participant_id in line 25 1-10.
25         2 0 6 0 60
participant_id=. intervention=. interview=. covid_era=. SIQTotal=. _ERROR_=1 _N_=10
NOTE: Invalid data for participant_id in line 26 1-10.
26         2 0 9 0 59
participant_id=. intervention=. interview=. covid_era=. SIQTotal=. _ERROR_=1 _N_=11
NOTE: Invalid data for participant_id in line 27 1-11.
27         2 0 12 1 58
participant_id=. intervention=. interview=. covid_era=. SIQTotal=. _ERROR_=1 _N_=12
NOTE: The data set WORK.SPAGPLOT has 12 observations and 5 variables.
NOTE: DATA statement used (Total process time):
      real time           0.03 seconds
      cpu time            0.00 seconds

The DSD option is usually not a good idea with LIST input unless you really know what is going on. A single space is the normal default delimiter, but DSD changes the default delimiter to a comma (and allows delimiters inside quoted values), so each of your space-separated lines is read as one value. And typing :BEST12. informats just adds work when that is essentially what SAS defaults to for numeric variables when nothing is provided.
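For what it's worth, a minimal version of the data step without DSD and without the explicit informats might look like the sketch below. It uses the same twelve rows posted above, keeps covid_era numeric, and relies on the default blank delimiter; I have not checked it against the full data.

```sas
data work.spagplot;
  /* plain list input: blanks separate the values, all variables numeric */
  input participant_id intervention interview covid_era SIQTotal;
datalines;
1 0 0 0 165
1 0 1 0 172
1 0 3 0 163
1 0 6 0 172
1 0 9 0 133
1 0 12 1 180
2 0 0 0 70
2 0 1 0 59
2 0 3 0 58
2 0 6 0 60
2 0 9 0 59
2 0 12 1 58
;
```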

 

Jay54
Meteorite | Level 14

You might find some useful information here.  Similar features are also available in the SGPLOT procedure.

https://blogs.sas.com/content/graphicallyspeaking/2014/08/16/more-on-spaghetti-plots/

 
