Fluorite | Level 6

Dear all,

I am working with some longitudinal data and am making a spaghetti plot with one line per patient.
I need to mark the occurrence of certain dates in the plot. My idea was to do that with a dummy variable and the COLORRESPONSE= option, but I can't get it to work.

Below is a test dataset containing the patient ID (patientid), visit date (visit_date), measurement (measure), age at measurement (age_measure), date of diagnosis 1 (dx1), and date of diagnosis 2 (dx2). The desired output is a spaghetti plot with patientid as the group, where the color changes when the visit date passes the diagnosis date.
In the code below, the dummy variable only accounts for diagnosis 1. It would be a great help if someone could show me how to incorporate the date of diagnosis 2 as well.
If it helps, I can change the dates to the corresponding age_measure values.

Thanks 🙂



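/* One row per patient: diagnosis dates (missing = no such diagnosis) */
data patients;
input patientid $ (dx1 dx2) (:yymmdd8.);
format dx1 dx2 yymmdd10.;
datalines;
1 20180101 20180110
2 20170205 .
3 20170221 20170225
4 20180101 20180202
5 . 20180503
;
run;

/* One row per visit */
data visits;
input patientid $ visit_date :yymmdd8. measure age_measure;
format visit_date yymmdd10.;
datalines;
1  20180101  3.4   10
1  20180505  2.3   15
2  20170210  7.3   20
2  20170217  7.2   25
2  20170220  7.1   30
3  20170221  5.4   35
4  20180202  3.4   23
4  20180204  3.2   25
5  20180504  5.6   30
5  20180505  5.0   32
;
run;

/* Attach the diagnosis dates to every visit (both inputs are sorted by patientid) */
data have;
merge patients visits;
by patientid;
run;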

/* The plot */

data have;
set have;
/* Compare the visit date (not the age) with the diagnosis date:       */
/* 2 = visit on or before dx1, 1 = visit after dx1, missing = no dx1   */
if missing(dx1) then dummy_var = .;
else if visit_date <= dx1 then dummy_var = 2;
else dummy_var = 1;
run;

proc sgplot data=have noborder subpixel;
    series x=age_measure y=measure / group=patientid colorresponse=dummy_var colormodel=(red gold green) lineattrs=(thickness=2);
    xaxis display=(noline noticks nolabel) grid;
    yaxis display=(noline noticks nolabel) grid;
run;



For an overview of creating and coloring groups in spaghetti plots, see "Create spaghetti plots in SAS".


I don't understand how the age and date variables are related, but my advice is to create binary variables 'passed1' and 'passed2' to indicate whether the patient has passed the dx1 and dx2 dates. You can then assign a value of 0-3 to a grouping variable ('Color') according to whether the patient has passed neither date, only dx1, only dx2, or both dates. (Are dx1 and dx2 independent, or is dx1 < dx2? If dx1 < dx2, then there are only three possibilities.)


In PROC SGPLOT, use the GROUPLC= option to set the discrete colors of the line segments. This will work better than using COLORRESPONSE=, which assumes a continuous response variable.


Here's some code to get you started:


data Want;
set have;
/* 1 if the diagnosis date exists and the visit is on or after it, else 0 */
passed1 = not missing(dx1) & visit_date >= dx1;
passed2 = not missing(dx2) & visit_date >= dx2;
/* 0 = neither, 1 = dx1 only, 2 = dx2 only, 3 = both */
Color = 0;
if passed1 & passed2 then Color = 3;
else if passed2 then Color = 2;
else if passed1 then Color = 1;
run;

proc sgplot data=Want noborder subpixel;
    series x=age_measure y=measure / group=patientid grouplc=Color lineattrs=(thickness=2);
    xaxis grid;
    yaxis grid;
run;


Fluorite | Level 6
Dear Rick,
Thank you so much for your suggestion !
I probably didn't make it clear in my description of the problem, but it's the transition from no diagnosis to dx1 to dx2 that I need a graphical overview of. I want to get an idea of how many persons have measurements before and after diagnosis, and what their trends look like when they change diagnosis.

When I run your code, each person has the same color for the whole trend. Is it possible to make the line change color when it passes the date of diagnosis?

You are completely right that dx2 > dx1, but sometimes a person jumps straight to dx2. Age and the dates are related because I work with gestational age in pregnancy, so I can translate the dx1 and dx2 dates into the corresponding gestational age (age_measure). I hope that makes sense.

Sorry, I was wrong and did not test my code. The GROUPLC= option colors an ENTIRE curve, whereas you want to color segments within a curve.


You clearly copied your code from this blog post, and it would have saved us both time if you had referenced that fact. Follow the method in the blog post. Pay attention to the warning to "make sure the last point of the previous curve segment is replicated as the first point of the next segment." That will require some additional DATA step code. If you can't figure it out, then someone on the forum can help you. 
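
For anyone following along, the additional DATA step might look roughly like the sketch below. This is not the code from the blog post, only one way to implement the "replicate the last point" idea under some assumptions: the data set HAVE from the original question, visit_date as the X variable, and dx1 <= dx2 whenever both exist. The names segments, segment, stage, seg_num, and the prev_/cur_ variables are made up for illustration.

```sas
/* Visits must be in time order within each patient */
proc sort data=have;
   by patientid visit_date;
run;

data segments;
   set have;
   by patientid;
   length segment $ 20;
   retain seg_num prev_date prev_measure prev_age prev_stage;

   /* 0 = before any diagnosis, 1 = on/after dx1, 2 = on/after dx2 */
   stage = 0;
   if not missing(dx1) and visit_date >= dx1 then stage = 1;
   if not missing(dx2) and visit_date >= dx2 then stage = 2;

   if first.patientid then seg_num = 1;
   else if stage ne prev_stage then do;
      /* Stage changed: start a new segment and write the previous   */
      /* point again as the first point of that new segment.         */
      seg_num = seg_num + 1;
      cur_date = visit_date; cur_measure = measure; cur_age = age_measure;
      visit_date = prev_date; measure = prev_measure; age_measure = prev_age;
      segment = catx('_', patientid, seg_num);
      output;
      visit_date = cur_date; measure = cur_measure; age_measure = cur_age;
   end;

   /* The current visit, labelled with its segment */
   segment = catx('_', patientid, seg_num);
   output;

   prev_date = visit_date;
   prev_measure = measure;
   prev_age = age_measure;
   prev_stage = stage;
   drop seg_num prev_: cur_:;
run;

/* One line per segment, colored by stage; the segment legend is suppressed */
proc sgplot data=segments noborder subpixel noautolegend;
   series x=visit_date y=measure / group=segment grouplc=stage
          lineattrs=(thickness=2 pattern=solid);
   xaxis grid;
   yaxis grid;
run;
```

With this arrangement, the piece of line that connects the last visit before a diagnosis to the first visit after it takes the color of the new stage. In the test data only patient 1 changes stage between visits, so real data would be needed to see the effect on more curves.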


Good luck!



Fluorite | Level 6
I was handed the code by one of my colleagues; I wasn't aware of that blog post.
I still have no idea how to approach it. How do I contact someone from that forum?
Meteorite | Level 14

This board ("Graphics Programming") is "that forum". In general, this is a good place to post questions. The more specific and concise the question, the easier it is for the community to pitch in and answer it. Sometimes preparing the dataset to plot is the most challenging part of the problem (I suspect that might be a factor with your question, although I haven't had a chance to study it in depth). 🙂


I noticed you posted a comment on the blog that Rick mentioned. I just thought I'd mention that the author of that blog post (Sanjay) has retired since writing the post, therefore there might not be anyone looking closely at your question there.
