Dear all,
I am working with some longitudinal data, where I am making a spaghetti-plot for each patient.
I need to "mark" the occurrence of some dates in the plot. I thought about doing that with a dummy variable and the COLORRESPONSE= option, but I can't get it to work.
Below is a test dataset containing patientid, visit_date, the measurement (measure), age at measurement (age_measure), date of diagnosis 1 (dx1), and date of diagnosis 2 (dx2). The desired output is a spaghetti plot with patientid as the group, where the color changes when the visit date passes the diagnosis date.
In the code below, the dummy variable covers only diagnosis 1. It would be a great help if someone could show me how to incorporate the date of diagnosis 2 as well.
If it helps, I can change the dates to the corresponding age_measure values.
Thanks 🙂
data patients;
input patientid $ (dx1 dx2) (:yymmdd8.);
format dx1 dx2 yymmdd10.;
datalines;
1 20180101 20180110
2 20170205 .
3 20170221 20170225
4 20180101 20180202
5 . 20180503
;
data visits;
input patientid $ visit_date :yymmdd8. measure age_measure;
format visit_date yymmdd10.;
datalines;
1 20180101 3.4 10
1 20180505 2.3 15
2 20170210 7.3 20
2 20170217 7.2 25
2 20170220 7.1 30
3 20170221 5.4 35
4 20180202 3.4 23
4 20180204 3.2 25
5 20180504 5.6 30
5 20180505 5.0 32
;
data have;
merge
patients
visits
;
by patientid;
run;
/* The plot */
data have;
set have;
If .<dx1 ge age_measure
then dummy_var = 2;
else
If .<dx1 LT age_measure then dummy_var=1;
else dummy_var=.;
run;
proc sgplot data=have noborder subpixel;
series x=age_measure y=measure / group=patientid colorresponse=dummy_var colormodel=(red gold green) lineattrs=(thickness=2);
xaxis display=(noline noticks nolabel) grid;
yaxis display=(noline noticks nolabel) grid;
run;
For an overview of creating and coloring groups in spaghetti plots, see "Create spaghetti plots in SAS".
I don't understand how the age and date variables are related, but my advice is to create binary variables 'passed1' and 'passed2' to indicate whether the patient has passed the dx1 and dx2 dates. You can then assign a value 0-3 to a grouping variable ('Color') according to whether the patient has passed none, dx1, dx2, or both dates. (Are dx1 and dx2 independent, or is dx1 < dx2? If dx1 < dx2, then there are only three possibilities.)
In the SGPLOT routine, use the GROUPLC= option to set the discrete colors of the line segments. This will work better than using COLORRESPONSE, which assumes a continuous response variable.
Here's some code to get you started:
data Want;
   set have;
   /* 1 if the diagnosis date is nonmissing and the visit is on or after it */
   passed1 = ^cmiss(dx1) & visit_date >= dx1;
   passed2 = ^cmiss(dx2) & visit_date >= dx2;
   /* 0 = neither passed, 1 = dx1 only, 2 = dx2 only, 3 = both */
   Color = 0;
   if passed1 & passed2 then Color = 3;
   else if passed2 then Color = 2;
   else if passed1 then Color = 1;
run;
proc sgplot data=Want noborder subpixel;
   series x=age_measure y=measure / group=patientid grouplc=Color lineattrs=(thickness=2);
   xaxis grid;
   yaxis grid;
run;
Sorry, I was wrong and did not test my code. The GROUPLC= option colors an ENTIRE curve, whereas you want to color segments within a curve.
You clearly copied your code from this blog post, and it would have saved us both time if you had referenced that fact. Follow the method in the blog post. Pay attention to the warning to "make sure the last point of the previous curve segment is replicated as the first point of the next segment." That will require some additional DATA step code. If you can't figure it out, then someone on the forum can help you.
Good luck!
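For anyone who gets stuck on the "additional DATA step code", here is a minimal, untested sketch of one way to replicate the last point of each segment. It assumes the Want data set and the Color variable from the earlier reply. The names Sorted, Segments, SegNum, and SegID are made up for illustration and do not come from the blog post. It also assumes that the piece of line joining two visits should take the color of the later visit; if you want the earlier color instead, the replicated point would have to go at the end of the previous segment.

```sas
/* Sketch only: split each patient's curve into segments with constant Color. */
proc sort data=Want out=Sorted;
   by patientid visit_date;
run;

data Segments;
   set Sorted;
   by patientid;
   length SegID $ 20;
   retain SegNum;
   /* Call LAG unconditionally so the queues advance on every observation */
   prevX = lag(age_measure);
   prevY = lag(measure);
   prevColor = lag(Color);
   if first.patientid then SegNum = 1;
   else if Color ne prevColor then do;
      /* Status changed: start a new segment whose first point repeats the  */
      /* previous visit. Only X and Y are copied; all other variables,      */
      /* including Color and visit_date, keep the current visit's values.   */
      SegNum + 1;
      SegID = catx('_', patientid, SegNum);
      curX = age_measure;
      curY = measure;
      age_measure = prevX;
      measure = prevY;
      output;
      age_measure = curX;
      measure = curY;
   end;
   SegID = catx('_', patientid, SegNum);
   output;
   drop prevX prevY prevColor curX curY;
run;

/* Each segment is its own group, and GROUPLC= colors it by status */
proc sgplot data=Segments noborder subpixel;
   series x=age_measure y=measure / group=SegID grouplc=Color
          lineattrs=(thickness=2 pattern=solid);
   xaxis grid;
   yaxis grid;
run;
```

Because every segment is a separate group, Color is constant within each group, which is the situation GROUPLC= is meant for. Check the output against a patient whose status changes (patient 1 in the test data) before relying on it.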
This ("graphics programming" on communities.sas.com) is "that forum". In general, this is a good place to post questions. And the more specific/concise the question, the easier it is for the community to pitch in and answer it. And sometimes preparing the dataset to plot is the most challenging part of the problem (I suspect that might be a factor with your question, although I haven't had a chance to study it in depth). 🙂
I noticed you posted a comment on the blog that Rick mentioned. I just thought I'd mention that the author of that blog post (Sanjay) has retired since writing the post, therefore there might not be anyone looking closely at your question there.