Dear all,
I am working with some longitudinal data, where I am making a spaghetti-plot for each patient.
I need to "mark" the occurrence of some dates in the plot. I thought about doing that with a dummy variable and the colorresponse option, but I can't get it to work.
Below is a test dataset containing patientid, visit_date, measurement, age at measurement, date of diagnosis 1 (dx1) and date of diagnosis 2 (dx2). The desired output is a spaghetti plot with patient id as the group, where the color changes when the visit date passes the diagnosis date.
In the code below the dummy variable only accounts for diagnosis 1. It would be a great help if you could show me how to incorporate the date of diagnosis 2 as well.
Thanks 🙂
data patients;
input patientid $ (dx1 dx2) (:yymmdd8.);
format dx1 dx2 yymmdd10.;
datalines;
1 20180101 20180101
2 20170205 .
3 20170221 20170225
4 20180101 20180202
5 . 20180503
;
data visits;
input patientid $ visit_date :yymmdd8. measure age_measure;
format visit_date yymmdd10.;
datalines;
1 20180101 3.4 10
1 20180505 2.3 15
2 20170210 7.3 20
2 20170217 7.2 25
2 20170220 7.1 30
3 20170221 5.4 35
4 20180202 3.4 23
4 20180204 3.2 25
5 20180504 5.6 30
5 20180505 5.0 32
;
data have;
merge
patients
visits
;
by patientid;
run;
/* The plot */
data have;
set have;
If .<dx1 ge age_measurement
then dummy_var = 2;
else
If .<dx1 LT age_measurement then dummy_var=1;
else dummy_var=.;
run;
proc sgplot data=have noborder subpixel;
series x=age_measurement y=measurement / group=patientid colorresponse=dummy_var colormodel=(red gold green) lineattrs=(thickness=2);
xaxis display=(noline noticks nolabel) grid;
yaxis display=(noline noticks nolabel) grid;
run;
First issue is that your last data step uses a variable that doesn't exist:
data have;
set have;
If .<dx1 ge age_measurement /* the previously defined variable is age_measure */
then dummy_var = 2;
else
If .<dx1 LT age_measurement then dummy_var=1;
else dummy_var=.;
run;
If you fix the variable name and run
data have;
set have;
If .<dx1 ge age_measure
then dummy_var = 2;
else
If .<dx1 LT age_measure then dummy_var=1;
else dummy_var=.;
run;
then, at least with the example data, dummy_var has only one distinct non-missing value, which gives colorresponse nothing to vary over. Also, the colorresponse option means that group will be ignored.
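A quick way to check this for yourself, assuming the corrected data step above has been run, is a frequency table that also counts the missing values:

```sas
/* Count each value of dummy_var in the merged data, including missing */
proc freq data=have;
  tables dummy_var / missing;
run;
```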
You do know that ".<dx1" in the code is returning values of 0 and 1, don't you? So everything is "less than" your shown age_measure (or age_measurement, after you get the names straight) in the example.
Plus you have two variables in the series statement that do not exist in the example data.
Can you provide an image similar to what you expect to see?
Dear Ballardw,
Something like this, where red = before dx1, gold = after dx1 (before dx2), green = after dx2.
Your patientid 1 has dx1 and dx2 on the same date. So what should happen?
Are you sure that the x axis shouldn't be a date axis?
Or is "age_measure" supposed to mean "age at measurement"? If so then you need to convert the dates of dx1 and dx2 to age_measure equivalents (somehow) and compare them directly to the age_measure values.
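As a rough sketch of one alternative: since visit_date, dx1 and dx2 are all SAS dates in the merged data, the phase of each visit can be derived by comparing dates directly, which avoids the conversion to ages. The names have_phase and phase are made up for this illustration. The sketch assumes that a visit on the diagnosis date counts as "after", and that a patient with no dx1 is labelled "before dx1" until dx2. Note that it does not change the line color along a patient's line. It swaps in a simpler technique: one gray line per patient with the markers colored by phase, using the default group colors of the active style rather than red, gold and green.

```sas
/* Sketch only: have_phase and phase are hypothetical names, not from the thread */
data have_phase;
  set have;
  length phase $ 10;
  /* Check dx2 first so a visit on or after both dates gets the later diagnosis */
  if not missing(dx2) and visit_date >= dx2 then phase = 'after dx2';
  else if not missing(dx1) and visit_date >= dx1 then phase = 'after dx1';
  else phase = 'before dx1';
run;

proc sgplot data=have_phase;
  /* One gray line per patient; the fixed color overrides the group colors */
  series x=age_measure y=measure / group=patientid
         lineattrs=(color=gray pattern=solid thickness=2);
  /* Markers colored by phase; only this plot feeds the legend */
  scatter x=age_measure y=measure / group=phase
          markerattrs=(symbol=circlefilled size=9) name='phase';
  keylegend 'phase';
  xaxis grid;
  yaxis grid;
run;
```

If the comparison has to be made on the age scale instead, the same IF/ELSE logic should carry over once dx1 and dx2 have been converted to ages, with age_measure in place of visit_date.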
Thanks for pointing that mistake out! Patients can't have dx1 and dx2 on the same date; patient 2 has dx2 20181001.
Hmm, yes, if it is possible. Age_measure is indeed age at measurement.
Okay, so if I convert the dates of dx1 and dx2 to age_measure equivalents (I can do that), what will my PROC SGPLOT code then look like?