I have data on three participants who have roughly the same time points for measurements of lactate (0,3,6,9,12,15,18,21,24). I currently have a line graph displaying each participant's values, but I want to add another "participant" that is the average of the whole group, so that it shows up as another line in the graph.
Here is how the data looks.
Name | Measurement | Value | Time_Point | Pre_Post | Group |
A | Lactate | . | -1 | Pre | 1 |
A | Lactate | 10.99 | 3 | Post | 1 |
A | Lactate | 4.34 | 5.5 | Post | 1 |
A | Lactate | 6.11 | 9 | Post | 1 |
A | Lactate | 1.15 | 12 | Post | 1 |
A | Lactate | . | 16 | Post | 1 |
A | Lactate | . | 21 | Post | 1 |
A | Lactate | 1.42 | 24 | Post | 1 |
B | Lactate | . | -1 | Pre | 1 |
B | Lactate | 11.86 | 3 | Post | 1 |
B | Lactate | 11.45 | 6 | Post | 1 |
B | Lactate | 7.16 | 9 | Post | 1 |
B | Lactate | 3 | 12 | Post | 1 |
C | Lactate | . | -1 | Pre | 1 |
C | Lactate | 10.99 | 3 | Post | 1 |
C | Lactate | 14.21 | 6.5 | Post | 1 |
C | Lactate | 6.47 | 9.5 | Post | 1 |
C | Lactate | 3.87 | 13 | Post | 1 |
C | Lactate | 3.08 | 15 | Post | 1 |
C | Lactate | 2.92 | 17.5 | Post | 1 |
C | Lactate | 5.79 | 20.5 | Post | 1 |
C | Lactate | 2.72 | 24 | Post | 1 |
I currently have this code to make the dataset above:
Data Name.group1_V_Lactate;
set work.venous_long;
where Group= 1 and Measurement= 'Lactate';
if time_point <0 then time_point=-1;
If 0 <= time_point <=3.5 then time_point=3;
else if 3< time_point =<6.5 then time_point=6;
else if 6< time_point =<9.5 then time_point=9;
else if 9< time_point =<12.5 then time_point=12;
else if 12< time_point =<15.5 then time_point=15;
else if 15< time_point =<18.5 then time_point=18;
else if 18< time_point =<21.5 then time_point=21;
else if 21< time_point =<24.5 then time_point=24;
run;
I am using the code below to produce the series plot, but I want to add a fourth name, the average of the three participants, so that it shows up in the graph. What is the best way to do this? I assume it would be done in the data step, using mean(of ...) to make a new variable, but I am not sure how to do this for each time point.
proc sgplot data=name.group1_V_Lactate;
Title "Group 1 Venous Lactate levels";
series x=time_point y=Value / group=name markers;
refline 0 / axis=x label= "Surgery";
xaxis values=(-2 -1 0 1 3 6 9 12 15 18 21 24) grid;
xaxis label = "Time of Collection Since Surgery (hours)";
yaxis min=0 max=15 values=(0,0.5,1.0,1.5,2.0,2.5,3.0,3.5,4.0,4.5,5.0,5.5,6.0,6.5,7.0,7.5,8.0,8.5,9.0,9.5,
10.0,10.5,11.0,11.5,12.0,12.5,13.0,13.5,14.0,14.5,15.0) grid;
yaxis label = "mmol/L";
run;
I don't think the MEAN() function works here: it only works if all the data values are in the same row, and that is not the case. You probably want to use PROC SUMMARY to compute the means and then add them back into the data set, something like this (UNTESTED CODE):
proc summary data=have nway;
class time_point;
var value;
output out=_stats_ mean=;
run;
data for_plot;
length name $ 4;
set have _stats_(in=in2);
if in2 then name='Mean';
run;
Then you should be able to run your PROC SGPLOT on the data set named FOR_PLOT.
If you want tested code, please provide data as SAS data step code (which you can type in yourself or create by following these instructions), not as screen captures.
You need to reconsider what this is supposed to do. Your ranges of values overlap. Is 3.2, for example, really supposed to be 3 or 6? 3.2 is larger than 3, so it also satisfies the second condition. To be proper, your ranges should not overlap, so there is no question as to intent.
If 0 <= time_point <=3.5 then time_point=3;
else if 3< time_point =<6.5 then time_point=6;
else if 6< time_point =<9.5 then time_point=9;
else if 9< time_point =<12.5 then time_point=12;
else if 12< time_point =<15.5 then time_point=15;
else if 15< time_point =<18.5 then time_point=18;
else if 18< time_point =<21.5 then time_point=21;
else if 21< time_point =<24.5 then time_point=24;
Here is my take (after adjusting the lower end of each of your time point ranges to match the previous upper bound):
data have;
input Name $ Measurement $ Value Time_Point Pre_Post $ Group;
datalines;
A Lactate . -1 Pre 1
A Lactate 10.99 3 Post 1
A Lactate 4.34 5.5 Post 1
A Lactate 6.11 9 Post 1
A Lactate 1.15 12 Post 1
A Lactate . 16 Post 1
A Lactate . 21 Post 1
A Lactate 1.42 24 Post 1
B Lactate . -1 Pre 1
B Lactate 11.86 3 Post 1
B Lactate 11.45 6 Post 1
B Lactate 7.16 9 Post 1
B Lactate 3 12 Post 1
C Lactate . -1 Pre 1
C Lactate 10.99 3 Post 1
C Lactate 14.21 6.5 Post 1
C Lactate 6.47 9.5 Post 1
C Lactate 3.87 13 Post 1
C Lactate 3.08 15 Post 1
C Lactate 2.92 17.5 Post 1
C Lactate 5.79 20.5 Post 1
C Lactate 2.72 24 Post 1
;
Proc format;
value timepoint
low - 0= '-1'
0 <- 3.5='3'
3.5<-6.5 ='6'
6.5<-9.5 ='9'
9.5<-12.5 ='12'
12.5<-15.5='15'
15.5<-18.5='18'
18.5<-21.5='21'
21.5<-24.5='24'
24.5 - high= '27'
;
run;
proc summary data=have ;
class name time_point;
format time_point timepoint. ;
var value;
output out=summary mean=;
run;
data toplot;
length name $ 8;
set summary ;
where _type_ in (1 3);
if _type_=1 then Name='Average';
run;
proc sgplot data=toplot;
series x=time_point y=Value / group=name markers;
refline 0 / axis=x label= "Surgery";
xaxis values=(-2 -1 0 1 3 6 9 12 15 18 21 24) grid;
xaxis label = "Time of Collection Since Surgery (hours)";
yaxis min=0 max=15 values=(0,0.5,1.0,1.5,2.0,2.5,3.0,3.5,4.0,4.5,5.0,5.5,6.0,6.5,7.0,7.5,8.0,8.5,9.0,9.5,
10.0,10.5,11.0,11.5,12.0,12.5,13.0,13.5,14.0,14.5,15.0) grid;
yaxis label = "mmol/L";
run;
The groups created by formats will work for analysis, reporting and almost any graphing task, and quite often the code is shorter than a bunch of if/then/else statements. Since you want an average of the values, you can use Proc Summary, which with CLASS variables creates summaries for all the combinations of the class variables. You then filter as desired on the _type_ variable to get the overall mean per time_point plus the name/time_point combinations. If you had other measurements and/or groups, you could add those variables to the CLASS statement in Proc Summary, such as "class measurement group name time_point;", instead of filtering the data. You would want to examine the resulting summary data set to select the desired values of the _TYPE_ variable, to ensure you get the desired plot results, and to set the logic that creates a name when it is missing. Then you could use BY Measurement Group; in the SGPLOT to create a graph for each combination of Measurement and Group. Some sorting might be needed, though typically the CLASS statement creates output in sorted order.
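One way to examine the _TYPE_ values described above is a quick frequency listing. This is only a sketch: it assumes the summary data set is named SUMMARY and the class variables are name and time_point, as in the code earlier in this post.

```sas
/* List which _TYPE_ values exist and whether Name is populated for each. */
/* With CLASS name time_point, the expected pattern is:                   */
/*   _TYPE_=0 overall, 1 by time_point, 2 by name, 3 by name*time_point   */
proc freq data=summary;
  tables _type_ * name / list missing;
run;
```

Rows where Name is missing and _TYPE_=1 are the per-time-point means that get relabeled as 'Average'.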
Note that if you wanted to see the effect of different ranges for the time point, you only need to change the format definition, or add a new format definition, and rerun the Proc Summary. This avoids the problems that come with rewriting logic and reassigning variable values.
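As an illustration of swapping the format, here is a minimal sketch. The format name timepoint6h, the 6-hour windows, and the output data set name summary6h are all made up for the example; only the HAVE data set and its variables come from the code above.

```sas
/* Hypothetical alternative binning: wider, non-overlapping windows */
proc format;
  value timepoint6h
    low   -  0    = '-1'
    0    <-  6.5  = '6'
    6.5  <- 12.5  = '12'
    12.5 <- 18.5  = '18'
    18.5 <- high  = '24'
  ;
run;

/* Same summary as before; only the FORMAT statement changes */
proc summary data=have;
  class name time_point;
  format time_point timepoint6h.;
  var value;
  output out=summary6h mean=;
run;
```

The source data is never modified, so the two binnings can be compared side by side.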
This is working well so far, but I have groups with four participants, and many of the plots' axes are not displaying properly.
I currently have the code below:
Data sheep.group3_V_Lactate;
set work.venous_long;
where Group= 3 and Measurement= 'Lactate';
run;
proc summary data=sheep.group3_V_Lactate;
class Sheep_name time_point;
format time_point time_point_FMT.;
var value;
output out=work.group3_V_Lactatesum mean=;
run;
Data Sheep.group3_V_Lactate2;
length Sheep_name $ 8;
set work.group3_V_Lactatesum;
where _type_ in(1,3);
if _type_ = 1 then Sheep_name='Average';
run;
proc sgplot data=sheep.group3_V_Lactate2;
Title "Group 3 Venous Lactate levels";
Series x=time_point y=Value / group=Sheep_name markers;
refline 0 / axis=x label= "Surgery";
xaxis min=-1 max=24 minor grid values= (-1 0 1 3 6 9 12 15 18 21 24);
xaxis label = "Time of Collection Since Surgery (hours)";
yaxis values=(0,0.5,1.0,1.5,2.0,2.5,3.0,3.5,4.0,4.5,5.0,5.5,6.0,6.5,7.0,7.5,8.0,8.5,9.0,9.5,
10.0,10.5,11.0,11.5,12.0,12.5,13.0,13.5,14.0,14.5,15.0) grid;
yaxis label = "mmol/L";
run;
Producing the attached graph:
Any thoughts on troubleshooting the graph?
Show the LOG from running your SGPLOT code. Copy the code plus all the messages from the log, open a text box on the forum with the </> icon, and paste the text.
Do not use two Xaxis or Yaxis statements. The first one will get overwritten by the second. So add the labels to the one with the tick values. That should fix the axis appearance.
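A sketch of what the combined statements could look like, using the data set and variable names from the code posted above. It is untested against the real data, and the `0 to 15 by 0.5` value list is just a shorter way of writing the same tick values.

```sas
proc sgplot data=sheep.group3_V_Lactate2;
  title "Group 3 Venous Lactate levels";
  series x=time_point y=Value / group=Sheep_name markers;
  refline 0 / axis=x label="Surgery";
  /* One XAXIS statement carries both the tick values and the label */
  xaxis values=(-1 0 1 3 6 9 12 15 18 21 24) grid
        label="Time of Collection Since Surgery (hours)";
  /* One YAXIS statement, same idea */
  yaxis values=(0 to 15 by 0.5) grid
        label="mmol/L";
run;
```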
The example data you showed us used Group=1 data. If your Time_points and/or value ranges are different in group=3 the data might require a different format for the Time_point.
Run this code and see what the frequencies look like:
proc freq data=sheep.group3_V_Lactate2;
tables time_point*value/missing list;
format time_point time_point_FMT.;
run;
If all of the Time_point values show as -1, then you need to show the LOG of the Proc Format code that you ran to create the time_point_fmt.