Hi All:
I am trying to create a graph with a series of boxplots, where each boxplot shows the distribution of scores on a given "test". I also want to add a line over the boxplots to show how one group/person scored on each test. I have attached a picture of what I am trying to do (I drew the line freehand in paint :)). How do I combine two graphs using the same data? I know it has something to do with the overlay option, but I am not sure how it works. Does anyone have a simple set of code they could share that puts both a line and boxplots in the same graph? Thanks!
Take care,
Jeanine
What version of SAS are you using?
9.4 TS Level 1M3
X64_8Pro Platform
Overlay a SERIES plot on your VBOX to draw the line. Something like this:
proc sgplot data=whatever;
vbox newy / category=group;
series x=group y=newy;
run;
Hope this helps!
Dan
Thanks but this is what I get when I try that code:
proc sgplot data = DO;
vbox PE_Value/ category= group;
series x=group y =PE_Value;
run;
Should I use Gplot instead?
I found something on the SAS webpage http://support.sas.com/kb/46/719.html
but I don't know how to add the line and I can't get the labels to work:
data anno;
set all;
by group;
length function color text $8;
retain color 'black' when 'a' position '6' size 1.5;
if first.group then do;
function='move'; xsys='2'; ysys='2';
x=group; y=mean; output;
function='cntl2txt'; output;
function='label'; xsys='A'; x=+3;
text=trim(left(put(mean,8.01))); output;
function='move'; xsys='2'; ysys='2';
x=group; y=median; output;
function='cntl2txt'; output;
function='label'; xsys='A'; x=+3;
text=trim(left(put(median,8.01))); output;
end;
run;
/* Reshape the data set to create a category */
/* variable to use to generate the legend */
data reshape;
set all;
length zvar $8;
newy=PE_Value; zvar='Box'; output;
newy=mean; zvar='Mean'; output;
newy=median; zvar='Median'; output;
run;
/* Define a title for the graph */
title1 "Display Mean and Median Values on a Box Plot";
/* Define SYMBOL characteristics */
symbol1 interpol=boxft cv=LightSeaGreen co=black value=none bwidth=9 mode=include;
symbol2 interpol=none color=depk height=1.8 value=dot;
symbol3 interpol=none color=mob height=1.8 value=diamondfilled;
/* Define legend characteristics */
legend1 order=('Mean' 'Median') repeat=1
label=none frame shape=symbol(1.8,1.8);
/* Create the graph */
axis2 minor=none offset=(5pct,5pct);
proc sort data = reshape;
by name;
proc gplot data=reshape;
plot newy*group=zvar / legend=legend1 overlay
haxis=axis2
annotate=anno;
run;
quit;
The code I gave you assumed that the data for "Your Score" was one value per test. The SERIES plot should not have the same variables as the data used to compute the boxes (that's why you have the scrambled lines). Do you have separate columns (maybe in another dataset) that contain the "your score" values? If so, you can merge the two data sets into one data set and use the code I gave you, changing the SERIES variables to the correct variables.
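To illustrate the point about giving the SERIES plot its own column, here is a minimal sketch. It assumes the DO data set has one row per person and test and also carries an identifier column, called id here (a hypothetical name), and that 11 is the id to highlight. The DO_line data set and the your_score column are illustrative names, not part of the original code.

```sas
/* Assumption: DO has columns id (hypothetical), group and PE_Value. */
/* Copy the score into its own column for the one id to highlight;   */
/* the column stays missing for every other row.                     */
data DO_line;
  set DO;
  if id = 11 then your_score = PE_Value;  /* 11 is an illustrative id */
  else your_score = .;
run;

proc sgplot data=DO_line;
  vbox PE_Value / category=group;         /* boxes are computed from all rows */
  series x=group y=your_score / markers;  /* line uses only the highlighted id */
run;
```

The SERIES line connects points in data order, so the highlighted rows may need to be sorted by test first. Rows where your_score is missing should simply be skipped when the line is drawn.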
You may find @Jay54's blog post provides some tips... http://blogs.sas.com/content/graphicallyspeaking/2016/01/20/easy-box-plot-with-connect/
Thanks guys!
I was able to get this graph by using this code:
title1 "Comparison of Your Score for Each Test";
data DO;
set table_fortest;
Your_Score=PE_Value;
/* Label only the rows for the id whose line is drawn */
if id = 11 and per = 1 then person = 'test 1';
if id = 11 and per = 2 then person = 'test 2';
if id = 11 and per = 3 then person = 'test 3';
if id = 11 and per = 4 then person = 'test 4';
if id = 11 and per = 5 then person = 'test 5';
run;
proc sort data = DO;
by PERSON;
run;
proc sgplot data = DO;
vbox PE_Value/ category= group;
series x=PERSON y =your_score;
run;
What I want to do is run this through every id so I can export a graph for each person/group without having to do it one at a time.
I think I need to think about how I have my data structured.
So the data looks like this (actually these are made up ids and values but the structure is the same):
id | per | PE_value | group  | Your_Score | person |
11 | 1   | 0.73     | test 1 | 0.73       | test 1 |
11 | 2   | 0.61     | test 2 | 0.61       | test 2 |
11 | 3   | 0.61     | test 3 | 0.61       | test 3 |
11 | 4   | 0.6      | test 4 | 0.6        | test 4 |
11 | 5   | 0.71     | test 5 | 0.71       | test 5 |
12 | 1   | 0.71     | test 1 | 0.71       |        |
12 | 2   | 0.68     | test 2 | 0.68       |        |
12 | 3   | 0.56     | test 3 | 0.56       |        |
12 | 4   | 0.5      | test 4 | 0.5        |        |
12 | 5   | 0.5      | test 5 | 0.5        |        |
13 | 1   | .7       | test 1 | .7         |        |
and so on.
I think I need to play around with putting this in a do loop and restructuring the data but thanks again for the code!
If/when I figure out the rest, I'll post it for anyone else who is trying to do this.
Thanks again!
J9
What I want to do is run this through every id so I can export a graph for each person/group without having to do it one at a time
Add
BY Person;
to the SGPLOT code.
Nope, thanks, but that doesn't work. If I do that, it produces a separate graph for each boxplot, because the "person" variable is Test 1, Test 2, Test 3, Test 4, Test 5 for each id.
I need my code to aggregate the data across all ids to create the boxplots and still be able to graph a separate line for each person.
Again, I think I need to re-think how my data is structured. Will play with it this weekend.
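One possible way to get a graph per id without restructuring the data by hand is a macro loop, sketched below. This is only a sketch under stated assumptions: table_fortest has the columns id, per, group and PE_Value shown above, id is numeric, and the macro name plot_each_id, the work data set _plot and the image name prefix are all made up for illustration.

```sas
%macro plot_each_id(data=table_fortest);
  %local idlist i this_id;

  /* Collect the distinct ids into a space-separated macro variable */
  proc sql noprint;
    select distinct id into :idlist separated by ' '
    from &data;
  quit;

  %do i = 1 %to %sysfunc(countw(&idlist, %str( )));
    %let this_id = %scan(&idlist, &i, %str( ));

    /* Keep every row for the boxes; fill your_score only for this id */
    data _plot;
      set &data;
      if id = &this_id then your_score = PE_Value;
      else your_score = .;
    run;

    /* Sort by test so the line connects the points in test order */
    proc sort data=_plot;
      by per;
    run;

    /* Illustrative image name; where the file lands depends on your ODS setup */
    ods graphics / reset=index imagename="score_id_&this_id";
    title1 "Comparison of Your Score for Each Test: id &this_id";

    proc sgplot data=_plot;
      vbox PE_Value / category=group;         /* same boxes in every graph */
      series x=group y=your_score / markers;  /* line for the current id   */
    run;
  %end;

  title1;
%mend plot_each_id;

%plot_each_id(data=table_fortest);
```

The idea is that the boxes are always computed from all rows, while the line column is non-missing only for the id currently being plotted, so a BY statement is not needed.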