Hi,
I'd like to create a graph like the one below. I have a data set with two variables. The x axis will show the percentiles of one variable and the y axis will show the percentiles of the other. I tried this in Excel but I couldn't draw the arrows. Is there a way to draw that fancy graph?
Thanks
This gets you a bit closer, but not all the way there. For the final 'finish' you can create the labels needed for your graph by using a format to map your values to specific labels. I'm not sure why the format isn't being applied here, but the approach is pretty straightforward.
*create sample data;
data demo;
  do i=1 to 100;
    paper_scores = rand('normal', 40, 5);
    online_scores = rand('normal', 60, 8);
    output;
  end;
run;

*get percentiles for graph;
ods select none;
proc means data=demo min p10 p20 p30 p40 p50 p60 p70 p80 p90 max stackods;
  var paper_scores online_scores;
  ods output summary = ss;
run;
ods select all;

*transpose to a different structure to support graphing;
proc transpose data=ss out=ss_long;
  id variable;
run;

data ss_long;
  set ss_long;
  *clean up statistics names;
  name = substr(_name_, 1, length(_name_) - 4);
  *rename values;
  if _label_ = 'Maximum' then percentile = 100;
  else if _label_ = 'Minimum' then percentile = 0;
  else percentile = input(compress(_label_, , 'kd'), 8.);
  *calculate arrow start and end for each plot;
  *You probably don't need this step but it just makes the next steps easier;
  arrow_up_start_x = paper_scores;
  arrow_up_start_y = 0;
  arrow_up_end_x = paper_scores;
  arrow_up_end_y = online_scores;
  arrow_left_start_x = paper_scores;
  arrow_left_start_y = online_scores;
  arrow_left_end_x = 0;
  arrow_left_end_y = online_scores;
run;

proc sort data=ss_long;
  by percentile name;
run;

*create formats to customize labels;
data x_format;
  set ss_long;
  start = paper_scores;
  label = percentile;
  type = 'N';
  fmtname = 'xaxis_fmt';
run;

data y_format;
  set ss_long;
  start = online_scores;
  label = percentile;
  type = 'N';
  fmtname = 'yaxis_fmt';
run;

proc format cntlin=x_format;
run;

proc format cntlin=y_format;
run;

*create graph;
proc sgplot data=ss_long noautolegend;
  spline x=paper_scores y=online_scores;
  vector x=arrow_up_end_x y=arrow_up_end_y / xorigin=arrow_up_start_x yorigin=arrow_up_start_y;
  vector x=arrow_left_end_x y=arrow_left_end_y / xorigin=arrow_left_start_x yorigin=arrow_left_start_y;
  xaxis label='Paper Scores' valuesformat=xaxis_fmt. min=0;
  yaxis label='Online Scores' valuesformat=yaxis_fmt. min=0;
run;
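One likely reason the formats above have no visible effect is that the axis picks its own round tick values, and those rarely coincide exactly with the percentile values the formats were built on. A possible workaround, sketched below and not tested against the original data, is to put the ticks at the percentile values with VALUES= and supply the text for each tick with VALUESDISPLAY=. The macro variable names (x_ticks, y_ticks, pct_labels) are made up for this illustration, and the sketch assumes ss_long was built as in the code above and that the percentile values are strictly increasing.

```sas
/* Collect tick positions and tick labels from ss_long.               */
/* x_ticks, y_ticks and pct_labels are hypothetical macro variables.  */
proc sql noprint;
  select paper_scores  format=best12.,
         online_scores format=best12.,
         quote(cats(percentile))
    into :x_ticks    separated by ' ',
         :y_ticks    separated by ' ',
         :pct_labels separated by ' '
    from ss_long
    order by percentile;
quit;

/* Same plot as above, but the ticks sit at the percentile values and */
/* show the percentile number instead of the score. The extra tick at */
/* 0 (with a blank label) keeps the arrow origins inside the axis.    */
proc sgplot data=ss_long noautolegend;
  spline x=paper_scores y=online_scores;
  vector x=arrow_up_end_x y=arrow_up_end_y /
         xorigin=arrow_up_start_x yorigin=arrow_up_start_y;
  vector x=arrow_left_end_x y=arrow_left_end_y /
         xorigin=arrow_left_start_x yorigin=arrow_left_start_y;
  xaxis label='Paper Scores (percentile)'
        values=(0 &x_ticks) valuesdisplay=(" " &pct_labels);
  yaxis label='Online Scores (percentile)'
        values=(0 &y_ticks) valuesdisplay=(" " &pct_labels);
run;
```

Because VALUESDISPLAY= pairs its strings with the VALUES= list by position, this does not depend on an exact numeric match between a tick value and a format range, which is where the format-based approach seems to break down.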
If you want just the lines, you can use DROPLINE in SGPLOT. For arrows, you will need VECTOR in SGPLOT. For the curve, use SPLINE.
data x;
  set sashelp.class(firstobs=16);
  x1=0;
run;

proc sgplot data=x;
  scatter x=weight y=height / markerattrs=(symbol=trianglefilled size=10 color=black);
  dropline x=weight y=height / dropto=x lineattrs=(color=black);
  scatter x=x1 y=height / markerattrs=(symbol=triangleleftfilled size=10 color=black);
  dropline x=weight y=height / dropto=y lineattrs=(color=black);
run;
Thank you @Ksharp ,
This code works, but it is based on the raw data. I'd like to draw the graph with percentiles (P5 P10 P20 P30 P40 P50). Is it possible to do that?
Hi @Ksharp again,
I calculated the percentiles in Excel and applied your code to the new file. Here is the graph. Is there a way to connect the arrows? There is some space between them.
Thanks
Add one more statement (the SERIES statement in the code below). Alternatively, try VECTOR or HIGHLOW.
proc sgplot data=x;
  series x=weight y=height;
  scatter x=weight y=height / markerattrs=(symbol=trianglefilled size=10 color=black);
  dropline x=weight y=height / dropto=x lineattrs=(color=black);
  scatter x=x1 y=height / markerattrs=(symbol=triangleleftfilled size=10 color=black);
  dropline x=weight y=height / dropto=y lineattrs=(color=black);
run;
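For the VECTOR route on percentiles rather than raw data, here is a rough sketch. It assumes sashelp.class as stand-in data and computes P5 through P50 with PROC UNIVARIATE; the data set names (pctl_wide, pctl_long), the prefixes (w_, h_), and the helper variable x0 are invented for this example. Since VECTOR draws the line and the arrowhead as one element, there should be no gap between them.

```sas
/* Percentiles of both variables, returned as one wide row. */
proc univariate data=sashelp.class noprint;
  var weight height;
  output out=pctl_wide pctlpts=5 10 20 30 40 50 pctlpre=w_ h_;
run;

/* One row per percentile so that each row drives one pair of arrows. */
data pctl_long;
  set pctl_wide;
  array w[6] w_5 w_10 w_20 w_30 w_40 w_50;
  array h[6] h_5 h_10 h_20 h_30 h_40 h_50;
  array p[6] _temporary_ (5 10 20 30 40 50);
  do i = 1 to 6;
    percentile = p[i];
    weight = w[i];
    height = h[i];
    x0 = 0;            /* x coordinate where the left-pointing arrow ends */
    output;
  end;
  keep percentile weight height x0;
run;

proc sgplot data=pctl_long noautolegend;
  /* curve through the percentile points */
  series x=weight y=height / lineattrs=(color=black);
  /* arrow from the x axis up to the curve */
  vector x=weight y=height / xorigin=weight yorigin=0
         arrowheadshape=filled lineattrs=(color=black);
  /* arrow from the curve left to the y axis */
  vector x=x0 y=height / xorigin=weight yorigin=height
         arrowheadshape=filled lineattrs=(color=black);
  xaxis min=0 offsetmin=0;
  yaxis min=0 offsetmin=0;
run;
```

If the percentiles were already calculated elsewhere (for example in Excel), the first two steps can be skipped as long as the imported table has one row per percentile with the two values in separate columns.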
Thank you so much @Ksharp. Here is how the new graph looks. The arrows look good now; the only remaining issue is the x and y axis value labels. As seen in the first post, the axis values there are '5th', '10th', etc., but my graph shows the exact values. If there is no way to change this, I will just provide this graph.
Use the VALUES= option of the XAXIS statement to display '5th' instead of 100.
proc sgplot data=x;
  scatter x=weight y=height / markerattrs=(symbol=trianglefilled size=10 color=black);
  dropline x=weight y=height / dropto=x lineattrs=(color=black);
  scatter x=x1 y=height / markerattrs=(symbol=triangleleftfilled size=10 color=black);
  dropline x=weight y=height / dropto=y lineattrs=(color=black);
  xaxis values=('5th' '10th' ............ ) ;
run;
An example (without the arrows) is shown in the article "Fit a distribution from quantiles," which includes the SAS code to create it.
As mentioned by others, you can use the VECTOR statement if arrows are essential.