I have seen the following graph in this paper: https://doi.org/10.1093/ejcts/ezy167
It shows the distribution of propensity scores for two groups, highlighting the patients that have been matched using the score.
Imagine that I have a data set DATA with three variables:
How could I replicate this graph using PROC SGPLOT?
This type of graph is sometimes called a butterfly plot.
Your example is slightly more complicated than the one in the blog post, "A butterfly plot for comparing distributions,"
but I think that article is a good place to start.
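As a rough illustration of the butterfly idea (not the code from that article), here is a minimal sketch. It bins simulated scores by hand, negates the counts for one group, and overlays two VBARPARM statements. The data set and variable names (SIM, BUTTERFLY, BIN, N_OFF, N_ON), the format name ABSFMT, and the bin width of 0.05 are all assumptions made for the example. It does not highlight the matched patients.

```sas
/* Simulated data; all names here are hypothetical. */
data sim;
  call streaminit(1);
  length group $8;               /* long enough for 'OFF-PUMP' */
  do group = 'ON-PUMP', 'OFF-PUMP';
    do id = 1 to 200;
      ps  = rand('uniform');
      /* Midpoint of the width-0.05 bin that contains PS */
      bin = round(min(floor(ps / 0.05), 19) * 0.05 + 0.025, 0.001);
      output;
    end;
  end;
run;

/* Count per bin; ON-PUMP counts are negated so the bars point down. */
proc sql;
  create table butterfly as
  select bin,
         sum(group = 'OFF-PUMP')  as n_off,
         -sum(group = 'ON-PUMP')  as n_on
  from sim
  group by bin;
quit;

/* Picture format without a prefix: negative values print unsigned. */
proc format;
  picture absfmt low-high = '0009';
run;

proc sgplot data=butterfly;
  format n_off n_on absfmt.;
  vbarparm category=bin response=n_off / barwidth=1
           fillattrs=(color=lightgreen) legendlabel='OFF-PUMP';
  vbarparm category=bin response=n_on  / barwidth=1
           fillattrs=(color=darkgreen)  legendlabel='ON-PUMP';
  refline 0 / axis=y;
  xaxis type=linear values=(0 to 1 by 0.2) label='Propensity score';
  yaxis label='Number of patients';
run;
```

Binning in a DATA step keeps the variable names under your control, at the cost of choosing the bin width yourself instead of letting the HISTOGRAM statement pick it.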
A quick Google search finds an answer:
https://blogs.sas.com/content/graphicallyspeaking/2013/11/21/comparative-histograms/
Other solutions can be found the same way.
Check Rick Wicklin's blog:
Overlay a curve on a histogram in SAS - The DO Loop
/*
It would be better if you posted some real data.
The simulated data below stand in for it.
*/
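data have;
call streaminit(123);
/* Without LENGTH, GROUP takes length 7 from 'ON-PUMP' and 'OFF-PUMP' is truncated */
length group group2 $8;
do group='ON-PUMP','OFF-PUMP';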
do id=1 to 200;
matched=rand('bern',0.6);
ps=rand('uniform');
group2=group;
output;
end;
end;
run;
/* Suppress the graphs; only the binned counts written by ODS OUTPUT are needed */
ods select none;
proc sgplot data=have;
histogram ps/group=group scale=count nbins=20;
ods output sgplot=sgplot1;
run;
proc sgplot data=have(where=(matched=1));
histogram ps/group=group2 scale=count nbins=20;
ods output sgplot=sgplot2;
run;
ods select all;
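/* Negate the ON-PUMP counts so those bars point downward.
   The long variable names are generated by ODS and may differ in your SAS release. */
data sgplot1;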
set sgplot1;
if BIN_PS_GROUP_GROUP_SCALE_cou__GP='ON-PUMP' then
BIN_PS_GROUP_GROUP_SCALE_cou___Y=-BIN_PS_GROUP_GROUP_SCALE_cou___Y;
keep BIN_PS_GROUP_GROUP_SCALE_cou__GP BIN_PS_GROUP_GROUP_SCALE_cou___Y BIN_PS_GROUP_GROUP_SCALE_cou___X;
run;
data sgplot2;
set sgplot2;
if BIN_PS_GROUP_GROUP2_SCALE_co__GP='ON-PUMP' then
BIN_PS_GROUP_GROUP2_SCALE_co___Y=-BIN_PS_GROUP_GROUP2_SCALE_co___Y;
keep BIN_PS_GROUP_GROUP2_SCALE_co__GP BIN_PS_GROUP_GROUP2_SCALE_co___Y BIN_PS_GROUP_GROUP2_SCALE_co___X;
run;
/* One-to-one merge (no BY statement): rows are paired by position */
data want;
merge sgplot1 sgplot2;
run;
/* Picture format that prints the negated counts without a minus sign */
proc format;
picture fmt
low-0='00009';
run;
proc sgplot data=want noautolegend;
format BIN_PS_GROUP_GROUP_SCALE_cou___Y fmt.;
styleattrs datacolors=( darkgreen lightgreen) AXISEXTENT=data;
/*styleattrs datacolors=( CX006837 CXC2E699) AXISEXTENT=data;*/
vbarparm category=BIN_PS_GROUP_GROUP_SCALE_cou___X
response=BIN_PS_GROUP_GROUP_SCALE_cou___Y /
barwidth=1 group=BIN_PS_GROUP_GROUP_SCALE_cou__GP outlineattrs=(color=grey) nofill;
inset 'OFF-PUMP'/textattrs=(color=lightgreen) position=top;
vbarparm category=BIN_PS_GROUP_GROUP2_SCALE_co___X
response=BIN_PS_GROUP_GROUP2_SCALE_co___Y /
barwidth=1 group=BIN_PS_GROUP_GROUP_SCALE_cou__GP outlineattrs=(color=grey);
inset 'ON-PUMP'/textattrs=(color=darkgreen) position=bottom;
xaxis values=(0 to 1 by 0.2) type=linear display=(nolabel) offsetmin=0.05 offsetmax=0.05;
yaxis label='Number of subjects' offsetmin=0.1 offsetmax=0.1;
run;