After reading Chuck Powell's interesting Slopegraphs and #R – A pleasant diversion, which graced the front page of R-bloggers.com recently, I just had to see if SAS ODS Graphics was also up to the challenge of creating a similar remix of Edward Tufte's famous cancer survival rate slopegraph. I grabbed some data from ET contest winner Pascal Schetelat's Python-based ET slopegraph project on GitHub, wrote the short SAS program below, and voila! I was able to tap into ODS Graphics' "curvelabel" options to automagically prevent label collisions. Btw, if you're unfamiliar with health study data (like me!), be sure to read the ET discussion group thread, which explains things like how survival rates can tick up over time.
OUTPUT
CODE
*==> A SAS ODS Graphics "Remix" of Edward Tufte's Famous Cancer Survival Slopegraph;
proc import datafile="/folders/myfolders/cancer_survival_rate.csv" dbms=csv out=rates replace; getnames=yes; guessingrows=max; /* Data is .csv */
proc sql; /* Use a little SQL to reshape the data from wide -> long for slopegraph */
create table rates2chart as
select cancer_type, 5 as year, _5_year as rate from rates union all
select cancer_type, 10 as year, _10_year as rate from rates union all
select cancer_type, 15 as year, _15_year as rate from rates union all
select cancer_type, 20 as year, _20_year as rate from rates;
quit; /* End PROC SQL explicitly */
ods graphics / width=11in height=17in antialias; /* Create 11"x17" slopegraph using SERIES (lines) and TEXT (rate values) plot statements */
proc sgplot data=rates2chart noautolegend noborder;
title height=12pt "Estimates of relative survival rates, by cancer site";
/* Repeat SERIES statement twice to get labels at beginning and end of series lines */
series x=year y=rate / group=cancer_type curvelabelpos=min curvelabel curvelabelloc=outside curvelabelattrs=(size=8pt color=black weight=bold) lineattrs=(pattern=solid thickness=1.5pt);
series x=year y=rate / group=cancer_type curvelabelpos=max curvelabel curvelabelloc=outside curvelabelattrs=(size=8pt color=black weight=bold) lineattrs=(pattern=solid thickness=1.5pt);
text x=year y=rate text=rate / strip textattrs=(size=8pt color=black weight=bold) backlight=0 backfill fillattrs=(color=white); /* Create whitespace surrounding rates */
text x=year y=rate text=rate / strip textattrs=(size=8pt color=black weight=bold); /* Repeat a second time without backfill to ensure rates aren't obscured by whitespace */
xaxis display=none type=discrete; yaxis display=none; /* Look Ma, no axes! */
refline 5 10 15 20 / axis=x label=('5 year' '10 year' '15 year' '20 year') labelattrs=(size=8pt color=black weight=bold) lineattrs=(thickness=0pt) labelloc=outside; /* Headings */
footnote height=8pt italic "Based on Edward Tufte discussion group thread at https://www.edwardtufte.com/bboard/q-and-a-fetch-msg?msg_id=0000Jr"; /* Credit (ideas) */
footnote2 height=8pt italic "Data sourced from https://github.com/pascal-schetelat/Slope"; /* Credit (data) */
run; /* End the PROC SGPLOT step explicitly */
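If SQL isn't your thing, the wide -> long reshape in the program above could probably also be done with a DATA step array. This is just a minimal sketch, not what the program above uses: the output name rates2chart_alt is made up for illustration, and it assumes PROC IMPORT produced the same numeric columns (_5_year, _10_year, _15_year, _20_year) that the PROC SQL step refers to.

```sas
/* Sketch only: alternative wide -> long reshape using a DATA step array.       */
/* Assumes the imported table RATES has the numeric columns named below.        */
data rates2chart_alt;                 /* hypothetical output table name */
  set rates;
  array r{4} _5_year _10_year _15_year _20_year;
  do i = 1 to dim(r);
    year = 5 * i;                     /* 5, 10, 15, 20 */
    rate = r{i};
    output;                           /* one row per cancer type per year */
  end;
  keep cancer_type year rate;
run;
```

The rows come out grouped by cancer type rather than by year, but within each cancer type the years are still in 5, 10, 15, 20 order, which is the order the SERIES lines connect them in.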
DATA
Cancer type,5 year,10 year,15 year,20 year
Prostate,99,95,87,81
Thyroid,96,96,94,95
Testis,95,94,91,88
Melanomas,89,87,84,83
Breast,86,78,71,65
Hodgkin's disease,85,80,74,67
"Corpus uteri, uterus",84,83,81,79
"Urinary, bladder",82,76,70,68
"Cervix, uteri",71,64,63,60
Larynx,69,57,46,38
Rectum,63,55,52,49
"Kidney, renal pelvis",62,54,50,47
Colon,62,55,54,52
Non-Hodgkin's,58,46,38,34
"Oral cavity, pharynx",57,44,38,33
Ovary,55,49,50,50
Leukemia,43,32,30,26
"Brain, nervous system",32,29,28,26
Multiple myeloma,30,13,7,5
Stomach,24,19,19,15
Lung and bronchus,15,11,8,6
Esophagus,14,8,8,5
"Liver, bile duct",8,6,6,8
Pancreas,4,3,3,3
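One thing to watch: the program relies on the header row above ("Cancer type", "5 year", ...) being imported as cancer_type, _5_year, and so on. How PROC IMPORT names the columns depends on the VALIDVARNAME system option, so if the PROC SQL step complains about missing columns, a quick check along these lines may help. Treat it as a sketch; setting VALIDVARNAME=V7 here is my assumption about the setting that yields the underscore-style names, not something the original program sets.

```sas
/* Sketch: list the variable names PROC IMPORT actually created.               */
options validvarname=v7;   /* assumption: V7 rules turn "5 year" into _5_year  */
proc import datafile="/folders/myfolders/cancer_survival_rate.csv" dbms=csv out=rates replace;
  getnames=yes;
  guessingrows=max;
run;
proc contents data=rates varnum;   /* VARNUM lists variables in column order */
run;
```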