Data visualization with SAS programming

Can anyone reproduce this CDC chart for opioid overdose by state?

Occasional Contributor
Posts: 6

I am trying to reproduce (with my own data) the chart in the CDC report (Rudd RA et al., 2016) below, which shows the change in opioid overdose rate by state. This is not the way I would have chosen to display the data, but the authors did, so in the name of replication that’s what I am going to do too. I’ve never seen a change in rate displayed like this, so I don’t even know where to begin with Google or forum searches.

Image: https://goo.gl/images/fCm1sm
Article: https://www.cdc.gov/mmwr/volumes/65/wr/mm655051e1.htm

The main features would be: a y axis with states; an x axis with rates; two observations (rate10, rate15) displayed per y value (state), connected by a line; and year as the group variable, coded so that the 2010 rate is an open round cap and the 2015 rate is a filled round cap.

I’m not totally new to creating figures in SAS, but I’m not a pro, so it may help me understand an answer if I provide some fake data. Here it goes:

data staterates;
   input state $ rate year;
   /* 5.2 leaves room for values such as 56.34; 4.2 is too narrow for two decimals */
   format state $2. rate 5.2 year 4.;
   datalines;
AK 56.34 2010
AK 36.56 2015
IL 21.47 2010
IL 48.21 2015
CA 14.60 2010
CA 18.6 2015
;
run;

Thank you!

All Replies
SAS Super FREQ
Posts: 442

PROC SGPLOT could easily do this.  If you provided the full data set, someone could show you.  I might take a stab at it with some fake data.

SAS Super FREQ
Posts: 442

Posted in reply to WarrenKuhfeld

Had problems editing my response, so I reposted.

SAS Super FREQ
Posts: 442

Posted in reply to WarrenKuhfeld
data states;
   deaths2010 = 25 * uniform(151);
   deaths2015 = deaths2010 + 2 + normal(151);
   input State $1-15;
   datalines;
Alabama        
Alaska         
Arizona        
Arkansas       
California     
Colorado       
Connecticut    
Delaware       
Florida        
Georgia        
Hawaii         
Idaho          
Illinois       
Indiana        
Iowa           
Kansas         
Kentucky       
Louisiana      
Maine          
Maryland       
Massachusetts  
Michigan       
Minnesota      
Mississippi    
Missouri       
Montana        
Nebraska       
Nevada         
New Hampshire  
New Jersey     
New Mexico     
New York       
North Carolina 
North Dakota   
Ohio           
Oklahoma       
Oregon         
Pennsylvania   
Rhode Island   
South Carolina 
South Dakota   
Tennessee      
Texas          
Utah           
Vermont        
Virginia       
Washington     
West Virginia  
Wisconsin      
Wyoming        
;

proc sort;
   by deaths2015;
run;

ods html body='b.html' style=htmlblue image_dpi=300;
ods graphics on / height=12in width=5in;
proc sgplot nocycleattrs noautolegend;
   title;
   highlow y=state low=deaths2010 high=deaths2015;   
   scatter y=state x=deaths2015 / markerattrs=GraphDataDefault(symbol=circlefilled)
                                  name='a' legendlabel='2015';
   scatter y=state x=deaths2010 / markerattrs=(symbol=circlefilled)
                                  filledoutlinedmarkers name='b' legendlabel='2010'
                                  markerfillattrs=(color=white)
                                  markeroutlineattrs=GraphDataDefault;
   xaxis label='Deaths per 100,000 population';
   keylegend 'a' 'b'/ location=inside across=1 position=bottomright noborder;
run;
ods html close;
SAS Super FREQ
Posts: 442

Posted in reply to WarrenKuhfeld

Here is a revised solution with a change to the artificial data to more clearly show a few points where the rate has gone down.  This one plots high/low variables instead of the original.

data states;
   deaths2010 = 25 * uniform(151);
   deaths2015 = deaths2010 + 5 * (uniform(151) > 0.2) + normal(151);
   low  = min(deaths2010, deaths2015);
   high = max(deaths2010, deaths2015);
   input State $1-15;
   datalines;
Alabama        
Alaska         
Arizona        
Arkansas       
California     
Colorado       
Connecticut    
Delaware       
Florida        
Georgia        
Hawaii         
Idaho          
Illinois       
Indiana        
Iowa           
Kansas         
Kentucky       
Louisiana      
Maine          
Maryland       
Massachusetts  
Michigan       
Minnesota      
Mississippi    
Missouri       
Montana        
Nebraska       
Nevada         
New Hampshire  
New Jersey     
New Mexico     
New York       
North Carolina 
North Dakota   
Ohio           
Oklahoma       
Oregon         
Pennsylvania   
Rhode Island   
South Carolina 
South Dakota   
Tennessee      
Texas          
Utah           
Vermont        
Virginia       
Washington     
West Virginia  
Wisconsin      
Wyoming        
;

proc sort; by deaths2015; run;

ods html body='b.html' style=htmlblue image_dpi=300;
ods graphics on / height=12in width=5in;
proc sgplot nocycleattrs noautolegend;
   title;
   highlow y=state low=low high=high;   
   scatter y=state x=deaths2015 / markerattrs=GraphDataDefault(symbol=circlefilled)
                                  name='a' legendlabel='2015';
   scatter y=state x=deaths2010 / markerattrs=(symbol=circlefilled)
                                  filledoutlinedmarkers name='b' legendlabel='2010'
                                  markerfillattrs=(color=white)
                                  markeroutlineattrs=GraphDataDefault;
   xaxis label='Deaths per 100,000 population';
   keylegend 'a' 'b'/ location=inside across=1 position=bottomright noborder;
run;
ods html close;
Occasional Contributor
Posts: 6

Thank you for your response. My hunch is also that this is something SAS could do easily. Your response is great, but unfortunately not quite what I need. I should have mentioned that I turned to the forum AFTER trying a HIGHLOW plot without success. The problem is that opioid rates did not increase in every state: some 2010 points are higher than 2015. In the cases where 2010 > 2015, no line is drawn between the end caps. Is there any way to fix this? Note that my fake data for 3 states might be helpful here, as 1 case has 2010 > 2015. Thanks so much for taking a stab at this.
SAS Super FREQ
Posts: 1,199

This is a known issue with HIGHLOW: High must be greater than Low. You can work around this by creating two new variables, "High" and "Low", and putting the lower value in "Low" and the higher value in "High". Then use these variables in the graph. An alternative is to skip HIGHLOW and use the SCATTER plot's own X error bars to draw the line.

Occasional Contributor
Posts: 6

Posted in reply to Sanjay_SAS

The problem with creating variables as you’ve described above (recoding any lower 2015 obs as the low variable and any higher 2010 obs as the high variable) is that I don’t know how to do this without losing the color coding specific to 2010 and 2015. The 2015 marker should be dark blue regardless of where it sits on the chart. If you have a suggestion for this, or can share some specific code that draws the line via the X error bars in SCATTER, I’d be very grateful!

Solution
4 weeks ago
SAS Super FREQ
Posts: 1,199

Using Warren's code as an example, you could add XERRORLOWER=deaths2010 to the first scatter plot as follows. It does not matter whether the value of deaths2010 is lower or higher than deaths2015.

 

proc sgplot nocycleattrs noautolegend;
   title;
   scatter y=state x=deaths2015 / markerattrs=GraphDataDefault(symbol=circlefilled)
                                  name='a' legendlabel='2015' xerrorlower=deaths2010;
   scatter y=state x=deaths2010 / markerattrs=(symbol=circlefilled)
                                  filledoutlinedmarkers name='b' legendlabel='2010'
                                  markerfillattrs=(color=white)
                                  markeroutlineattrs=GraphDataDefault;
   xaxis label='Deaths per 100,000 population';
   keylegend 'a' 'b'/ location=inside across=1 position=bottomright noborder;
run;

 

For better control of the colors, you can use 3 separate SCATTER plots to display the line and the two markers. Use one SCATTER plot with the XERRORLOWER and XERRORUPPER options set to the two variables, and set MARKERATTRS=(SIZE=0) to prevent display of any marker. Then use the other two SCATTER plots to display the 2010 and 2015 values with the right colors.

SAS Super FREQ
Posts: 1,199

Posted in reply to Sanjay_SAS

Here is a sample with 3 scatter plots.

 

proc sgplot data=sashelp.cars(where=(type='Hybrid'));
  scatter y=model x=mpg_city / xerrorlower=mpg_city xerrorupper=mpg_highway markerattrs=(size=0);
  scatter y=model x=mpg_city / markerattrs=(symbol=circlefilled size=10 color=blue) legendlabel='City';
  scatter y=model x=mpg_highway / markerattrs=(symbol=circlefilled size=10 color=red) legendlabel='Highway';
run;

Note, values for "Prius" are reversed.
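
Applied to the original question, the same three-plot idea might look like the sketch below. It assumes the asker's long-format STATERATES data set (one row per state and year). The data set names SORTED and WIDE, the variables rate2010 and rate2015, the plot names 'r10' and 'r15', and the navy color are all made up for illustration and are not taken from the CDC chart.

```sas
/* Reshape the long data (state, rate, year) to one row per state. */
proc sort data=staterates out=sorted;
   by state year;
run;

proc transpose data=sorted out=wide(drop=_name_) prefix=rate;
   by state;
   id year;    /* with PREFIX=rate this creates rate2010 and rate2015 */
   var rate;
run;

proc sgplot data=wide noautolegend;
   /* Connector only: the X error bar spans rate2010 to rate2015,  */
   /* and SIZE=0 hides the marker of this first plot.              */
   scatter y=state x=rate2010 / xerrorlower=rate2010 xerrorupper=rate2015
                                markerattrs=(size=0)
                                errorbarattrs=(color=navy);
   /* 2010: open circle (white fill, colored outline). */
   scatter y=state x=rate2010 / name='r10' legendlabel='2010'
                                markerattrs=(symbol=circlefilled size=9)
                                filledoutlinedmarkers
                                markerfillattrs=(color=white)
                                markeroutlineattrs=(color=navy);
   /* 2015: filled circle, same color wherever it falls on the axis. */
   scatter y=state x=rate2015 / name='r15' legendlabel='2015'
                                markerattrs=(symbol=circlefilled size=9 color=navy);
   xaxis label='Deaths per 100,000 population';
   yaxis display=(nolabel);
   keylegend 'r10' 'r15' / location=inside across=1 position=bottomright noborder;
run;
```

Because each year keeps its own SCATTER statement, the 2015 marker stays dark regardless of whether the rate went up or down, which is the color-coding concern raised earlier in the thread.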

Occasional Contributor
Posts: 6

Posted in reply to Sanjay_SAS

Thank you,  my friend! This worked excellently. I will try with the 3 scatter plots as well for better color options. 

Super User
Posts: 11,779

Trp1 wrote:
Thank you for your response. My hunch is also that this is something SAS could do easily. And your response is so great, but unfortunately not quite what I need. I should have mentioned that I turned to the forum AFTER trying a hilow plot without success. The problem is that opioid rates did not increase in every state. some 2010 points are higher than 2015. In the cases where 2010>2015 there is no line drawn between the end caps. Is there any way to fix this? Note if you use my 3 obs of fake data it might be helpful as 1 case has 2010>2015. Thanks so much for taking a stab at this.

Are you using different data than the example graph? Note that in the example you linked to, there are a couple of points where 2010 > 2015, but the difference is so small that the points overlap, so you cannot tell whether a line was drawn between the two.

 

Is the purpose of the graph to show the connection between the years or the increase/decrease between the years? I find that the lack of a line between the two makes me quickly notice that something is different for those pairings (if the majority show an increase).

 

 

Occasional Contributor
Posts: 6

Yes, I’m using different data than in the graph, and I definitely have obs where the decrease is not so small that the points overlap. So this is a problem for me. I couldn’t even change the scale to get around it.

 

I personally don’t mind that the lines don’t show up for the obs where the change is negative. In fact, I might prefer this without connecting lines at all. As I said, I would have displayed this data differently. But the point is replication, so my goal is to get a chart that is formatted like the CDC’s as closely as possible.

 

Thanks so much for your thoughts. 

SAS Super FREQ
Posts: 442

Sorry.  I posted it in a bit of a hurry.  @Sanjay_SAS told you the easy fix.

☑ This topic is solved.
