PROC SGPLOT could easily do this. If you provided the full data set, someone could show you. I might take a stab at it with some fake data.
Had problems editing my response, so I reposted.
data states;
deaths2010 = 25 * uniform(151);
deaths2015 = deaths2010 + 2 + normal(151);
input State $1-15;
datalines;
Alabama
Alaska
Arizona
Arkansas
California
Colorado
Connecticut
Delaware
Florida
Georgia
Hawaii
Idaho
Illinois
Indiana
Iowa
Kansas
Kentucky
Louisiana
Maine
Maryland
Massachusetts
Michigan
Minnesota
Mississippi
Missouri
Montana
Nebraska
Nevada
New Hampshire
New Jersey
New Mexico
New York
North Carolina
North Dakota
Ohio
Oklahoma
Oregon
Pennsylvania
Rhode Island
South Carolina
South Dakota
Tennessee
Texas
Utah
Vermont
Virginia
Washington
West Virginia
Wisconsin
Wyoming
;
proc sort;
by deaths2015;
run;
ods html body='b.html' style=htmlblue image_dpi=300;
ods graphics on / height=12in width=5in;
proc sgplot nocycleattrs noautolegend;
title;
highlow y=state low=deaths2010 high=deaths2015;
scatter y=state x=deaths2015 / markerattrs=GraphDataDefault(symbol=circlefilled)
name='a' legendlabel='2015';
scatter y=state x=deaths2010 / markerattrs=(symbol=circlefilled)
filledoutlinedmarkers name='b' legendlabel='2010'
markerfillattrs=(color=white)
markeroutlineattrs=GraphDataDefault;
xaxis label='Deaths per 100,000 population';
keylegend 'a' 'b'/ location=inside across=1 position=bottomright noborder;
quit;
ods html close;
Here is a revised solution with a change to the artificial data to more clearly show a few points where the rate has gone down. This one plots high/low variables instead of the original.
data states;
deaths2010 = 25 * uniform(151);
deaths2015 = deaths2010 + 5 * (uniform(151) > 0.2) + normal(151);
low = min(deaths2010, deaths2015);
high = max(deaths2010, deaths2015);
input State $1-15;
datalines;
Alabama
Alaska
Arizona
Arkansas
California
Colorado
Connecticut
Delaware
Florida
Georgia
Hawaii
Idaho
Illinois
Indiana
Iowa
Kansas
Kentucky
Louisiana
Maine
Maryland
Massachusetts
Michigan
Minnesota
Mississippi
Missouri
Montana
Nebraska
Nevada
New Hampshire
New Jersey
New Mexico
New York
North Carolina
North Dakota
Ohio
Oklahoma
Oregon
Pennsylvania
Rhode Island
South Carolina
South Dakota
Tennessee
Texas
Utah
Vermont
Virginia
Washington
West Virginia
Wisconsin
Wyoming
;
proc sort; by deaths2015; run;
ods html body='b.html' style=htmlblue image_dpi=300;
ods graphics on / height=12in width=5in;
proc sgplot nocycleattrs noautolegend;
title;
highlow y=state low=low high=high;
scatter y=state x=deaths2015 / markerattrs=GraphDataDefault(symbol=circlefilled)
name='a' legendlabel='2015';
scatter y=state x=deaths2010 / markerattrs=(symbol=circlefilled)
filledoutlinedmarkers name='b' legendlabel='2010'
markerfillattrs=(color=white)
markeroutlineattrs=GraphDataDefault;
xaxis label='Deaths per 100,000 population';
keylegend 'a' 'b'/ location=inside across=1 position=bottomright noborder;
quit;
ods html close;
This is a known issue with HIGHLOW: High must be > Low. You can work around it by creating two new variables, "High" and "Low", putting the lower value in "Low" and the higher value in "High", and then using these variables in the graph. An alternative is to skip HIGHLOW and use the SCATTER plot's own X error bars to draw the line.
The problem with creating variables as you've described above (reverse coding any lower 2015 obs as the low variable and any higher 2010 obs as the high variable) is that I don't know how to do this without losing the color coding specific to 2010 and 2015. 2015 should be dark blue regardless of where it sits on the chart. If you have a suggestion for this, or can share some specific code that draws the line via xerror in SCATTER, I'd be very grateful!
Using Warren's code as an example, you could add XERRORLOWER=deaths2010 to the first scatter plot as follows. It does not matter whether the value of deaths2010 is lower or higher than deaths2015.
proc sgplot nocycleattrs noautolegend;
title;
scatter y=state x=deaths2015 / markerattrs=GraphDataDefault(symbol=circlefilled)
name='a' legendlabel='2015' xerrorlower=deaths2010;
scatter y=state x=deaths2010 / markerattrs=(symbol=circlefilled)
filledoutlinedmarkers name='b' legendlabel='2010'
markerfillattrs=(color=white)
markeroutlineattrs=GraphDataDefault;
xaxis label='Deaths per 100,000 population';
keylegend 'a' 'b'/ location=inside across=1 position=bottomright noborder;
run;
For better control of the colors, you can use 3 separate SCATTER plots to display the line and the two markers. In the first SCATTER plot, assign the two variables to XERRORLOWER= and XERRORUPPER=, and set markerattrs=(size=0) so that no marker is displayed. Then use the other two SCATTER plots to display the 2010 and 2015 values with the right colors.
Here is a sample with 3 scatter plots.
proc sgplot data=sashelp.cars(where=(type='Hybrid'));
scatter y=model x=mpg_city / xerrorlower=mpg_city xerrorupper=mpg_highway markerattrs=(size=0);
scatter y=model x=mpg_city / markerattrs=(symbol=circlefilled size=10 color=blue) legendlabel='City';
scatter y=model x=mpg_highway / markerattrs=(symbol=circlefilled size=10 color=red) legendlabel='Highway';
run;
Note, values for "Prius" are reversed.
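In case it helps, here is a rough sketch of the same 3-scatter pattern applied to the 2010/2015 variables instead of sashelp.cars. The DEMO data set and its three values are made up purely for illustration (one state has 2010 > 2015), and the gray and dark blue colors are my own assumptions, not taken from the CDC chart.

```sas
/* Made-up illustration data: Alaska has a decrease (2010 > 2015). */
data demo;
  input State :$15. deaths2010 deaths2015;
  datalines;
Alabama 12.1 15.7
Alaska 16.4 11.9
Arizona 9.8 13.2
;
run;

proc sgplot data=demo noautolegend;
  /* Connector only: size=0 hides the marker, so just the X error bar
     between the two years is drawn. Gray is an assumed color. */
  scatter y=state x=deaths2010 / xerrorlower=deaths2010 xerrorupper=deaths2015
          markerattrs=(size=0) errorbarattrs=(color=gray);
  /* 2010 marker: white fill with a dark blue outline */
  scatter y=state x=deaths2010 / markerattrs=(symbol=circlefilled size=10)
          filledoutlinedmarkers markerfillattrs=(color=white)
          markeroutlineattrs=(color=darkblue) name='b' legendlabel='2010';
  /* 2015 marker: solid dark blue, wherever it falls on the axis */
  scatter y=state x=deaths2015 /
          markerattrs=(symbol=circlefilled size=10 color=darkblue)
          name='a' legendlabel='2015';
  xaxis label='Deaths per 100,000 population';
  keylegend 'a' 'b' / location=inside across=1 position=bottomright noborder;
run;
```

Because the line comes from the first SCATTER and the colors are set explicitly on the other two, each year should keep its own color even when the 2010 value is the larger one. The marker statements come after the connector so that the markers are drawn on top of the line.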
Thank you, my friend! This worked excellently. I will try with the 3 scatter plots as well for better color options.
@Trp1 wrote:
Thank you for your response. My hunch is also that this is something SAS could do easily. Your response is great, but unfortunately not quite what I need. I should have mentioned that I turned to the forum AFTER trying a highlow plot without success. The problem is that opioid rates did not increase in every state. Some 2010 points are higher than 2015. In the cases where 2010>2015, there is no line drawn between the end caps. Is there any way to fix this? Note that my 3 obs of fake data might be helpful, as 1 case has 2010>2015. Thanks so much for taking a stab at this.
Are you using different data than the example graph? Note that in the example you posted the link to, there are a couple of points that have 2010 > 2015, but the difference is so small that the points overlap, so you cannot tell whether a line was drawn between the two.
Is the purpose of the graph to show the connection between the years or the increase/decrease between the years? I find that the lack of a line between the two makes me quickly notice that something is different for those pairings (if the majority show an increase).
Yes, I'm using different data than in the graph, and I definitely have obs where the decrease is not so small that the points overlap. So this is a problem for me. I couldn't even change the scale to get around it.
I personally don't mind that the lines don't show up for the obs where the change is negative. In fact, I might prefer this without connecting lines at all. As I said, I would have displayed this data differently. But the point is replication, so my goal is to get a chart that is formatted like CDC's as closely as possible.
Thanks so much for your thoughts.
Sorry. I posted it in a bit of a hurry. @Jay54 told you the easy fix.