BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.

Maybe I'm just having a bad Friday, but I'm struggling to make a stacked bar chart, and it feels like the kind of struggling that maybe I'm doing something dumb...

 

I have data with one record per person, with people from two states, and two years.  For each person, I have a boolean

variable that indicates pass/fail.  Data like:

 

data have ;
  input year state $2. Fail ;
  cards ;
2023 MA 1
2023 MA 1
2023 MA 0
2023 MA 0
2023 MA 0
2023 MA 0
2023 MA 0
2023 MA 0
2023 RI 1
2023 RI 0
2024 MA 1
2024 MA 1
2024 MA 1
2024 MA 1
2024 MA 0
2024 MA 0
2024 MA 0
2024 MA 0
2024 RI 1
2024 RI 1
;
run ;

So there are 10 people in each year.  My goal is to make a bar chart that shows that 30% failed in 2023 and 60% failed in 2024.  I want it to be a stacked bar chart, with state as the group.  It should show 30% failed in 2023 (20% MA, 10%RI) and 60%failed in 2024 (40% MA, 20%RI).

 

As I couldn't make the chart I want directly from this data (but would be happy if you can), I first thought I should calculate the percentages I want myself.  Even that feels overly complex, because I need to get the sum of the failures for each state-year, and then divide that by the count of rows for the year.  I can get what I want with SQL, but it feels like it should be easier...

 

proc sql ;
  create table want as
  select a.year,a.state,sum(fail)/n as pct
  from have as a
      ,(select year,n(fail) as n from have group by year) as b
  where a.year=b.year
  group by a.year,state
  ;
quit ;

That gives:

Obs    year    state    pct

 1     2023     MA      0.2
 2     2023     RI      0.1
 3     2024     MA      0.4
 4     2024     RI      0.2

Then to get my stacked bar chart:

proc sgplot data=want ;
  vbar  year/ response=pct group=state;
  format pct percent. ;
run;

Which gives me what I want:

Quentin_0-1708713097512.png

Is there a better/easier way to do this?

 

The point is not to compare the failure rate for MA vs RI.  I'm not calculating that percentage.  The goal is to compare the failure rate by year.  And in each year, see how many of those failures came from MA vs RI.

 

BASUG is hosting free webinars Next up: Mark Keintz presenting History Carried Forward, Future Carried Back: Mixing Time Series of Differing Frequencies on May 8. Register now at the Boston Area SAS Users Group event page: https://www.basug.org/events.
1 ACCEPTED SOLUTION

Accepted Solutions
yabwon
Onyx | Level 15

Small update on "missing class" colour:

data have2;
  set have have(in=i drop=state);
  if i then Fail = NOT Fail;
run;

data MyAttrMap;
  ID='grpMiss';
  value=' '; /* missing group */
  fillcolor='Black';
  filltransparency=1;
  output;
run;

proc sgplot 
  data=have2(where=(Fail)) 
  dattrmap=MyAttrMap 
  PCTLEVEL=group 
;
  vbar  year / group=state response=fail 
    STAT=PERCENT 
    GROUPORDER=DATA
    NOOUTLINE 
    MISSING 
    ATTRID=grpMiss
  ;
  yaxis max=0.6;
run;

yabwon_0-1708718587527.png

 

 

Bart

_______________
Polish SAS Users Group: www.polsug.com and communities.sas.com/polsug

"SAS Packages: the way to share" at SGF2020 Proceedings (the latest version), GitHub Repository, and YouTube Video.
Hands-on-Workshop: "Share your code with SAS Packages"
"My First SAS Package: A How-To" at SGF2021 Proceedings

SAS Ballot Ideas: one: SPF in SAS, two, and three
SAS Documentation



View solution in original post

7 REPLIES 7
Quentin
Super User

I usually stay away from stacked bar charts.  Maybe the line chart version is easier to read(and make):

 

data want ;
  set have ;
  FailMa=(State='MA' and Fail) ;
  FailRI=(State='RI' and Fail) ;
run ;

proc sgplot data=want ;
  vline year /response=fail stat=mean ;
  vline year /response=failMA stat=mean ;
  vline year /response=failRI stat=mean ;
run ;

Quentin_0-1708714769437.png

 

 

BASUG is hosting free webinars Next up: Mark Keintz presenting History Carried Forward, Future Carried Back: Mixing Time Series of Differing Frequencies on May 8. Register now at the Boston Area SAS Users Group event page: https://www.basug.org/events.
yabwon
Onyx | Level 15
data have ;
  input year state $2. Fail ;
  cards ;
2023 MA 1
2023 MA 1
2023 MA 0
2023 MA 0
2023 MA 0
2023 MA 0
2023 MA 0
2023 MA 0
2023 RI 1
2023 RI 0
2024 MA 1
2024 MA 1
2024 MA 1
2024 MA 1
2024 MA 0
2024 MA 0
2024 MA 0
2024 MA 0
2024 RI 1
2024 RI 1
;
run ;

data have2;
  set have;
  state=".."; /* "extra" state to get 100% */
  Fail = NOT Fail;
run;

data have3;
  set have have2;
run;

ods graphics / reset attrpriority=color;
proc sgplot data=have3(where=(Fail)) PCTLEVEL=group ;
  styleattrs datacolors=(red blue white); /* last value white to hide in background */
  vbar  year / group=state 
  RESPONSE=fail 
  STAT=PERCENT 
  GROUPORDER=DATA /* to keep "extra" group "on top" */
  NOOUTLINE;
  yaxis max=0.6; /* to scale (optionally)*/
run;

yabwon_0-1708717039481.png

 

 

_______________
Polish SAS Users Group: www.polsug.com and communities.sas.com/polsug

"SAS Packages: the way to share" at SGF2020 Proceedings (the latest version), GitHub Repository, and YouTube Video.
Hands-on-Workshop: "Share your code with SAS Packages"
"My First SAS Package: A How-To" at SGF2021 Proceedings

SAS Ballot Ideas: one: SPF in SAS, two, and three
SAS Documentation



yabwon
Onyx | Level 15

Small update on "missing class" colour:

data have2;
  set have have(in=i drop=state);
  if i then Fail = NOT Fail;
run;

data MyAttrMap;
  ID='grpMiss';
  value=' '; /* missing group */
  fillcolor='Black';
  filltransparency=1;
  output;
run;

proc sgplot 
  data=have2(where=(Fail)) 
  dattrmap=MyAttrMap 
  PCTLEVEL=group 
;
  vbar  year / group=state response=fail 
    STAT=PERCENT 
    GROUPORDER=DATA
    NOOUTLINE 
    MISSING 
    ATTRID=grpMiss
  ;
  yaxis max=0.6;
run;

yabwon_0-1708718587527.png

 

 

Bart

_______________
Polish SAS Users Group: www.polsug.com and communities.sas.com/polsug

"SAS Packages: the way to share" at SGF2020 Proceedings (the latest version), GitHub Repository, and YouTube Video.
Hands-on-Workshop: "Share your code with SAS Packages"
"My First SAS Package: A How-To" at SGF2021 Proceedings

SAS Ballot Ideas: one: SPF in SAS, two, and three
SAS Documentation



Quentin
Super User

Very creative solution @yabwon , I would have never thought of adding fake data to let vbar calculate the percentage I want via percentage of sum.  As Paul D. would say, thanks 1E6!

BASUG is hosting free webinars Next up: Mark Keintz presenting History Carried Forward, Future Carried Back: Mixing Time Series of Differing Frequencies on May 8. Register now at the Boston Area SAS Users Group event page: https://www.basug.org/events.
FreelanceReinh
Jade | Level 19

@Quentin wrote:

... I should calculate the percentages I want myself.  Even that feels overly complex, ...


The percentages could also be calculated with PROC FREQ:

proc freq data=have noprint;
by year;
tables state*fail / out=want1(where=(fail));
run;

But then a format such as

proc format;
picture pctfmt low-high='009%';
run;

would be needed to get the percent signs into the tick mark labels. Also, the label of variable PERCENT ("Percent of Total Frequency") becomes the new default y-axis label.

Quentin
Super User

Thanks @FreelanceReinh .  I had mucked about with PROC TABULATE for a bit, but I'm trash at calculating percentages with PROC TABULATE.  I never even thought to try PROC FREQ with a BY statement.  Much better than my SQL step.

BASUG is hosting free webinars Next up: Mark Keintz presenting History Carried Forward, Future Carried Back: Mixing Time Series of Differing Frequencies on May 8. Register now at the Boston Area SAS Users Group event page: https://www.basug.org/events.
jeanleid
Fluorite | Level 6

You could also make some small modifications to your proc sql step and then use VBARPARM:


proc sql ;
create table want as
select a.year,a.state,sum(fail)/n as pct "% failed" FORMAT=PERCENT8.
from have as a
,(select year,n(fail) as n from have group by year) as b
where a.year=b.year
group by a.year,state
;
quit ; 


PROC SGPLOT DATA= WANT NOBORDER ;
styleattrs datacolors=(darkred darkblue) datacontrastcolors=(black) ;
vbarparm category= year response= pct / group= state seglabel seglabelattrs=(color=white weight=bold);
run ;

This gives you a stacked bar chart with each bar segment labelled with the percentage values from PROC SQL.

(Sorry- I'm new to the SAS community and not quite sure how to paste in the resulting graph.)

 

 

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 7 replies
  • 589 views
  • 6 likes
  • 4 in conversation