☑ This topic is solved.
BayzidurRahman
Obsidian | Level 7

Hi,
I have the following data and want to generate the attached graph. Can someone please help me?

data test;
input mech $ 2-56 freq 59-63  cumpct 64-67 pct80 68-70;
datalines;
"6. Errors that occur when using new workflows"             432  28 80
"1. Errors due to system configuration and functionality " 334  49 80
"5. Editing Errors"                                        323  70 80
"3. Selection Errors "                                     274  87 80
"4. Construction Errors"                                   110  94 80
"7. Errors Due to Hybrid Systems"                           77   99 80
"2. Prescribing for wrong patient"                          15  100 80
;
run;

BayzidurRahman_0-1697495235723.png

I also need to create the variable cumpct (cumulative percentage) from the freq variable.

ballardw
Super User

To compute cumpct, you would have to provide the data as it existed before this step. PROC FREQ, used correctly, produces cumulative percentages for most data sets, but I would need to see the preceding data.

 

If you want to actually display % characters in the axis then the values of cumpct should be 0 to 1 with a percent format and the y2axis definition should agree.
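
A minimal sketch of that percent-axis idea, assuming the test data set from the question with cumpct stored on a 0 to 100 scale. The names test_pct and cumprop are hypothetical and exist only for this illustration.

```sas
/* Rescale the cumulative percentage from 0-100 to 0-1 so a PERCENT format applies. */
data test_pct;                      /* hypothetical data set name */
   set test;
   cumprop = cumpct / 100;          /* hypothetical variable: 28 becomes 0.28 */
   format cumprop percent8.0;       /* 0.28 is displayed as 28% */
run;

proc sgplot data=test_pct noautolegend;
   vbarbasic mech / response=freq;
   series x=mech y=cumprop / markers y2axis;
   /* The Y2 axis range has to match the 0-1 scale of the formatted values. */
   y2axis values=(0 to 1 by 0.1) label='Cumulative %';
run;
```

The tick values on the Y2 axis should then pick up the percent format from cumprop.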

 

The actual graph uses changed text in your data set. One thing: quotes as part of a displayed value are almost always a poor idea, as they take up print positions without adding value.

I insert * in the text to allow use of the SPLITCHAR option in the Xaxis statement to control text appearance.

Depending on your display size and settings, you could further reduce text collision in the Xaxis by using a smaller font size in a VALUEATTRS option, or increase the size of the graphing area with the ODS Graphics width and possibly height options.

data test;
input mech $ 1-56 freq 59-63  cumpct 64-67 pct80 68-70;
datalines;
6. Errors that*occur when*using new*workflows             432  28  80
1. Errors*due to*system*configuration*and*functionality   334  49  80
5. Editing*Errors                                         323  70  80
3. Selection*Errors                                       274  87  80
4. Construction*Errors                                    110  94  80
7. Errors*Due to*Hybrid Systems                           77   99  80
2. Prescribing*for wrong*patient                          15   100 80
;
run;


proc sgplot data=test noautolegend;
   vbarbasic mech / response=freq 
                    datalabel
   ;
   refline 400 /axis=y label="80% line" 
                labelattrs=(color=green)
                labelloc=inside
                
   ; 
   series x=mech y=cumpct/ markers y2axis ;
   yaxis values=(0 to 500 by 50) label='Frequency';
   y2axis values=(0 to 100 by 10) label='Cumulative %';
   xaxis display=(nolabel) fitpolicy=split splitchar='*';
run;

I used REFLINE instead of drawing another series for reference line because the label is easier.
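
The earlier suggestion about a smaller font and a larger graphing area could look roughly like this. It is only a sketch: the 10in by 6in size and the 7pt font are arbitrary assumptions to adjust for your output destination. VALUEATTRS= styles the tick value text, while LABELATTRS= styles the axis label itself.

```sas
/* Enlarge the graph area; the dimensions are illustrative, not recommendations. */
ods graphics / width=10in height=6in;

proc sgplot data=test noautolegend;
   vbarbasic mech / response=freq datalabel;
   series x=mech y=cumpct / markers y2axis;
   y2axis values=(0 to 100 by 10) label='Cumulative %';
   /* A smaller tick-value font reduces collisions between category labels. */
   xaxis display=(nolabel) fitpolicy=split splitchar='*'
         valueattrs=(size=7pt);
run;

/* Restore the default ODS Graphics settings for later plots. */
ods graphics / reset;
```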

BayzidurRahman
Obsidian | Level 7

Hi,
Thanks for the code. Much appreciated.
I think it would not be helpful to have data from the previous step to generate the cumulative percent variable. That is because, after creating the "Freq" variable using PROC FREQ, we need to sort the data by "Freq" in descending order and then calculate the cumulative frequency and cumulative percent. We need some DATA steps to create those after sorting by Freq.
Secondly, can we do the whole thing using a BY variable? I have the same data for 3 different years and want to overlay the three graphs on one panel.

ballardw
Super User

No, you don't need a data step, at least not from what I see.

PROC FREQ has an option, ORDER=FREQ, which places things in frequency order.

You can run this example with the training data set Sashelp.class to see:

proc freq data=sashelp.class order=freq;
   tables age /out=agecount outcum;
run;

If you look at the output, you will see the rows are in order of highest frequency first, and the cumulative percentages go along with that. The OUTCUM option, used with OUT=, places the cumulative count and percentages into the output data set as well.

Age  Frequency  Percent  Cumulative Frequency  Cumulative Percent
 12          5    26.32                     5               26.32
 14          4    21.05                     9               47.37
 15          4    21.05                    13               68.42
 13          3    15.79                    16               84.21
 11          2    10.53                    18               94.74
 16          1     5.26                    19              100.00

 

If you need a more complicated combination, PROC FREQ may not quite work, but for this case it does.
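
As a rough sketch of how the PROC FREQ output can feed the Pareto plot directly, the step below reuses the Sashelp.class example. COUNT and CUM_PCT are the variable names that OUT= with OUTCUM writes to the output data set. The axis settings are assumptions carried over from the earlier plot code, not the only way to do it.

```sas
/* NOPRINT suppresses the printed table; only the output data set is needed. */
proc freq data=sashelp.class order=freq noprint;
   tables age / out=agecount outcum;
run;

proc sgplot data=agecount noautolegend;
   vbarbasic age / response=count datalabel;
   series x=age y=cum_pct / markers y2axis;
   y2axis values=(0 to 100 by 10) label='Cumulative %';
   /* Age is numeric, so request a discrete axis that keeps the data (frequency) order. */
   xaxis type=discrete discreteorder=data;
run;
```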

 



If you have a Year variable, then use it in a BY statement in the PROC FREQ code; and yes, the plot will accept a BY statement.
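
A hedged sketch of the BY-group approach. Your data set was not posted, so Sex in Sashelp.class stands in for the Year variable here, and class_sorted and agecount_by are hypothetical data set names. BY processing expects the input to be sorted by the BY variable, and the cumulative values restart within each BY group.

```sas
/* Sort by the grouping variable first; Sex is a stand-in for Year. */
proc sort data=sashelp.class out=class_sorted;
   by sex;
run;

/* One frequency-ordered table, with cumulative values, per BY group. */
proc freq data=class_sorted order=freq noprint;
   by sex;
   tables age / out=agecount_by outcum;
run;

/* One Pareto-style plot per BY group. */
proc sgplot data=agecount_by noautolegend;
   by sex;
   vbarbasic age / response=count datalabel;
   series x=age y=cum_pct / markers y2axis;
   y2axis values=(0 to 100 by 10) label='Cumulative %';
   xaxis type=discrete discreteorder=data;
run;
```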

 

BayzidurRahman
Obsidian | Level 7

Hi,
Please see the attached dataset, which I created with a BY variable (period). Can you please help me with the code? I am trying the following.

proc format;
   value fmec
      10 = "1. Errors due to system configuration and functionality "
      11 = "1.1 System malfunction "
      12 = "1.2 System contains incorrect order sentence or other incorrect configuration "
      13 = "1.3 Limitation in system functionality"
      20 = "2. Prescribing for wrong patient"
      30 = "3. Selection Errors "
      31 = "3.1 Selection errors when ordering"
      32 = "3.2 Selection errors when constructing or editing an order "
      40 = "4. Construction Errors"
      50 = "5. Editing Errors"
      51 = "5.1 Editing errors (general)"
      52 = "5.2 Editing errors when using the dose calculator or recording patient weights"
      53 = "5.3 Editing errors that occur when correcting a previous TRE"
      54 = "5.4 Editing errors that occur when failing to edit default time/date "
      55 = "5.5 Editing errors that occur when misusing order actions on existing orders "
      60 = "6. Errors that occur when using new workflows"
      61 = "6.1 Failure to view the updated medication profile, active workspace, or medication chart prior to ordering"
      62 = "6.2 Errors that occur when prescribing via an order set "
      63 = "6.3 Failure to activate a future order, or failure to view planned/pending future order or current activated order "
      64 = "6.4 Errors due to misuse of actions when ordering discharge or outpatient prescriptions, or when ordering from medication history or using medication reconciliation functionality"
      65 = "6.5 Errors when using tasks and reminders"
      66 = "6.6 Other"
      70 = "7. Errors Due to Hybrid Systems"
      71 = "7.1 Errors occurring during initial system rollout (transition from paper to electronic)"
      72 = "7.2 Errors occurring during downtime"
      73 = "7.3 Errors occurring when paper charts are used for some prescribing "
      74 = "7.4 Errors occurring when different electronic systems operate within the same hospital";
run;

data paretom;
   set paretom;
   format mech10 fmec.;
run;

proc sgplot data=paretom noautolegend; by period;
   vbarbasic mech10 / response=count
                    datalabel;
   refline 400 /axis=y label="80% line" 
                labelattrs=(color=green)
                labelloc=inside; 
   series x=mech10 y=cum_pct/ markers y2axis ;
   yaxis values=(0 to 500 by 50) label='Frequency';
   y2axis values=(0 to 100 by 10) label='Cumulative %';
   xaxis display=(nolabel) fitpolicy=split splitchar='*';
run;

Accepted Solution
ballardw
Super User

What is the sort order of your data?

Did you examine the actual range of your Count variable? It is much smaller than in your example data set, so the YAXIS range is way too big.

If you want to have a chance of displaying that much text in your format, then you need a NEW format that inserts a split character, such as the * used with SPLITCHAR in my example. Otherwise you will either not have any text displayed for some of the values, or possibly not much graph, because your text is so long that there may not be much room for the body of the graph after trying to display

"6.4 Errors due to misuse of actions when ordering discharge or outpatient prescriptions, or when ordering from medication history or using medication reconciliation functionality"

 on any axis.

I think you may want to rethink just how much text goes onto the actual axis.

 

This creates plots that display. Details up to you for the axis label and ranges.

proc sort data=paretom;
  by period cum_pct;
run;

proc sgplot data=paretom noautolegend; 
   by period;
   vbarbasic mech10 / response=count
                    datalabel;
   refline 400 /axis=y label="80% line" 
                labelattrs=(color=green)
                labelloc=inside; 
   series x=mech10 y=cum_pct/ markers y2axis ;
/*   yaxis values=(0 to 500 by 50) label='Frequency';*/
   y2axis values=(0 to 100 by 10) label='Cumulative %';
   xaxis display=(nolabel) fitpolicy=split type=discrete discreteorder=data  splitchar='*';
   format mech10 fmec.;
run;
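
The "new format" advice above could be sketched as follows. The format name fmecs is hypothetical, and only three codes are shown. The * positions are taken from the earlier example data, and the remaining labels would need split characters placed by hand.

```sas
/* Hypothetical format: same codes as fmec, with * marking preferred line breaks. */
proc format;
   value fmecs
      10 = "1. Errors*due to*system*configuration*and*functionality"
      20 = "2. Prescribing*for wrong*patient"
      50 = "5. Editing*Errors";
run;
```

With such a format in place, `format mech10 fmecs.;` in the PROC SGPLOT step, together with `fitpolicy=split splitchar='*'` on the XAXIS statement, lets the axis break each label at the * characters.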
