Hi,
I have the following data and want to generate the attached graph. Can someone please help me?
data test;
input mech $ 2-56 freq 59-63 cumpct 64-67 pct80 68-70;
datalines;
"6. Errors that occur when using new workflows" 432 28 80
"1. Errors due to system configuration and functionality " 334 49 80
"5. Editing Errors" 323 70 80
"3. Selection Errors " 274 87 80
"4. Construction Errors" 110 94 80
"7. Errors Due to Hybrid Systems" 77 99 80
"2. Prescribing for wrong patient" 15 100 80
;
run;
I also need to create the variable cumpct (cumulative percentage) from the freq variable.
To compute cumpct, you would have to provide the data as it existed before this summary step. PROC FREQ, used correctly, produces cumulative percentages for most data sets, but I would need to see the preceding data.
If you want to display actual % characters on the axis, then the values of cumpct should range from 0 to 1 and carry a PERCENT format, and the Y2AXIS definition should agree.
The graph below uses modified text in your data set. For one thing, quotes as part of a displayed value are almost always a poor idea, as they take up print positions without adding value.
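A minimal sketch of that percent-formatted variant, assuming the TEST data set from this thread (cumpct on a 0 to 100 scale). The names test_pct and cumprop are made up for illustration:

```sas
/* Assumption: TEST holds mech, freq and cumpct (0-100) as posted in this thread. */
data test_pct;                    /* hypothetical data set name */
  set test;
  cumprop = cumpct / 100;         /* hypothetical variable: cumulative share on a 0-1 scale */
  format cumprop percent8.0;      /* tick values on the axis pick up this format */
run;

proc sgplot data=test_pct noautolegend;
  vbarbasic mech / response=freq datalabel;
  series x=mech y=cumprop / markers y2axis;
  yaxis  values=(0 to 500 by 50) label='Frequency';
  y2axis values=(0 to 1 by 0.1) label='Cumulative %';   /* range agrees with the 0-1 values */
  xaxis  display=(nolabel) fitpolicy=split splitchar='*';
run;
```

With the 0 to 1 scale, the Y2AXIS VALUES= list has to use the same scale, otherwise the series would be squeezed against the bottom of the plot.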
I insert * in the text to allow use of the SPLITCHAR option in the Xaxis statement to control text appearance.
Depending on your display size and settings, you could further reduce text collision on the X axis by using a smaller font size in a VALUEATTRS option (which controls the tick-value text), or by increasing the size of the graphing area with the ODS GRAPHICS WIDTH= and possibly HEIGHT= options.
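As a rough sketch of both adjustments, again assuming the TEST data set from this thread; the sizes shown are placeholders to tune for your own output destination. On the XAXIS statement, the tick-value font is set with VALUEATTRS=:

```sas
/* Enlarge the graphing area; 10in x 6in is an arbitrary starting point. */
ods graphics / width=10in height=6in;

proc sgplot data=test noautolegend;
  vbarbasic mech / response=freq datalabel;
  /* VALUEATTRS= sets the font of the tick values on the axis. */
  xaxis display=(nolabel) fitpolicy=split splitchar='*' valueattrs=(size=8pt);
run;

/* Restore the default ODS Graphics settings afterwards. */
ods graphics / reset;
```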
/* Column input: mech starts in column 1 and freq starts in column 59 of each data line. */
data test;
input mech $ 1-56 freq 59-63 cumpct 64-67 pct80 68-70;
datalines;
6. Errors that*occur when*using new*workflows             432  28  80
1. Errors*due to*system*configuration*and*functionality   334  49  80
5. Editing*Errors                                         323  70  80
3. Selection*Errors                                       274  87  80
4. Construction*Errors                                    110  94  80
7. Errors*Due to*Hybrid Systems                           77   99  80
2. Prescribing*for wrong*patient                          15   100 80
;
run;
proc sgplot data=test noautolegend;
  vbarbasic mech / response=freq datalabel;
  refline 400 / axis=y label="80% line" labelattrs=(color=green) labelloc=inside;
  series x=mech y=cumpct / markers y2axis;
  yaxis values=(0 to 500 by 50) label='Frequency';
  y2axis values=(0 to 100 by 10) label='Cumulative %';
  xaxis display=(nolabel) fitpolicy=split splitchar='*';
run;
I used REFLINE instead of drawing another series for the reference line because the label is easier.
Hi,
Thanks for the code. Much appreciated.
I think, it would not be helpful to have data from the previous step to generate the cumulative percent variable. That is because, after creating the "Freq" variable using PROC FREQ we need to sort the data by the "Freq" variable in descending order and then calculate cumulative frequency and cumulative percent. We need to use some data steps to create those after sorting Freq.
Secondly, can we do the whole thing using a BY variable? I have the same data for 3 different years and want to overlay the three graphs on one panel.
No, you don't need a data step, at least not from what I see.
PROC FREQ has an option, ORDER=FREQ, which places the levels in descending frequency order.
You can run this example with the training data set SASHELP.CLASS to see:
proc freq data=sashelp.class order=freq; tables age /out=agecount outcum; run;
If you look at the output you will see the rows are in order of the highest frequency first and the cumulative percentages go along with that. The Outcum option with the Out= places the cumulative count and percentages into the output data set as well.
Age | Frequency | Percent | Cumulative Frequency | Cumulative Percent |
---|---|---|---|---|
12 | 5 | 26.32 | 5 | 26.32 |
14 | 4 | 21.05 | 9 | 47.37 |
15 | 4 | 21.05 | 13 | 68.42 |
13 | 3 | 15.79 | 16 | 84.21 |
11 | 2 | 10.53 | 18 | 94.74 |
16 | 1 | 5.26 | 19 | 100.00 |
If you need a more complicated combination, PROC FREQ may not quite work, but for this case it does.
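For the case where PROC FREQ does not fit and the counts already sit in a summary data set, a manual computation could look roughly like this. It is only a sketch, assuming a data set TEST with one row per category and a numeric freq variable; total, sorted, pareto, cumfreq and cumpct_calc are names made up here:

```sas
/* Assumption: TEST has one row per category with its count in freq. */
proc sql noprint;
  select sum(freq) into :total from test;   /* grand total stored in macro variable &total */
quit;

proc sort data=test out=sorted;             /* hypothetical output name */
  by descending freq;                       /* Pareto order: largest count first */
run;

data pareto;                                /* hypothetical output name */
  set sorted;
  cumfreq + freq;                           /* sum statement: running total, retained across rows */
  cumpct_calc = 100 * cumfreq / &total;     /* cumulative percent on a 0-100 scale */
run;
```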
If you have a Year variable, then use it in a BY statement in the PROC FREQ code, and yes, the plot will accept a BY statement.
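A sketch of that BY-group workflow, using SASHELP.CLASS with Sex standing in for your Year variable (an assumption made only so the example runs anywhere). The OUT= data set with OUTCUM contains COUNT, PERCENT, CUM_FREQ and CUM_PCT:

```sas
/* BY processing needs the data sorted by the BY variable. */
proc sort data=sashelp.class out=class_sorted;   /* hypothetical output name */
  by sex;
run;

proc freq data=class_sorted order=freq;
  by sex;                                  /* stand-in for Year */
  tables age / out=agecount outcum;        /* OUTCUM adds CUM_FREQ and CUM_PCT */
run;

proc sgplot data=agecount noautolegend;
  by sex;                                  /* one graph per BY group */
  vbarbasic age / response=count datalabel;
  series x=age y=cum_pct / markers y2axis;
  y2axis values=(0 to 100 by 10) label='Cumulative %';
  xaxis type=discrete discreteorder=data;  /* keep the row order of AGECOUNT */
run;
```

It is worth printing AGECOUNT to confirm the rows come out in descending frequency within each BY group; if they do not, a PROC SORT by the BY variable and descending COUNT before plotting would fix the order, though CUM_PCT would then need to be recomputed.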
Hi,
please see attached my Dataset that I created with BY variable (period). Can you please help me with the code? I am trying the following.
proc format;
value fmec
10 ="1. Errors due to system configuration and functionality "
11= "1.1 System malfunction "
12 ="1.2 System contains incorrect order sentence or other incorrect configuration "
13= "1.3 Limitation in system functionality"
20= "2. Prescribing for wrong patient"
30= "3. Selection Errors "
31= "3.1 Selection errors when ordering"
32= "3.2 Selection errors when constructing or editing an order "
40= "4. Construction Errors"
50= "5. Editing Errors"
51= "5.1 Editing errors (general)"
52= "5.2 Editing errors when using the dose calculator or recording patient weights"
53= "5.3 Editing errors that occur when correcting a previous TRE"
54= "5.4 Editing errors that occur when failing to edit default time/date "
55= "5.5 Editing errors that occur when misusing order actions on existing orders "
60= "6. Errors that occur when using new workflows"
61= "6.1 Failure to view the updated medication profile, active workspace, or medication chart prior to ordering"
62= "6.2 Errors that occur when prescribing via an order set "
63= "6.3 Failure to activate a future order, or failure to view planned/pending future order or current activated order "
64= "6.4 Errors due to misuse of actions when ordering discharge or outpatient prescriptions, or when ordering from medication history or using medication reconciliation functionality"
65= "6.5 Errors when using tasks and reminders"
66= "6.6 Other"
70= "7. Errors Due to Hybrid Systems"
71= "7.1 Errors occurring during initial system rollout (transition from paper to electronic)"
72= "7.2 Errors occurring during downtime"
73= "7.3 Errors occurring when paper charts are used for some prescribing "
74= "7.4 Errors occurring when different electronic systems operate within the same hospital";
data paretom; set paretom;
format mech10 fmec.; run;
proc sgplot data=paretom noautolegend; by period;
vbarbasic mech10 / response=count
datalabel;
refline 400 /axis=y label="80% line"
labelattrs=(color=green)
labelloc=inside;
series x=mech10 y=cum_pct/ markers y2axis ;
yaxis values=(0 to 500 by 50) label='Frequency';
y2axis values=(0 to 100 by 10) label='Cumulative %';
xaxis display=(nolabel) fitpolicy=split splitchar='*';
run;
What is the sort order of your data?
Did you examine the actual range of your Count variable? It is much smaller than in your example data set, so the YAXIS range is way too big.
If you want to have a chance of displaying that much text from your format, then you need a NEW format that inserts a split character, such as the one used with SPLITCHAR in my example. Otherwise, either some values will have no text displayed at all, or there will not be much graph, because the text is so long that little room remains for the body of the graph after trying to display
"6.4 Errors due to misuse of actions when ordering discharge or outpatient prescriptions, or when ordering from medication history or using medication reconciliation functionality"
on any axis.
I think you may want to rethink just how much text goes onto the actual axis.
This creates plots that display. The details of the axis labels and ranges are up to you.
proc sort data=paretom; by period cum_pct; run;
proc sgplot data=paretom noautolegend;
  by period;
  vbarbasic mech10 / response=count datalabel;
  refline 400 / axis=y label="80% line" labelattrs=(color=green) labelloc=inside;
  series x=mech10 y=cum_pct / markers y2axis;
  /* yaxis values=(0 to 500 by 50) label='Frequency'; */
  y2axis values=(0 to 100 by 10) label='Cumulative %';
  xaxis display=(nolabel) fitpolicy=split type=discrete discreteorder=data splitchar='*';
  format mech10 fmec.;
run;
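The suggestion above about a new format with embedded split characters could be sketched like this. The format name fmecsplit, the choice of break positions, and the shortened wording for code 64 are all illustrative assumptions, not the original labels; only three codes are shown:

```sas
proc format;
  value fmecsplit                  /* hypothetical format name */
    10 = "1. Errors due to*system configuration*and functionality"
    20 = "2. Prescribing*for wrong patient"
    64 = "6.4 Misuse of*order actions";   /* shortened wording, illustrative only */
run;

proc sgplot data=paretom noautolegend;
  by period;
  vbarbasic mech10 / response=count datalabel;
  xaxis display=(nolabel) fitpolicy=split type=discrete discreteorder=data splitchar='*';
  format mech10 fmecsplit.;        /* applied for this plot only */
run;
```

Any code not listed in the VALUE statement would display as its raw number, so a real version needs an entry for every value of mech10 that appears in the data.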