I am in the process of recreating the graph below, which was originally created in Excel, using SAS. The graph appears as follows:
In SAS Studio, I am encountering the following error: "ERROR: The same group variable must be used for summarized plots." Any assistance in resolving this issue would be greatly appreciated. Code and data are as follows:
/* Sample Data */
data incidence;
input Group $ Subgroup $ CumulativeIncidence;
datalines;
A 1-34A 0.54
A 1-34B 0.42
A 1-34C 0.46
A 1-34D 0.42
A 1-34E 0.37
B 3-34B 0.39
B 3-34C 0.35
B 3-34D 0.43
B 3-34E 0.48
C 3-39A 0.45
C 3-39B 0.44
C 3-39C 0.50
C 3-39D 0.41
C 3-39E 0.35
D 4-39A 0.20
D 4-39B 0.28
D 4-39C 0.29
D 4-39D 0.24
;
run;
/* Extract subgroup suffix for consistent coloring */
data incidence;
set incidence;
SubgroupSuffix = scan(Subgroup, -1, '-'); /* Extract last part (e.g., A, B, C) */
run;
/* Compute Group and Overall Averages */
proc means data=incidence noprint;
by Group;
var CumulativeIncidence;
output out=group_avg mean=GroupAvg;
run;
proc means data=incidence noprint;
var CumulativeIncidence;
output out=overall_avg mean=OverallAvg;
run;
/* Add Overall Average to Each Row */
data overall_avg;
if _n_ = 1 then set overall_avg;
set incidence;
run;
/* Merge Group Average with the Data */
proc sort data=incidence; by Group; run;
proc sort data=group_avg; by Group; run;
data incidence_plot;
merge incidence group_avg overall_avg;
by Group;
format GroupAvg 10.2 OverallAvg 10.2;
run;
/* Create the Plot */
proc sgplot data=incidence_plot;
vbar Group / response=CumulativeIncidence
group=SubgroupSuffix /* Color by subgroup suffix */
groupdisplay=cluster
datalabel /* Show values above bars */
datalabelattrs=(size=10);
vline Group / response=GroupAvg
group=Group /* Use Group to plot the average correctly */
lineattrs=(thickness=2 color=red)
markers;
refline OverallAvg / axis=y lineattrs=(thickness=2 color=blue pattern=shortdash);
xaxis discreteorder=data label="Groups";
yaxis label="Cumulative Incidence";
keylegend / title="Subgroup (Color Key)" noborder;
title "Cumulative Incidence by Group with Averages";
run;
Accepted Solutions
data incidence;
infile datalines truncover;
input Group $ Subgroup $ CumulativeIncidence ref high low;
Subgroup=compress(Subgroup,,'ka');
datalines;
A 1-34A 0.54
A 1-34B 0.42
A 1-34C 0.46
A 1-34D 0.42
A 1-34E 0.37
B 3-34B 0.39
B 3-34C 0.35
B 3-34D 0.43
B 3-34E 0.48
C 3-39A 0.45
C 3-39B 0.44
C 3-39C 0.50
C 3-39D 0.41
C 3-39E 0.35
D 4-39A 0.20
D 4-39B 0.28
D 4-39C 0.29
D 4-39D 0.24
A . . . 0.5 0.5
B . . . 0.4 0.4
C . . . 0.45 0.45
D . . . 0.3 0.3
. . . 0.35 .
;
run;
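proc sgplot data=incidence;
  /* Clustered bars from the pre-summarized values, one bar per subgroup */
  vbarparm category=Group response=CumulativeIncidence / group=Subgroup groupdisplay=cluster
           datalabel name='bar';
  /* Dashed horizontal line at the value held in REF (the row with ref=0.35) */
  refline ref / axis=y lineattrs=(pattern=dash color=black) legendlabel='Bridge Average' name='ref';
  /* HIGH equals LOW on the per-group rows, so each group gets a flat bar at its average */
  highlow x=Group high=high low=low / type=bar;
  /* Subgroup legend below the plot; the blank group from the extra rows is excluded */
  keylegend 'bar' / location=outside position=s exclude=(' ');
  /* Custom legend entry standing in for the HIGHLOW plot */
  legenditem name='highlow' type=LINE / label='Battery Average' lineattrs=(color=black);
  keylegend 'highlow' 'ref' / location=inside position=ne across=1;
  xaxis label='IET Average';
  yaxis grid;
run;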
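The HIGH, LOW, and REF values in the solution are typed into the datalines by hand. If they are meant to be the actual means of the data, they could instead be computed and appended as extra rows. The following is only a sketch under that assumption, using a trimmed copy of the thread's data; the data set names raw, group_rows, overall_row, and incidence_calc are hypothetical.

```sas
/* Trimmed copy of the thread's data; all data set names here are hypothetical. */
data raw;
  input Group $ Subgroup $ CumulativeIncidence;
  Subgroup = compress(Subgroup, , 'ka');  /* keep letters only, as in the solution */
  datalines;
A 1-34A 0.54
A 1-34B 0.42
B 3-34B 0.39
B 3-34C 0.35
;
run;

proc sql;
  /* One row per group: HIGH and LOW both hold the group mean */
  create table group_rows as
    select Group,
           mean(CumulativeIncidence) as high,
           mean(CumulativeIncidence) as low
    from raw
    group by Group;

  /* One row holding the overall mean for the REFLINE statement */
  create table overall_row as
    select mean(CumulativeIncidence) as ref
    from raw;
quit;

/* Stack the three tables; variables absent from a table are set to missing */
data incidence_calc;
  set raw group_rows overall_row;
run;
```

The resulting incidence_calc has the same layout as the hand-built data set, so the PROC SGPLOT step from the solution should work against it with data=incidence_calc.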
Not completely sure I understand what you're trying to achieve - partly because the way you've described the variable "subgroupSuffix" does not match how you've coded it:
/* if you have subgroup='A-34B' ... */
subgroupSuffix=scan(subgroup,-1,'-');
/* ... will yield '34B', not 'B' as you've indicated in your comment */
/* if you really want the last character, do this: */
subgroupSuffix=substr(subgroup,length(subgroup));
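To see the difference side by side, here is a small sketch that applies the three extraction approaches mentioned in this thread (SCAN, SUBSTR, and COMPRESS with the 'ka' modifiers) to one value from the posted data. The variable names via_scan, via_substr, and via_compress are made up for illustration.

```sas
data _null_;
  length subgroup $8;
  subgroup = '1-34A';

  /* Last hyphen-delimited word: everything after the final '-' */
  via_scan = scan(subgroup, -1, '-');

  /* Last non-blank character only */
  via_substr = substr(subgroup, length(subgroup));

  /* Keep (k) alphabetic (a) characters, dropping digits and hyphens */
  via_compress = compress(subgroup, , 'ka');

  /* Write the three results to the log */
  put via_scan= via_substr= via_compress=;
run;
```

For this value the log should show via_scan=34A, while the other two give A. Note that SUBSTR and COMPRESS would differ from each other if a subgroup contained more than one letter.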
Also, just FYI, you can shorten your code substantially (skipping all the PROC MEANS steps, the sorting, and the extra DATA steps) by doing something like this immediately after your input DATA step (though use CROSS JOIN with caution):
proc sql;
create table incidence_plot as
select a.*, scan(a.subgroup,-1,'-') as subgroupSuffix, b.groupavg, c.overallavg
from
incidence A
inner join
(select group, mean(cumulativeincidence) as groupavg from incidence group by group) B
on a.group=b.group
cross join
(select mean(cumulativeincidence) as overallavg from incidence) C
order by group, subgroup;
quit;
It seems like you are just trying to show, within each group (A-D), the group average. I don't see an easy way to do what you're trying to do except to create a separate subgroup within each group for which the value is the group average. This will simply create an additional bar for each group - however, it won't match the Excel version visually.
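The extra-bar idea could be sketched roughly as follows. This is only an illustration on a trimmed copy of the posted data; the data set names incidence_demo, group_avg_demo, and with_avg and the 'Avg' label are hypothetical, and VBARPARM is used because the values are already summarized.

```sas
/* Trimmed copy of the thread's data; data set names are hypothetical. */
data incidence_demo;
  input Group $ Subgroup $ CumulativeIncidence;
  SubgroupSuffix = substr(Subgroup, length(Subgroup));  /* last character */
  datalines;
A 1-34A 0.54
A 1-34B 0.42
B 3-34B 0.39
B 3-34C 0.35
;
run;

/* One row per group, with the mean stored under the response variable's name */
proc means data=incidence_demo noprint nway;
  class Group;
  var CumulativeIncidence;
  output out=group_avg_demo(drop=_type_ _freq_) mean=CumulativeIncidence;
run;

/* Append the averages as one more "subgroup" in each group */
data with_avg;
  set incidence_demo group_avg_demo(in=is_avg);
  if is_avg then SubgroupSuffix = 'Avg';
run;

proc sgplot data=with_avg;
  vbarparm category=Group response=CumulativeIncidence /
           group=SubgroupSuffix groupdisplay=cluster datalabel;
  xaxis label="Groups";
  yaxis label="Cumulative Incidence";
run;
```

Each cluster then carries one additional bar for the group average, which is the visual compromise described above.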
Thank you very much for your assistance! This is exactly what I needed.
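If you want to get rid of the ERROR message, use VBARPARM instead of VBAR. VBAR and VLINE both summarize the data, and overlaid summarized plots must share the same GROUP= variable; your code uses SubgroupSuffix on one and Group on the other.
See my code above for how to use the VBARPARM statement.
Another alternative is to replace VBAR with VBARBASIC:
vbar Group / response=CumulativeIncidence ....... ---> vbarbasic Group / response=CumulativeIncidence ........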
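As a rough sketch of the VBARBASIC route, assuming a SAS release that supports the statement: VBARBASIC can be overlaid with basic plots, so here the group averages are drawn with SCATTER rather than VLINE. The data below are a trimmed copy of the posted values with GroupAvg filled in by hand for those four rows, the data set name plot_demo is hypothetical, and the reference line value 0.425 is simply the mean of the four rows shown.

```sas
/* Trimmed illustration data; GroupAvg is the mean of the rows shown per group */
data plot_demo;
  input Group $ SubgroupSuffix $ CumulativeIncidence GroupAvg;
  datalines;
A A 0.54 0.48
A B 0.42 0.48
B B 0.39 0.37
B C 0.35 0.37
;
run;

proc sgplot data=plot_demo;
  /* Clustered bars colored by subgroup suffix */
  vbarbasic Group / response=CumulativeIncidence
                    group=SubgroupSuffix groupdisplay=cluster;
  /* Group averages as markers; no GROUP= needed on this overlay */
  scatter x=Group y=GroupAvg /
          markerattrs=(symbol=diamondfilled color=red size=10);
  /* Overall average of the four illustration rows */
  refline 0.425 / axis=y lineattrs=(pattern=shortdash color=blue);
  xaxis label="Groups";
  yaxis label="Cumulative Incidence";
run;
```

Because each Group and SubgroupSuffix combination has a single row, the default statistic of VBARBASIC just reproduces the input values.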