Graphics Programming

This topic is solved.
luvscandy27
Quartz | Level 8

I am in the process of recreating the graph below, which was originally created in Excel, using SAS. The graph appears as follows:

luvscandy27_0-1742904940966.png

In SAS Studio, I am encountering the following error: "ERROR: The same group variable must be used for summarized plots." Any assistance in resolving this issue would be greatly appreciated. Code and data are as follows:

 

/* Sample Data */
data incidence;
    input Group $ Subgroup $ CumulativeIncidence;
    datalines;
A 1-34A 0.54
A 1-34B 0.42
A 1-34C 0.46
A 1-34D 0.42
A 1-34E 0.37
B 3-34B 0.39
B 3-34C 0.35
B 3-34D 0.43
B 3-34E 0.48
C 3-39A 0.45
C 3-39B 0.44
C 3-39C 0.50
C 3-39D 0.41
C 3-39E 0.35
D 4-39A 0.20
D 4-39B 0.28
D 4-39C 0.29
D 4-39D 0.24
;
run;

/* Extract subgroup suffix for consistent coloring */
data incidence;
    set incidence;
    SubgroupSuffix = scan(Subgroup, -1, '-'); /* Extract last part (e.g., A, B, C) */
run;

/* Compute Group and Overall Averages */
proc means data=incidence noprint;
    by Group;
    var CumulativeIncidence;
    output out=group_avg mean=GroupAvg;
run;

proc means data=incidence noprint;
    var CumulativeIncidence;
    output out=overall_avg mean=OverallAvg;
run;

/* Add Overall Average to Each Row */
data overall_avg;
    if _n_ = 1 then set overall_avg;
    set incidence;
run;

/* Merge Group Average with the Data */
proc sort data=incidence; by Group; run;
proc sort data=group_avg; by Group; run;

data incidence_plot;
    merge incidence group_avg overall_avg;
    by Group;
    format GroupAvg 10.2 OverallAvg 10.2;
run;


/* Create the Plot */
proc sgplot data=incidence_plot;
    vbar Group / response=CumulativeIncidence
                group=SubgroupSuffix /* Color by subgroup suffix */
                groupdisplay=cluster
                datalabel /* Show values above bars */
                datalabelattrs=(size=10);
               
    vline Group / response=GroupAvg
                group=Group /* Use Group to plot the average correctly */
                lineattrs=(thickness=2 color=red)
                markers;
   
    refline OverallAvg / axis=y lineattrs=(thickness=2 color=blue pattern=shortdash);
   
    xaxis discreteorder=data label="Groups";
    yaxis label="Cumulative Incidence";
   
    keylegend / title="Subgroup (Color Key)" noborder;
   
    title "Cumulative Incidence by Group with Averages";
run;

Accepted Solution
Ksharp
Super User
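/* Bar rows, plus one extra row per group (high/low = group average)
   and a final row that holds only the reference value (ref) */
data incidence;
    infile datalines truncover;   /* short rows leave trailing variables missing */
    input Group $ Subgroup $ CumulativeIncidence ref high low;
    Subgroup = compress(Subgroup, , 'ka');   /* keep letters only: '1-34A' -> 'A' */
    datalines;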
A 1-34A 0.54
A 1-34B 0.42
A 1-34C 0.46
A 1-34D 0.42
A 1-34E 0.37
B 3-34B 0.39
B 3-34C 0.35
B 3-34D 0.43
B 3-34E 0.48
C 3-39A 0.45
C 3-39B 0.44
C 3-39C 0.50
C 3-39D 0.41
C 3-39E 0.35
D 4-39A 0.20
D 4-39B 0.28
D 4-39C 0.29
D 4-39D 0.24
A . . . 0.5 0.5
B . . . 0.4 0.4
C . . . 0.45 0.45
D . . . 0.3 0.3
. . . 0.35 .
;
run;
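proc sgplot data=incidence;
    /* Pre-summarized clustered bars, colored by subgroup letter */
    vbarparm category=Group response=CumulativeIncidence / group=Subgroup groupdisplay=cluster
        datalabel name='bar';
    /* Dashed horizontal reference line at the value of REF */
    refline ref / axis=y lineattrs=(pattern=dash color=black) legendlabel='Bridge Average' name='ref';
    /* HIGH = LOW, so each bar collapses to a horizontal line at the group average */
    highlow x=Group high=high low=low / type=bar;
    /* Subgroup legend below the plot; skip the blank group from the average rows */
    keylegend 'bar' / location=outside position=s exclude=(' ');
    /* Custom legend entry for the group-average lines */
    legenditem name='highlow' type=line / label='Battery Average' lineattrs=(color=black);
    keylegend 'highlow' 'ref' / location=inside position=ne across=1;
    xaxis label='IET Average';
    yaxis grid;
run;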

Ksharp_0-1742959689329.png
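
The average rows in the DATALINES above are entered as literal values. As a rough, untested sketch, the same layout could be derived from the data instead. It assumes the INCIDENCE data set from the question (Group, Subgroup, CumulativeIncidence) already exists; the names group_avg, incidence_plot, Suffix, and the macro variable overall are made up for illustration.

```sas
/* Group means, one row per group (CLASS avoids the need to sort) */
proc means data=incidence noprint nway;
    class Group;
    var CumulativeIncidence;
    output out=group_avg(drop=_type_ _freq_) mean=high;
run;

/* Overall mean into a macro variable for REFLINE */
proc sql noprint;
    select mean(CumulativeIncidence) into :overall trimmed
    from incidence;
quit;

/* Bar rows first, then the average rows (blank Suffix, HIGH = LOW) */
data incidence_plot;
    length Suffix $ 8;
    set incidence group_avg;
    Suffix = compress(Subgroup, , 'ka');   /* letters only: '1-34A' -> 'A' */
    low = high;                            /* missing on the bar rows */
run;

proc sgplot data=incidence_plot;
    vbarparm category=Group response=CumulativeIncidence /
        group=Suffix groupdisplay=cluster datalabel name='bar';
    highlow x=Group high=high low=low / type=bar;
    refline &overall / axis=y lineattrs=(pattern=dash color=black);
    keylegend 'bar' / exclude=(' ');
run;
```

The computed means will differ from the hand-entered values in the accepted solution (for example, group A averages 0.442 rather than 0.5), so the lines will not sit in exactly the same places.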

 


5 REPLIES
quickbluefish
Lapis Lazuli | Level 10

Not completely sure I understand what you're trying to achieve - partly because the way you've described the variable "subgroupSuffix" does not match how you've coded it:

* if you have subgroup='A-34B' ... ;
subgroupSuffix=scan(subgroup,-1,'-');
* ... will yield '34B', not 'B' as you've indicated in your comment ;
* if you really want the last character, do this: ;
subgroupSuffix=substr(subgroup,length(subgroup));
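
To see the difference on one of the actual sample values, a small check like the following could be run. It is only an illustration: the variable names by_scan, by_substr, and by_compress are invented here, and the COMPRESS call is the one used in Ksharp's solution.

```sas
/* Compare three ways of pulling the suffix from a value like '1-34A' */
data _null_;
    Subgroup = '1-34A';
    by_scan     = scan(Subgroup, -1, '-');              /* last '-'-delimited word */
    by_substr   = substr(Subgroup, length(Subgroup));   /* last character          */
    by_compress = compress(Subgroup, , 'ka');           /* keep letters only       */
    put by_scan= by_substr= by_compress=;
run;
```

The log line should read by_scan=34A by_substr=A by_compress=A, so only the last two give a suffix that is shared across groups.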

Also, just FYI, you can shorten your code substantially, skipping all the PROC MEANS, sorting, and extra DATA steps, by doing something like this immediately below your input DATA step (though use CROSS JOIN with caution):

proc sql;
create table incidence_plot as
select a.*, scan(a.subgroup,-1,'-') as subgroupSuffix, b.groupavg, c.overallavg
from
	incidence A
	inner join
	(select group, mean(cumulativeincidence) as groupavg from incidence group by group) B
	on a.group=b.group
	cross join
	(select mean(cumulativeincidence) as overallavg from incidence) C
order by group, subgroup;
quit;

It seems like you are just trying to show, within each group (A-D), the group average. I don't see an easy way to do what you're trying to do except to create a separate subgroup within each group whose value is the group average. This will simply create an additional bar for each group; however, it won't match the Excel version visually.
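
A minimal, untested sketch of that extra-bar idea, assuming the INCIDENCE data set from the question; the names avg_rows, incidence_avgbar, Suffix, and the 'Avg' label are hypothetical:

```sas
/* One row per group holding the group mean in the response variable */
proc means data=incidence noprint nway;
    class Group;
    var CumulativeIncidence;
    output out=avg_rows(drop=_type_ _freq_) mean=CumulativeIncidence;
run;

/* Append the mean rows as an extra subgroup called 'Avg' */
data incidence_avgbar;
    length Suffix $ 8;
    set incidence avg_rows(in=is_avg);
    if is_avg then Suffix = 'Avg';
    else Suffix = substr(Subgroup, length(Subgroup));   /* last character */
    format CumulativeIncidence 4.2;
run;

/* One bar per subgroup plus one 'Avg' bar in every cluster */
proc sgplot data=incidence_avgbar;
    vbarparm category=Group response=CumulativeIncidence /
        group=Suffix groupdisplay=cluster datalabel;
    yaxis label="Cumulative Incidence";
run;
```

Because every Group/Suffix combination appears once, VBARPARM can plot the rows directly without summarizing them.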

luvscandy27
Quartz | Level 8

Thank you very much for your assistance! This is exactly what I needed.

quickbluefish
Lapis Lazuli | Level 10
I had no idea you could reference a character variable as the X value in a HIGHLOW statement!
Ksharp
Super User

If you want to get rid of the ERROR message, use VBARPARM instead of VBAR.
Check my code to see how to use the VBARPARM statement.

Another alternative is to replace VBAR with VBARBASIC:

 vbar Group / response=CumulativeIncidence .......
--->
 vbarbasic Group / response=CumulativeIncidence ........
