Add a scatter plot on a bar chart

04-18-2008 08:23 AM

Hello,

I have 5 treatment groups of 10 subjects each.

I want to show two pieces of information on one graph:

- the mean of each group (as a bar chart)

- a scatter plot of the 10 subjects in each group.

I tried to do this with PROC GBARLINE, but I cannot add the scatter plot.

Do you know how I can produce this kind of graph?

Thanks a lot for your answers.

Posted in reply to deleted_user

04-18-2008 01:53 PM

Hello,

Here's an idea that you might like to try. This method adds statistical richness to the plot that you won't get with a bar showing the mean. In fact, a bar showing a mean hides all the data below the mean and spends a lot of ink on a single data point (the mean). The proposal here does not show the individual data points, but it does show the distribution along with some quantile information. I imagine that the scatter data could be added with another statement or two.

data test;
input trt value;
datalines;
1 23
1 22
1 26
1 28
1 23
1 30
1 29
1 28
1 27
1 26
2 25
2 25
2 26
2 24
2 25
2 31
2 28
2 26
2 27
2 28
3 23
3 25
3 24
3 25
3 26
3 32
3 29
3 24
3 27
3 25
;
run;

goptions reset=all dev=win;
symbol1 l=1 i=boxj;

proc gplot data=test;
plot value*trt=1;
run;
quit;
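
One way the "another statement or two" might look, as a minimal sketch that assumes the test dataset above: a second SYMBOL definition draws the raw points, and the OVERLAY option puts both plot requests on the same axes. The colors and the circle marker are arbitrary choices, not part of the original post.

[pre]
goptions reset=all;

/* Give each SYMBOL an explicit color; without one, SAS/GRAPH may cycle
   SYMBOL1 through the color list before it ever uses SYMBOL2. */
symbol1 i=boxj l=1 v=none c=black;   /* box-and-whisker, medians joined */
symbol2 i=none v=circle c=blue;      /* individual subjects */

proc gplot data=test;
  plot value*trt=1 value*trt=2 / overlay;
run;
quit;
[/pre]

Points that share the same trt and value will still sit on top of each other in this version.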

Posted in reply to deleted_user

04-21-2008 12:05 PM

I'm not too keen on a box plot for 10 data points. It tries to extract too much information from so few observations. I'd suggest a scatter plot where you plot the mean with a different symbol from the data points.

You can do this by creating a new dataset that contains the original data and another column with just the mean for each group. The easiest way is the MEAN aggregate function in PROC SQL, but you could also use PROC MEANS, output the 5 summary records, and SET the data back together. Then you would use PROC GPLOT with the OVERLAY option on the PLOT statement, something like

PLOT raw*group mean*group/OVERLAY;

You will need to mess with your SYMBOL statements.
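
As a rough sketch of that approach, assuming a dataset work.test with a numeric group variable trt and a response variable value (the names from the earlier reply, standing in for 'group' and 'raw'): PROC SQL remerges the group mean onto every row, and GPLOT overlays the two series. The dataset name work.plotdata, the column grp_mean, and the symbol choices are only illustrative.

[pre]
/* Remerge the group mean onto each subject row (SAS logs a remerge NOTE). */
proc sql;
  create table work.plotdata as
  select trt, value, mean(value) as grp_mean
  from work.test
  group by trt;
quit;

/* Explicit colors keep SYMBOL1/SYMBOL2 tied to the 1st/2nd plot request. */
symbol1 i=none v=circle c=black;     /* individual subjects */
symbol2 i=none v=dot    c=red h=2;   /* group mean */

proc gplot data=work.plotdata;
  plot value*trt=1 grp_mean*trt=2 / overlay;
run;
quit;
[/pre]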

The downside of this method is that points with the same value of 'raw' within a group are drawn on top of each other, so some of them disappear. "Jittering" adds some random scatter to the data so that all the points still show. I wrote a macro for that some years ago (before SUGI proceedings were online). You can find the Mayo implementation by googling.
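
The macro itself is not reproduced here, but the basic idea can be sketched in a few statements. This version assumes a dataset work.test with variables trt and value, and it shifts only the horizontal position, so the plotted response values stay exact. The jitter width of 0.2, the seed, the axis range, and the names work.jitter, trt_j, and grp_mean are all arbitrary choices.

[pre]
/* Group mean plus a horizontally jittered copy of the group variable. */
proc sql;
  create table work.jitter as
  select trt, value,
         mean(value) as grp_mean,
         trt + 0.2*(ranuni(12345) - 0.5) as trt_j
  from work.test
  group by trt;
quit;

symbol1 i=none v=circle c=black;        /* jittered subjects */
symbol2 i=none v=dot    c=red h=2;      /* group mean, not jittered */
axis1 order=(0 to 4 by 1) minor=none;   /* fits 3 groups; widen for more */

proc gplot data=work.jitter;
  plot value*trt_j=1 grp_mean*trt=2 / overlay haxis=axis1;
run;
quit;
[/pre]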

Posted in reply to deleted_user

04-23-2008 05:51 AM

Hi stat.

I have had a similar question before, and there are two ways to solve it:

1) using PROC GCHART for the bar chart and adding an ANNOTATE dataset for the scatter plot

2) using a graph template written in GTL (experimental in 9.1.3, production with a slightly different syntax in 9.2).

Sample code for both is below.

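[pre]
/* Input dataset, grouping variable, and response variable */
%LET table = work.test ;
%LET group = trt ;
%LET y = value ;
/*******************************************/
/* Annotate dataset: one PLUS symbol per subject, placed in data
   coordinates (xsys/ysys = "2") and drawn after the chart (when = "A") */
DATA work.anno (DROP = &group &y) ;
  SET &table ;
  xsys = "2" ; ysys = "2" ; when = "A" ; function = "SYMBOL" ; text = "PLUS" ;
  /* a numeric &group is converted to character here, with a NOTE in the log */
  xc = STRIP(&group) ;
  y = &y ;
RUN ;
/* Bar chart of the group means, with the annotated points on top */
PROC GCHART DATA = &table ;
  VBAR &group / DISCRETE TYPE = MEAN SUMVAR = &y ANNOTATE = work.anno ;
RUN ; QUIT ;
[/pre]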

Note that you may have to define an AXIS statement so that the response axis covers all the individual values and not just the means. My suggestion is to collect the max and min of your response variable in macro variables and use them in an AXIS definition.
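
A minimal sketch of that suggestion, reusing &table, &group, &y, and work.anno from the code above. The macro variable names ymin and ymax, the tick step of 5, and the decision to keep 0 on the axis are assumptions made for illustration.

[pre]
/* Round the response range outward to multiples of 5 and keep 0 on the
   axis so the bars still start from a zero baseline. */
proc sql noprint;
  select min(0, 5*floor(min(&y)/5)), 5*ceil(max(&y)/5)
    into :ymin, :ymax
  from &table;
quit;

axis1 order=(&ymin to &ymax by 5);

proc gchart data=&table;
  vbar &group / discrete type=mean sumvar=&y
                raxis=axis1 annotate=work.anno;
run;
quit;
[/pre]

With a different response scale, the step of 5 would need to change in both the PROC SQL expressions and the AXIS statement.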

Second method, using GTL:

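[pre]
/* Graph template (SAS 9.1.3 experimental GTL syntax):
   bars for the group means, scatter overlay for the subjects */
PROC TEMPLATE ;
  DEFINE statGraph barScatter ;
    DYNAMIC group y mean ;
    LAYOUT OVERLAY ;
      BARCHARTPARM X=group Y=mean ;
      SCATTER X=group Y=y ;
    ENDLAYOUT ;
  END ;
RUN ;
/* Add the group mean to every row of the input data */
PROC SQL ;
  CREATE TABLE work.data AS
    SELECT *, MEAN(&y) AS y_mean
    FROM &table
    GROUP BY &group
  ;
QUIT ;
/* Render the template, binding the dynamic variables to column names */
ODS HTML GPATH="c:\temp" ;
DATA _NULL_ ;
  SET work.data ;
  FILE PRINT ODS=(TEMPLATE="barScatter" DYNAMIC=(group="&group" y="&y" mean="y_mean")) ;
  PUT _ODS_ ;
RUN ;
ODS HTML CLOSE ;
[/pre]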
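
For SAS 9.2 and later, here is a hedged sketch of what the "slightly different syntax" might look like. It assumes the work.data table and the &group and &y macro variables from the code above. The template name barScatter92 is made up, and the group mean repeats on every row of work.data, so the bars for a group are expected to be drawn on top of each other at the same height.

[pre]
/* Production GTL wraps the layout in BEGINGRAPH/ENDGRAPH and uses
   SCATTERPLOT for the point overlay. */
proc template;
  define statgraph barScatter92;
    dynamic group y mean;
    begingraph;
      layout overlay;
        barchartparm x=group y=mean;
        scatterplot  x=group y=y;
      endlayout;
    endgraph;
  end;
run;

/* PROC SGRENDER replaces the DATA _NULL_ / FILE PRINT ODS step. */
proc sgrender data=work.data template=barScatter92;
  dynamic group="&group" y="&y" mean="y_mean";
run;
[/pre]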

Good luck!

Olivier
