12-09-2015 06:59 AM - edited 12-09-2015 07:06 AM
I've been working on a forest plot and although it's in pretty good shape, there are some minor things I'm trying to fix:
What it should resemble (broadly - different data):
1. Is there a way to change the rectangles on the SAS graph to squares, as in the graph above? That is, when the weight is bigger, there's a bigger square rather than a wider rectangle?
Alternatively, is it possible to make the blue rectangles mostly transparent, so the red line can be seen? Then I could add a dot using Paint to indicate where the HG value is. (Also, is it possible to make the whole graph greyscale?)
2. Is it possible to stretch the diamond so it starts at the lower CL and finishes at the upper CL? Looking at the code, I'm guessing there'd be no automated way to do this, so I would have to do it after the fact using Paint.
Thanks for any help/pointers on where the code could be changed - have been playing around with it, but as a relatively new user, haven't gotten too far.
data forest;
   input Study $1-18 grp OddsRatio LowerCL UpperCL Weight;
   format weight percent5. Q1 Q3 4.2 oddsratio lowercl uppercl 5.3;
   ObsId=_N_;
   OR='HG';
   LCL='LCL';
   UCL='UCL';
   WT='Weight';
   if grp=1 then do;
      weight=weight*.01;
      Q1=OddsRatio-OddsRatio*weight;
      Q3=OddsRatio+OddsRatio*weight;
      lcl2=lowercl;
      ucl2=uppercl;
   end;
   else study2=study;
   datalines;
Study1            1 -0.170 -0.901  0.560  9.9
Study2            1  0.354 -0.530  1.237  6.8
Study3            1 -0.204 -1.186  0.779  5.5
Study4            1 -0.848 -1.842  0.146  5.4
Study5            1  0.495 -0.179  1.169 11.6
Study 6           1 -0.058 -0.568  0.453 20.3
Study 7           1 -0.345 -1.191  0.500  7.4
Study 8           1 -0.272 -1.335  0.792  4.7
Study 9           1  0.000 -0.494  0.494 21.7
Study 10          1  0.479 -0.410  1.368  6.7
Overall           2 -0.010 -0.240  0.220  .
;
run;

proc sort data=forest out=forest2;
   by descending obsid;
run;

/* Add sequence numbers to each observation */
data forest3;
   set forest2 end=last;
   retain fmtname 'Study' type 'n';
   studyvalue=_n_;
   if study2='Overall' then study2value=1;
   else study2value = .;

   /* Output values and formatted strings to data set */
   label=study;
   start=studyvalue;
   end=studyvalue;
   output;
   if last then do;
      hlo='O';
      label='Other';
   end;
run;

/* Create the format from the data set */
proc format library=work cntlin=forest3;
run;

/* Apply the format to the study values and remove Overall from Study column. */
/* Compute the width of the box proportional to weight in log scale. */
data forest4;
   format studyvalue study2value study.;
   drop fmtname type label start end hlo pct;
   set forest3 (where=(studyvalue > 0)) nobs=nobs;

   /* Compute marker width */
   c=0.5;   /* Factor to adjust absolute marker width */
   x1=oddsratio - c*weight;
   x2=oddsratio + c*weight;

   /* Compute top and bottom offsets */
   if _n_ = nobs then do;
      pct=0.75/nobs;
      call symputx("pct", pct);
      call symputx("pct2", 2*pct);
      call symputx("count", nobs);
   end;
run;

ods listing close;
ods html image_dpi=100 path="." file='sgplotforest.html';
ods graphics / reset width=600px height=400px
               imagename="Forest_Plot_Vector" imagefmt=gif;

title "Outcome variable";
title2 h=8pt 'Hedges G and 95% CI';

proc sgplot data=forest4 noautolegend;
   scatter y=study2value x=oddsratio /
           markerattrs=graphdata2(symbol=diamondfilled size=10);
   scatter y=studyvalue x=oddsratio / xerrorupper=ucl2 xerrorlower=lcl2
           markerattrs=graphdata1(symbol=squarefilled size=0);
   vector x=x2 y=studyvalue / xorigin=x1 yorigin=studyvalue
          lineattrs=graphdata1(thickness=8) noarrowheads;
   scatter y=studyvalue x=or  / markerchar=oddsratio x2axis;
   scatter y=studyvalue x=lcl / markerchar=lowercl x2axis;
   scatter y=studyvalue x=ucl / markerchar=uppercl x2axis;
   scatter y=studyvalue x=wt  / markerchar=weight x2axis;
   refline 0 / axis=x lineattrs=(pattern=shortdash) transparency=0.5;
   inset ' Favours Sham' / position=bottomleft;
   inset 'Favours Active' / position=bottom;
   xaxis offsetmin=0 offsetmax=0.35 min=-2 max=2 minor display=(nolabel);
   x2axis offsetmin=0.7 display=(noticks nolabel);
   yaxis display=(noticks nolabel) offsetmin=0.1 offsetmax=0.05
         values=(1 to &count by 1);
run;

ods html close;
ods listing;
12-09-2015 08:04 AM
You could just overlay another scatter, see example:
12-09-2015 12:58 PM
With SAS 9.3, you can use the GTL code linked by RW9. SG Scatter plot may not support this option.
With SAS 9.3, HighLow plot can be used to show the weight as the marker length, but with constant width.
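A rough sketch of the HighLow idea, assuming the forest4 data set and the x1, x2, lcl2 and ucl2 variables from the original post. The barwidth and transparency values are arbitrary choices, and option availability may differ between SAS releases, so treat this as a starting point rather than tested code:

```sas
/* Sketch only: a HIGHLOW bar in place of the VECTOR statement.          */
/* Assumes forest4 (with x1, x2, lcl2, ucl2) from the original post.     */
proc sgplot data=forest4 noautolegend;
   /* The bar runs from x1 to x2, so its length still encodes the weight; */
   /* transparency lets the confidence-interval line show through.        */
   highlow y=studyvalue low=x1 high=x2 / type=bar barwidth=0.4
           fillattrs=graphdata1 transparency=0.6;
   /* Error bars plus a small dot marking the HG value */
   scatter y=studyvalue x=oddsratio / xerrorupper=ucl2 xerrorlower=lcl2
           markerattrs=(symbol=circlefilled size=5);
   /* Drawn last, so the Overall diamond covers the dot on that row */
   scatter y=study2value x=oddsratio /
           markerattrs=graphdata2(symbol=diamondfilled size=10);
   refline 0 / axis=x lineattrs=(pattern=shortdash);
run;
```

The markerchar columns and axis statements from the original PROC SGPLOT step can be added back unchanged.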
If you must do this with SGPLOT, some people have used multiple overlaid scatter plot statements, each with its own marker size.
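One way this overlay trick could look, as a sketch: split the studies into a few weight classes and draw each class with its own SCATTER statement. The data set forest5, the variables hg_small, hg_medium and hg_large, the cut-offs and the pixel sizes are all made up for illustration; only forest4 and its variables come from the original post.

```sas
/* Sketch only: approximate weight-dependent squares with size classes.  */
/* forest5 and the hg_* variables are hypothetical names.                */
data forest5;
   set forest4;
   /* weight is stored as a fraction (e.g. 0.099) for individual studies */
   /* and is missing for the Overall row, which therefore gets no square */
   if      weight >= 0.15 then hg_large  = oddsratio;
   else if weight >= 0.08 then hg_medium = oddsratio;
   else if weight >  .    then hg_small  = oddsratio;
run;

proc sgplot data=forest5 noautolegend;
   /* Error bars only (marker hidden with size=0, as in the original) */
   scatter y=studyvalue x=oddsratio / xerrorupper=ucl2 xerrorlower=lcl2
           markerattrs=graphdata1(size=0);
   /* One scatter per weight class, each with its own marker size */
   scatter y=studyvalue x=hg_small  /
           markerattrs=graphdata1(symbol=squarefilled size=6px);
   scatter y=studyvalue x=hg_medium /
           markerattrs=graphdata1(symbol=squarefilled size=10px);
   scatter y=studyvalue x=hg_large  /
           markerattrs=graphdata1(symbol=squarefilled size=14px);
   scatter y=study2value x=oddsratio /
           markerattrs=graphdata2(symbol=diamondfilled size=10);
   refline 0 / axis=x lineattrs=(pattern=shortdash);
run;
```

With only three classes the square sizes are a coarse approximation of the weights; more classes give a finer gradation at the cost of more SCATTER statements.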
12-09-2015 08:15 PM
12-11-2015 12:45 PM
One of the classic references on good practices in statistical graphics, The Visual Display of Quantitative Information (2nd ed.) by Edward Tufte, recommends: "The number of information-carrying (variable) dimensions depicted should not exceed the number of dimensions in the data." (p. 71).
According to this principle it would be better to represent the weights by the lengths of rectangles (of fixed height), as you've done it already, rather than by the areas of squares. (E. Tufte shows a graph with circles whose sizes represent city populations as a bad example.) Indeed, it's easier for the human eye to recognize that one rectangle is, e.g., twice as long as another rectangle than to see that one square has twice the area of another square (i.e. a side length that is 1.4142... times as large).
As to the confidence interval of the overall population, I would depict it in a similar way as the others, maybe with thicker lines or in a different color. With a (slightly) stretched diamond it might be less obvious that the diamond width has that particular meaning.
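As a sketch of that suggestion, the overall interval could be drawn as a thicker line with a HIGHLOW statement. This assumes the forest4 data set from the original post; the thickness value is arbitrary and I have not run this against the actual data:

```sas
/* Sketch only: show the Overall confidence interval as a thick line. */
/* Assumes forest4 from the original post.                            */
proc sgplot data=forest4 noautolegend;
   /* Individual studies: error bars with the marker hidden (size=0) */
   scatter y=studyvalue x=oddsratio / xerrorupper=ucl2 xerrorlower=lcl2
           markerattrs=graphdata1(size=0);
   /* study2value is non-missing only for the Overall row, */
   /* so only that interval is drawn here                  */
   highlow y=study2value low=lowercl high=uppercl / type=line
           lineattrs=graphdata2(thickness=3);
   scatter y=study2value x=oddsratio /
           markerattrs=graphdata2(symbol=diamondfilled size=10);
   refline 0 / axis=x lineattrs=(pattern=shortdash);
run;
```

The weight rectangles, markerchar columns and axis statements from the original step can be kept as they are.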
To get rid of the colors you could add style=journal after ods html. If you want to fine-tune the grey scales (cf. http://www.amadeus.co.uk/sas-training/tips/6/1/99/50-shades-of-grey.php), I'm sure we could find a way to do this, too.
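For example, the ODS statements from the original post would then look roughly like this (only style=journal is new; everything else is copied from the question):

```sas
/* Sketch: the original ODS HTML statement with the Journal style added */
ods listing close;
ods html style=journal image_dpi=100 path="." file='sgplotforest.html';

/* ... ods graphics, titles and PROC SGPLOT as in the original post ... */

ods html close;
ods listing;
```

Because the plot statements refer to style elements (graphdata1, graphdata2) rather than fixed colors, they should pick up the grey shades of the Journal style without further changes.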