For a long time when I first started using SAS/Graph, I ignored Annotate. I read that section of the manual (yes, I'm old; back in the day, it was printed in a book, on paper) and kept asking myself "Who cares? Who would want to do that?"
But one day, I actually tried to do things and found a real treasure in the annotate facility. YOU CAN USE IT TO DO ANYTHING.
I think the use of annotate is greatly hindered by the aseptic and insipid examples in the documentation; what is really needed are data-driven, not hard-coded, examples that show what annotate can do.
So, I thought I'd share. Attached is a plot that I recently made. It shows a graphic picture of a stratified ordinal data situation, essentially the Cochran-Mantel-Haenszel set-up. The plot is generated completely in annotate; the plot shown is one of 8 plots that are generated by the macro that draws the plot.
The data are ordinal outcome data where a subject can score between 0 and 10 (collapsed to four groups (0, 1-3, 4-6, 7-10)) on item XYZ at week 24. The distribution of the outcome scores across all the subjects is shown in the upper left by the four grey rectangles in the row labeled "All". In this row, we see that 45% of the subjects scored 0, 37% scored 1-3, 13% scored 4-6, and just 4% scored 7-10. The area of the grey rectangle is proportional to the outcome proportion.
But the subjects also had a baseline score in the study; this facet of the study stratifies the subjects. The Latin word strata means "layers". Thus, our week 24 distribution is actually a mixture of distributions based on the baseline values. We can split apart (in the Greek, we can analyze) the week 24 distribution by the baseline score and we get the four rows below the "All" row. Here we see that, for those subjects who scored 0 at baseline, 81% of them scored 0 at week 24, 15% scored 1-3, 4% scored 4-6, and no one scored 7-10. But for subjects who scored 1-3 at baseline, only 33% scored 0, while 54% scored 1-3, and so on. Similar details can be read for the 4-6 and 7-10 baseline categories.
Note that our graphic mirrors the study set-up: the strata (layers) in the study are shown as layers in the plot. "The data display is the model."
The heights of the rectangles in the baseline strata are proportional to the number of subjects in each stratum. Thus, if you sum the areas of the grey boxes in a column vertically across the strata, that sum will be the area of the box at the top.
The larger rectangles down the main diagonal show that baseline is predictive of outcome, but they don't tell the whole story. We see that subjects tended to stay steady or drop in score between baseline and week 24, as the boxes at and below the main diagonal are bigger than those above the main diagonal.
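The area property can be checked with a small sketch. This is Python rather than SAS, the counts are made up (they are not the study data), and it assumes each box's width is the within-row proportion while its height is the row's share of all subjects; the names `counts`, `box_sizes`, and `area` are hypothetical.

```python
# Hypothetical counts (NOT the study data): baseline stratum -> week-24 counts
# for the outcome groups 0, 1-3, 4-6, 7-10.
counts = {
    "0":    [40, 8, 2, 0],
    "1-3":  [20, 30, 8, 2],
    "4-6":  [5, 10, 12, 3],
    "7-10": [1, 3, 4, 2],
}

def box_sizes(row, total_n):
    """Return one (width, height) pair per cell of a row.

    Assumption: width is the within-row proportion and height is the
    row's share of all subjects, so area is proportional to the cell count.
    """
    n = sum(row)
    height = n / total_n
    return [(c / n, height) for c in row]

def area(box):
    width, height = box
    return width * height

total_n = sum(sum(row) for row in counts.values())
# The "All" row pools the strata column by column.
all_row = [sum(col) for col in zip(*counts.values())]

strata_boxes = {name: box_sizes(row, total_n) for name, row in counts.items()}
all_boxes = box_sizes(all_row, total_n)

# Sum the box areas down each column of the strata and compare to the "All" row.
column_area_sums = [
    sum(area(strata_boxes[name][j]) for name in counts)
    for j in range(len(all_row))
]
all_areas = [area(box) for box in all_boxes]

for col_sum, top in zip(column_area_sums, all_areas):
    print(f"strata sum = {col_sum:.4f}   All box = {top:.4f}")
```

Under these assumptions each box's area reduces to its cell count divided by the total, which is why the column sums match the "All" row.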
But the study sought to compare two treatments: ABC and DEF. The red and blue boxes on the right side of the plot allow one to compare the distributions for the two treatments. The top row sums across all baseline levels and reveals 202 subjects assigned to ABC and 182 assigned to DEF. We see very similar outcomes between the treatments pooled across baseline levels, but the strata reveal some nuances.
The red group (ABC) has 76 subjects in the 0 baseline group, compared to 64 in the blue group. The baseline 0 stratum shows that the red treatment is pushed slightly to the left of the blue group, as there is more mass in the 0 and 1-3 groups for red than for blue; that difference is filled by the excess blue subjects with a 4-6 score.
(As before, the blue and red bars are proportional to sample size and sum up in the columns across the rows.)
Other interesting differences can be parsed from the other strata.
A CMH test essentially compares the treatments within each stratum and then pools those comparisons to create the grand test. Our data display allows that to be seen in graphical form.
But all this is made possible through annotate. I just took the counts and percentages and told SAS to "go here; draw this; according to these data; according to these color rules". Everything here was done with just the "bar", "move", and "label" functions and it is all data driven. There is no hard coding to get the labels or boxes in the right places.
These are the kinds of examples that should be in the annotate documentation, not some lame bit of code that draws a star on top of a bar by hard coding the coordinates.
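The "go here; draw this; according to these data" idea described above can be sketched roughly as follows. This is not the macro from the post and not SAS code: it is a Python illustration with made-up counts, where each dictionary mimics one Annotate observation (FUNCTION, X, Y, COLOR, TEXT). The layout parameters (`x0`, `y_top`, `cell_w`, `band_h`) and the function name `build_records` are hypothetical; in SAS the same logic would presumably live in a DATA step that writes the Annotate data set.

```python
# Hypothetical counts (NOT the study data): baseline stratum -> week-24 counts
# for the outcome groups 0, 1-3, 4-6, 7-10.
counts = {
    "0":    [40, 8, 2, 0],
    "1-3":  [20, 30, 8, 2],
    "4-6":  [5, 10, 12, 3],
    "7-10": [1, 3, 4, 2],
}

def build_records(counts, x0=20.0, y_top=90.0, cell_w=18.0, band_h=60.0,
                  color="gray"):
    """Turn a table of counts into move/bar/label records.

    Every coordinate is computed from the data; nothing is placed by hand.
    """
    total = sum(sum(row) for row in counts.values())
    records = []
    y = y_top                      # top edge of the current stratum row
    for stratum, row in counts.items():
        n = sum(row)
        h = band_h * n / total     # row height tracks stratum size
        # Row label to the left of the first cell, vertically centered.
        records.append({"function": "label", "x": x0 - 3, "y": y - h / 2,
                        "color": "black", "text": stratum})
        for j, c in enumerate(row):
            left = x0 + j * cell_w
            if c > 0:
                # A box is a MOVE to one corner, then a BAR to the opposite one.
                records.append({"function": "move", "x": left, "y": y - h,
                                "color": color, "text": ""})
                records.append({"function": "bar", "x": left + cell_w * c / n,
                                "y": y, "color": color, "text": ""})
            # Percentage label for the cell, also derived from the counts.
            records.append({"function": "label", "x": left + cell_w / 2,
                            "y": y - h / 2, "color": "black",
                            "text": f"{100 * c / n:.0f}%"})
        y -= h                     # the next stratum starts where this one ended
    return records

records = build_records(counts)
for rec in records[:6]:
    print(rec)
```

Changing the counts, or adding a stratum, moves every box and label without touching the drawing logic, which is the sense in which the plot is data driven.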
Anyway, I hope everyone has a great afternoon.
$0.02,
Rafe
Nice idea.
To make this really useful you should provide example data and the code.
Otherwise this is somewhat more confusing than the antiseptic minimalist examples in the documentation.
Having taught myself SAS/Graph from two very thick books back with SAS 6.8, I do appreciate that there is some difficulty making the connection between simple code examples and real-world examples.
But some of the things that absolutely needed annotate in some prior versions become simple plots in SGPLOT with the help of Text plot (for placing interesting text at coordinates) and others.
Agreed. The code is rather complex and lengthy and has too many confidential things inside to be able to include at the moment. A post that discusses the code and the data would take me even longer to develop than that last one did.
But I'll work on it; it just isn't going to be this week...
@rafeman wrote:
Agreed. The code is rather complex and lengthy and has too many confidential things inside to be able to include at the moment. A post that discusses the code and the data would take me even longer to develop than that last one did.
But I'll work on it; it just isn't going to be this week...
Just provide the Annotate data set and the variables used for the plot. Easy:
Instructions here: https://communities.sas.com/t5/SAS-Communities-Library/How-to-create-a-data-step-version-of-your-dat... will show how to turn an existing SAS data set into data step code that can be pasted into a forum code box using the </> icon or attached as text to show exactly what you have and that we can test code against.
Then the code used to display the data using that data.
Sanitize any sensitive variable names, labels or formats (if needed).
Or post to a GitHub repo, which seems to be the "hip" way to share complex projects that require multiple files.
It's always great seeing what can be produced by the various subsystems of the SAS graphics: SG procedures, GTL, and SG Annotation. I would have created those graphs by using the Graph Template Language (GTL), which you didn't mention, but I am fond of saying that SAS often provides several ways to obtain the same result. If straight annotation appeals to you, that is certainly a powerful tool for creating highly customizable graphs. It isn't clear to me whether you are using the traditional SAS/GRAPH annotation or the newer SGAnnotate facility.
A few additional comments and links to resources:
1. @GraphGuy has spent years posting complex examples that use ANNOTATE. He often uses hard-coded values for convenience (he is extremely prolific), but many of his examples could become data-driven by including a data-to-annotate step. In the past few years, he has started using SG and SGAnnotate.
2. IMHO, the King of Data-driven Annotation is @WarrenKuhfeld . He has traveled around the world giving courses and writing books and papers about advanced ODS graphics. He uses programming to actually WRITE TEMPLATES and also incorporates annotation into his graphs. I highly recommend his book on Advanced ODS Graphics for when standard SG graphs and GTL do not meet your needs.
3. On a philosophical note, I think programmers should use the right tools for the right projects. If you do everything in ANNOTATE, you are always dealing with graphic primitives. In my work, I use the standard SG procedures or GTL when they apply and use the ANNOTATE facility to augment the graphs only when necessary.
Thanks for sharing.
Welcome to the "custom graph club" rafeman! The ability to customize, or even build totally custom graphs (as you did), using data-driven annotate is one of the big advantages of SAS, imho. 🙂