This is very easy in R. I am frustrated that I can't just do it in SAS; I would expect paid software to be able to do it.
I want many histograms of the same variable by groups (class variables, in SAS terminology), all in the same chart rather than one below the other. Sometimes you want them stacked, sometimes not.
I'm not certain of the form of your data. Do you have multiple variables X1, X2, X3,.... or do you have one variable and a classification variable that define subpopulations (X, and Category)? Example data would be helpful.
I've written a few blog posts on this topic:
My data is very simple: one row per person, and each row has multiple columns. One of them defines the class (a numeric value with a format), and another holds the variable I want the histogram or density for, say ATM withdrawals. I have seen your posts and I am sure they work. I am commenting as a SAS user who also uses R: it is unnecessarily difficult in SAS to have to call a template to do this when in R you just call the graph function and get it done. My point is that sgplot or sgpanel should be able to do it with some defined option.
What SAS does with my comment is really up to them. They can ignore it and find one day that we are all using R, or they can improve the procedures and maintain supremacy.
Why should I write complicated code when I can choose simple code?
I understand your concern. I made the same points when I first encountered this situation in late 2011, and I passed on my comments and suggestions to the SAS graphics group. I am sure that they are aware of these concerns. The next release of the ODS SG Procedures is SAS 9.4, which is scheduled for later this year.
As Rick said, there are ways to do this that do not require GTL template code. SGPLOT can do the overlaid histograms with a few lines of code once your data is in columns, and SGPANEL can do the class panel. As we see more interest in creating grouped histograms, we can certainly add the GROUP or CLASS feature to the HISTOGRAM statement. In the meantime, here are a couple of links from previous articles.
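For the class panel case, a minimal sketch might look like the following. It uses SASHELP.CARS as a stand-in for the real data, with Origin playing the role of the class variable and MPG_City the analysis variable; both choices are assumptions for illustration only.

```sas
/* Sketch: one histogram per class level, stacked as rows of one panel.   */
/* SASHELP.CARS, Origin, and MPG_City are stand-ins for the real data.    */
proc sgpanel data=sashelp.cars;
  panelby Origin / layout=rowlattice;   /* one row per class level        */
  histogram MPG_City;                   /* histogram within each cell     */
  density MPG_City / type=kernel;       /* optional kernel density curve  */
run;
```

Because the cells share a common horizontal axis in a row lattice, the distributions can be compared by reading down the panel.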
As Sanjay says, if you have individual variables, this is straightforward with SGPLOT. The complexities of my blog posts were because I wanted to support a class (grouping) variable. Apparently the OP is in the same situation, which means that currently he needs to use some DATA step or SQL code to reshape the data so that each category becomes a new variable (see "Reshape data so that each category becomes a new variable" on The DO Loop).
@GDA My recommendation is to convert your data from the "long" to the "wide" format as I describe in my "Reshape data" post, and then use Dan's %OverlayHist macro to replicate the HISTOGRAM statement in PROC SGPLOT as many times as you need.
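As a rough illustration of the long-to-wide idea (not the %OverlayHist macro itself), the sketch below splits one analysis variable into one column per category with a DATA step and then overlays the columns. SASHELP.CARS, the variable names MPG_Asia, MPG_Europe, and MPG_USA, and the transparency value are all assumptions, and it presumes a release (such as SAS 9.3) in which PROC SGPLOT accepts more than one HISTOGRAM statement.

```sas
/* Sketch: reshape "long" data (class variable + value) to "wide" data    */
/* (one column per class level). Names here are illustrative only.        */
data cars_wide;
  set sashelp.cars(keep=Origin MPG_City);
  if Origin = 'Asia' then MPG_Asia = MPG_City;
  else if Origin = 'Europe' then MPG_Europe = MPG_City;
  else if Origin = 'USA' then MPG_USA = MPG_City;
run;

/* Overlay one semi-transparent histogram per new column.                 */
proc sgplot data=cars_wide;
  histogram MPG_Asia   / transparency=0.6;
  histogram MPG_Europe / transparency=0.6;
  histogram MPG_USA    / transparency=0.6;
  xaxis label="MPG (City)";
run;
```

Each row has a value in only one of the three new columns; the others stay missing and are ignored by the corresponding HISTOGRAM statement.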
There are those who would say that overlaying histograms does not work very well beyond 2-3 variables and that we should consider whether "comparative histograms" would visualize the data better. I say that I agree...usually. However, in certain special situations I've wanted to overlay dozens of histograms. See "Eigenvalues of a random symmetric matrix: A simulation approach."
Probably the easiest way to do it is to use a Proc Gplot 'needle' plot ... this makes it very-very easy to overlay the plots, and if you've got SAS 9.3 then you can even take advantage of alpha-transparent colors so that the overlapping bars get the combined-color effect (eg, yellow+blue=green).
Here's a simple example:
data foo; set sashelp.stocks
(where=(date>='01jan1990'd and date<'01jan1996'd and stock in ('IBM' 'Intel')));
run;
axis1 label=none order=('01jan1990'd to '01jan1996'd by year);
symbol1 value=none interpol=needle width=4 color=AFFE60077;
symbol2 value=none interpol=needle width=4 color=A42C0FB77;
title "Gplot 'needle' plots (with overlapping transparent colors)";
proc gplot data=foo;
plot close*date=stock / haxis=axis1 vzero;
run;
quit;
Or if you've got separate variables, you can use the other gplot syntax (instead of plot y*x=z) ...
plot y1*x y2*x / overlay;
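To make the separate-variables form concrete, here is a sketch that rebuilds the stocks example with one column per stock and then uses the OVERLAY syntax. The data set names stocks_sorted and stocks_wide are made up for this illustration; the SYMBOL and AXIS settings are copied from the example above.

```sas
/* Sketch: put each stock's closing price in its own column (IBM, Intel). */
proc sort data=sashelp.stocks out=stocks_sorted;
  where date >= '01jan1990'd and date < '01jan1996'd
        and stock in ('IBM' 'Intel');
  by date;
run;

proc transpose data=stocks_sorted out=stocks_wide(drop=_name_);
  by date;
  id stock;      /* values of STOCK become the new variable names */
  var close;
run;

axis1 label=none order=('01jan1990'd to '01jan1996'd by year);
symbol1 value=none interpol=needle width=4 color=AFFE60077;
symbol2 value=none interpol=needle width=4 color=A42C0FB77;

/* Two y*x requests drawn on the same axes. */
proc gplot data=stocks_wide;
  plot IBM*date Intel*date / overlay haxis=axis1 vzero legend;
run;
quit;
```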
Thanks for the help, but I am not able to replicate what you did. I am confused, as you seem to need 3 dimensions in your chart (the date, the stock and the close), whereas I only have 2.
Here is a summary of my data (the relevant columns only). I would like overlaid histograms/densities of Value for each Group, so on the y axis I would see how many HHIDs had 4, or 5, or 4-6, depending on the histogram bins. Overall I have 6 groups (and I may filter some out). Values start at zero, and there are quite a few of those, and go up to maybe 100 or so.
OBS | HHID | Group | Value |
---|---|---|---|
229 | 200007518 | Sheetz Only | 4 |
415 | 200010892 | Sheetz Only | 2 |
755 | 200027228 | Sheetz and Any Other | 1 |
1071 | 200042500 | Sheetz and Any Other | 2 |
1089 | 200042790 | Sheetz and Any Other | 8 |
1177 | 200044180 | Sheetz and Any Other | 1 |
1244 | 200045652 | Sheetz and Any Other | 6 |
1955 | 200052426 | Sheetz and Any Other | 1 |
2019 | 200052940 | Sheetz and Any Other | 4 |
2159 | 200054454 | Sheetz and Any Other | 1 |
I am starting to see how the plot procedure works, but I maintain that it is more complex than needed, so people do not use its full power, which is a shame.
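For data shaped like the table above, the needle-plot approach needs a third quantity, the count, which has to be computed first. One possible sketch: tabulate how many households fall at each Value within each Group, then plot count against value with one needle series per group. The data set names hh and counts are made up here, the ten rows are just the sample posted above, and each distinct Value acts as its own bin (no wider binning is attempted).

```sas
/* Sketch: the ten sample rows from the table, read as pipe-delimited text. */
data hh;
  infile datalines dlm='|' truncover;
  length Group $ 24;
  input HHID Group $ Value;
  datalines;
200007518|Sheetz Only|4
200010892|Sheetz Only|2
200027228|Sheetz and Any Other|1
200042500|Sheetz and Any Other|2
200042790|Sheetz and Any Other|8
200044180|Sheetz and Any Other|1
200045652|Sheetz and Any Other|6
200052426|Sheetz and Any Other|1
200052940|Sheetz and Any Other|4
200054454|Sheetz and Any Other|1
;
run;

/* Count households per Group and Value; COUNT is created by PROC FREQ. */
proc freq data=hh noprint;
  tables Group*Value / out=counts(keep=Group Value Count);
run;

symbol1 value=none interpol=needle width=4 color=AFFE60077;
symbol2 value=none interpol=needle width=4 color=A42C0FB77;

/* y = count, x = value, one overlaid needle series per group. */
proc gplot data=counts;
  plot Count*Value=Group / vzero;
run;
quit;
```

With all six groups, four more SYMBOL statements would be needed, one per additional group.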
Could you post a copy of your R graph (or if you don't have an R graph, maybe one drawn by hand)?