I am trying to figure out the best way to get multiple histograms/densities on the same graph. The number of histograms/densities will be a dynamic number so I cannot hard code. I was hoping that the histogram/density statements took a "group" option similar to the series/scatter statements in sgplot, but that's a big negative.
Any simple solutions?
thanks
In the past I created GREPLAY templates that would accommodate various numbers of histograms on the same page. The advantage is that when you run the graphs through the GREPLAY template you do not have to assign all the panels.
Today, in SAS 9.2 phase 2, I would create various templates using %sgdesign and then use macro language to determine the number of histograms and thus the appropriate template to use.
The issue is the min and max number of histograms that you want to display.
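As a rough sketch of the "use macro language to determine the number of histograms" step, the group count can be stored in a macro variable with PROC SQL. The data set (sashelp.class), the grouping variable (Sex), and the macro variable name ngroups are stand-ins chosen for illustration, not part of the original post; how the count is then mapped to a template is left open.

```sas
/* Count the distinct groups and store the count in a macro variable. */
/* sashelp.class and Sex are illustrative stand-ins for the real data. */
proc sql noprint;
   select count(distinct Sex) into :ngroups
   from sashelp.class;
quit;

/* Re-assigning with %LET strips the leading blanks left by INTO */
%let ngroups = &ngroups;
%put NOTE: &ngroups groups found;
```

The value of &ngroups could then drive a %IF/%DO block that picks the matching template.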
Thanks!
Dan
If there's no simple way to do this I'll make it work. Just thought I'd check.
[pre]
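%macro OverlayHist(ds);
%local i;
/* List the variables in &ds; NOPRINT suppresses the printed report */
proc contents data=&ds out=vars noprint; run;
data _null_;
set vars end=_last_;
/* Number only the numeric variables (type = 1) so that */
/* varname1..varname&dimvars has no gaps                */
if (type = 1) then do;
   n + 1;
   CALL SYMPUTX(CATS("varname", n), name);
end;
if (_last_) then CALL SYMPUTX("dimvars", n);
run;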
proc sgplot data=&ds;
%do i=1 %to &dimvars;
histogram &&varname&i / transparency=0.5;
%end;
run;
%mend;
/* Generate the output */
data class;
set sashelp.class (keep=weight height);
run;
%OverlayHist(class);
[/pre]
This is very easy in R, and I am frustrated that I can't just do it in SAS. I would expect paid software to be able to do it.
What I want is many histograms of the same variable by group (class variables, in SAS terminology), all in the same chart rather than one below another. Sometimes you want them stacked, sometimes not.
I'm not certain of the form of your data. Do you have multiple variables X1, X2, X3,.... or do you have one variable and a classification variable that define subpopulations (X, and Category)? Example data would be helpful.
I've written a few blog posts on this topic:
- For overlaying histograms of two variables, see Overlaying two histograms in SAS - The DO Loop
- For overlaying densities, see Overlay density estimates on a plot - The DO Loop
- If you have a classification variable that defines the groups, here's how you can change the data structure so that each category has its own variable: Reshape data so that each category becomes a new variable - The DO Loop
My data is very simple: one row per person, with multiple columns. One of them is the variable that defines the class (a numeric value with a format), and another is the variable I want the histogram or density for, say ATM withdrawals. I have seen your posts and I am sure they work. I am commenting as a SAS user who also uses R: it is unnecessarily difficult in SAS to have to call a template for this when in R you just call the graph function and get it done. My point is that sgplot or sgpanel should be able to do it with some defined option.
What SAS does with my comment is really up to them. They can ignore it and find one day that we are all using R, or they can improve the procedures and maintain supremacy.
Why should I write complicated code when I can choose simple code?
I understand your concern. I made the same points when I first encountered this situation in late 2011, and I passed on my comments and suggestions to the SAS graphics group. I am sure that they are aware of these concerns. The next release of the ODS SG Procedures is SAS 9.4, which is scheduled for later this year.
As Rick said, there are ways to do this that do not require GTL template code. SGPLOT can do the overlaid histograms with a few lines of code once your data is in columns, and SGPANEL can do the class panel. As we see more interest in creating grouped histograms, we can certainly add the GROUP or CLASS feature to the HISTOGRAM statement. In the meantime, here are a couple of links from previous articles.
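For the class-panel case, a minimal SGPANEL sketch might look like the following. It uses sashelp.class with Sex as the class variable and Height as the analysis variable, which are stand-ins for illustration rather than the poster's data.

```sas
/* One histogram cell per level of the class variable, stacked in one column. */
/* sashelp.class, Sex, and Height are illustrative stand-ins.                 */
proc sgpanel data=sashelp.class;
   panelby Sex / columns=1;
   histogram Height;
   density Height / type=kernel;
run;
```

Stacking the cells in a single column keeps the horizontal axes aligned, which makes the distributions easier to compare than a side-by-side layout.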
As Sanjay says, if you have individual variables, this is straightforward with SGPLOT. The complexities of my blog posts arose because I wanted to support a class (grouping) variable. Apparently the OP is in the same situation, which means that currently he needs to use some DATA step or SQL code to reshape the data; see Reshape data so that each category becomes a new variable - The DO Loop
@GDA My recommendation is to convert your data from the "long" to the "wide" format as I describe in my "Reshape data" post, and then use Dan's %OverlayHist macro to replicate the HISTOGRAM statement in PROC SGPLOT as many times as you need.
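One way to sketch the long-to-wide step is with PROC TRANSPOSE; note that this is a swapped-in technique, not the DATA step or SQL approach from the "Reshape data" post. The example assumes one row per subject, and it uses sashelp.class (Name as the row identifier, Sex as the group, Height as the value) as a stand-in for the real data.

```sas
/* Long format: one row per student; Sex is the group, Height is the value. */
/* sashelp.class is an illustrative stand-in for the real data.             */
proc sort data=sashelp.class(keep=Name Sex Height) out=long;
   by Name;
run;

/* Wide format: one column per level of Sex (named F and M).  */
/* Each row has a value in exactly one of the new columns.    */
proc transpose data=long out=wide(drop=_name_ Name);
   by Name;
   id Sex;
   var Height;
run;

/* Overlay one histogram per group */
proc sgplot data=wide;
   histogram F / transparency=0.5;
   histogram M / transparency=0.5;
run;
```

Because the wide data set contains only the per-group numeric columns, it could also be passed to the %OverlayHist macro posted earlier in the thread instead of writing the HISTOGRAM statements by hand.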
Some would say that overlaying histograms does not work very well beyond 2-3 variables and that we should consider whether "comparative histograms" would visualize the data better. I agree... usually. However, in certain special situations I've wanted to overlay dozens of histograms. See Eigenvalues of a random symmetric matrix: A simulation approach
Probably the easiest way to do it is to use a Proc Gplot 'needle' plot ... this makes it very-very easy to overlay the plots, and if you've got SAS 9.3 then you can even take advantage of alpha-transparent colors so that the overlapping bars get the combined-color effect (eg, yellow+blue=green).
Here's a simple example:
/* Subset the sample stock data to two stocks over six years */
data foo;
   set sashelp.stocks
      (where=(date>='01jan1990'd and date<'01jan1996'd and stock in ('IBM' 'Intel')));
run;
axis1 label=none order=('01jan1990'd to '01jan1996'd by year);
/* The leading A and trailing 77 make these colors alpha-transparent */
symbol1 value=none interpol=needle width=4 color=AFFE60077;
symbol2 value=none interpol=needle width=4 color=A42C0FB77;
title "Gplot 'needle' plots (with overlapping transparent colors)";
proc gplot data=foo;
   plot close*date=stock / haxis=axis1 vzero;
run;
quit;
Or if you've got separate variables, you can use the other gplot syntax (instead of plot y*x=z) ...
plot y1*x y2*x / overlay;
Thanks for the help, but I am not able to replicate what you did. I am confused, as you seem to need 3 dimensions in your chart (the date, the stock, and the close), whereas I only have 2.
Here is a summary of my data, relevant columns only. I would like overlaid histograms/densities of Value for each Group, so on the y axis I would see how many HHIDs had a value of 4, or 5, or 4-6, depending on the histogram bins. Overall I have 6 groups (and I may filter some out), and values go from zero (there are quite a few of those) to maybe 100 or so.
OBS | HHID | Group | Value |
---|---|---|---|
229 | 200007518 | Sheetz Only | 4 |
415 | 200010892 | Sheetz Only | 2 |
755 | 200027228 | Sheetz and Any Other | 1 |
1071 | 200042500 | Sheetz and Any Other | 2 |
1089 | 200042790 | Sheetz and Any Other | 8 |
1177 | 200044180 | Sheetz and Any Other | 1 |
1244 | 200045652 | Sheetz and Any Other | 6 |
1955 | 200052426 | Sheetz and Any Other | 1 |
2019 | 200052940 | Sheetz and Any Other | 4 |
2159 | 200054454 | Sheetz and Any Other | 1 |
I am starting to see how the plot procedure works, but I maintain it is more complex than needed and so people do not use its full power, which is a shame
Could you post up a copy of your R graph (or if you don't have an R graph, maybe one drawn by hand)?