GDA
Calcite | Level 5
Hi All

I am trying to figure out the best way to get multiple histograms/densities on the same graph. The number of histograms/densities will be a dynamic number so I cannot hard code. I was hoping that the histogram/density statements took a "group" option similar to the series/scatter statements in sgplot, but that's a big negative.

Any simple solutions?

thanks
barheat
Fluorite | Level 6
Are you trying to show multiple histograms separately on the same page or overlay one histogram atop the other? I assume the first.

In the past I created GREPLAY templates that would accommodate various numbers of histograms on the same page. The advantage is that when you run the graphs through the treplay template you do not have to assign all the panels.

Today, in SAS 9.2 Phase 2, I would create various templates using %sgdesign and then use the macro language to determine the number of histograms and thus the appropriate template to use.

The issue is the min and max number of histograms that you want to display.
DanH_sas
SAS Super FREQ
The easiest way to do this would be to use PROC SGPANEL. Put your "group" variable on the PANELBY statement and define your histogram as you would for SGPLOT. The procedure will create a histogram in a cell per group value. The procedure will also paginate to prevent the cells from getting too small, but you can override that behavior by specifying the ONEPANEL option on the PANELBY statement. Let me know if this is what you want.
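
A minimal sketch of that approach, assuming the SASHELP.CLASS sample data with Sex standing in for the "group" variable and Height for the analysis variable (neither comes from the original question):

```sas
/* One histogram cell per value of the PANELBY variable.            */
/* SASHELP.CLASS, SEX, and HEIGHT are stand-ins for the real data.  */
proc sgpanel data=sashelp.class;
   panelby sex / onepanel;   /* ONEPANEL keeps every cell on a single page */
   histogram height;         /* same syntax as HISTOGRAM in PROC SGPLOT    */
run;
```

Dropping ONEPANEL restores the default pagination when there are many group values.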

Thanks!
Dan
GDA
Calcite | Level 5
Actually I am trying to overlay them on top of each other. I've used sgpanel with the group and onepanel options and it's "ok" but not quite what I'm looking for. I'd really like to have all plots overlaid instead of individually created and laid out in a matrix type of output.

If there's no simple way to do this I'll make it work. Just thought I'd check.
DanH_sas
SAS Super FREQ
The following code is one way you can do it:

[pre]
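%macro OverlayHist(ds);
%local i;
/* List the variables in the data set (NOPRINT suppresses the report) */
proc contents data=&ds out=vars noprint; run;

/* Store each numeric variable name in varname1, varname2, ...      */
/* A separate counter avoids gaps when character variables exist.   */
data _null_;
set vars end=_last_;
if (type = 1) then do;
   n + 1;
   CALL SYMPUTX(CATS("varname",n),name);
end;
if (_last_) then CALL SYMPUTX("dimvars",n);
run;

/* Emit one semi-transparent HISTOGRAM statement per numeric variable */
proc sgplot data=&ds;
%do i=1 %to &dimvars;
histogram &&varname&i / transparency=0.5;
%end;
run;
%mend;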


/* Generate the output */
data class;
set sashelp.class (keep=weight height);
run;
%OverlayHist(class);
[/pre]
mariosegal
Calcite | Level 5

This is very easy in R. I am frustrated that I can't just do it in SAS; I would expect that paid software could do it.

I want many histograms of the same variable by groups (class variables, in SAS terminology), all in the same chart rather than below each other. Sometimes you want them below each other, sometimes not.

Rick_SAS
SAS Super FREQ

I'm not certain of the form of your data. Do you have multiple variables X1, X2, X3,... or do you have one variable and a classification variable that defines subpopulations (X and Category)? Example data would be helpful.

I've written a few blog posts on this topic:

mariosegal
Calcite | Level 5

My data is very simple: one row per person, with multiple columns. One of them is a variable that defines the class (a numeric value with a format), and another holds the variable I want the histogram or density for, say ATM withdrawals. I have seen your posts and I am sure they work. I am commenting as a SAS user who also uses R: it is unnecessarily difficult in SAS to have to call a template to do this when in R you just call the graph function and get it done. My point is that sgplot or sgpanel should be able to do it with some defined option.

What SAS does with my comment is really up to them: they can ignore it and find one day that we are all using R, or they can improve the procedures and maintain supremacy.

Why should I write complicated code when I can choose simple code?


Rick_SAS
SAS Super FREQ

I understand your concern.  I made the same points when I first encountered this situation in late 2011, and I passed on my comments and suggestions to the SAS graphics group.  I am sure that they are aware of these concerns.  The next release of the ODS SG Procedures is SAS 9.4, which is scheduled for later this year.

Jay54
Meteorite | Level 14

As Rick said, there are ways to do this that do not require GTL template code. SGPLOT can do the overlaid histograms with a few lines of code once your data is in columns, and SGPANEL can do the class panel. As we see more interest in creating grouped histograms, we can certainly add the GROUP or CLASS feature to the HISTOGRAM statement. In the meantime, here are a couple of links to previous articles.

  Graphs with class - Graphically Speaking

  Comparative density plots - Graphically Speaking

Rick_SAS
SAS Super FREQ

As Sanjay says, if you have individual variables, this is straightforward with SGPLOT. The complexities of my blog posts were because I wanted to support a class (grouping) variable. Apparently the OP is in the same situation, which means that currently he needs to use some DATA step or SQL code to reshape the data, as described in "Reshape data so that each category becomes a new variable" (The DO Loop).

@GDA My recommendation is to convert your data from the "long" to the "wide" format as I describe in my "Reshape data" post, and then use Dan's %OverlayHist macro to replicate the HISTOGRAM statement in PROC SGPLOT as many times as you need.
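
As a rough illustration of that long-to-wide step (not the code from the blog post), here is a sketch that assumes SASHELP.CLASS as the long data, with Sex as the class variable and Height as the measurement. The data set name wide and the columns Height_F and Height_M are made up for the example:

```sas
/* Long -> wide: one new column per category of the class variable. */
/* Rows that belong to another category are left missing.           */
/* WIDE, HEIGHT_F, and HEIGHT_M are hypothetical names.             */
data wide;
   set sashelp.class;
   if sex = 'F' then Height_F = height;
   else if sex = 'M' then Height_M = height;
   keep Height_F Height_M;
run;

/* Overlay: one HISTOGRAM statement per reshaped column */
proc sgplot data=wide;
   histogram Height_F / transparency=0.5;
   histogram Height_M / transparency=0.5;
   xaxis label="Height";
run;
```

Because wide contains only the reshaped numeric columns, calling %OverlayHist(wide) should generate the same HISTOGRAM statements without hard-coding them, which is the point when the number of categories is not known in advance.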

There are those who would say that overlaying histograms does not work very well beyond 2-3 variables and that we should consider whether "comparative histograms" would visualize the data better. I agree...usually. However, in certain special situations I've wanted to overlay dozens of histograms. See "Eigenvalues of a random symmetric matrix: A simulation approach".

GraphGuy
Meteorite | Level 14

Probably the easiest way to do it is to use a Proc Gplot 'needle' plot ... this makes it very-very easy to overlay the plots, and if you've got SAS 9.3 then you can even take advantage of alpha-transparent colors so that the overlapping bars get the combined-color effect (eg, yellow+blue=green).

Here's a simple example:

data foo; set sashelp.stocks
(where=(date>='01jan1990'd and date<'01jan1996'd and stock in ('IBM' 'Intel')));
run;

axis1 label=none order=('01jan1990'd to '01jan1996'd by year);
symbol1 value=none interpol=needle width=4 color=AFFE60077;
symbol2 value=none interpol=needle width=4 color=A42C0FB77;

title "Gplot 'needle' plots (with overlapping transparent colors)";
proc gplot data=foo;
plot close*date=stock / haxis=axis1 vzero;
run;

[Image: needle_plot.png]

GraphGuy
Meteorite | Level 14

Or if you've got separate variables, you can use the other gplot syntax (instead of plot y*x=z) ...

plot y1*x y2*x / overlay;

mariosegal
Calcite | Level 5

Thanks for the help, but I am not able to replicate what you did. I am confused, as you seem to need 3 dimensions in your chart (the date, the stock and the close), whereas I only have 2.

Here is a summary of my data, with the relevant columns only. I would like overlaid histograms/densities of Value for each Group, so on the y axis I would see how many HHIDs had 4, or 5, or 4-6, depending on the histogram bins. Overall I have 6 groups (and I may filter some), and values go from zero (there are quite a few of those) to maybe 100 or so.

      
OBS    HHID        Group                   Value
229    200007518   Sheetz Only             4
415    200010892   Sheetz Only             2
755    200027228   Sheetz and Any Other    1
1071   200042500   Sheetz and Any Other    2
1089   200042790   Sheetz and Any Other    8
1177   200044180   Sheetz and Any Other    1
1244   200045652   Sheetz and Any Other    6
1955   200052426   Sheetz and Any Other    1
2019   200052940   Sheetz and Any Other    4
2159   200054454   Sheetz and Any Other    1

I am starting to see how the plot procedure works, but I maintain that it is more complex than it needs to be, so people do not use its full power, which is a shame.


GraphGuy
Meteorite | Level 14

Could you post up a copy of your R graph (or if you don't have an R graph, maybe one drawn by hand)?
