
Re: How to smooth histogram graph

N/A
Posts: 0


I plotted a histogram using Graph-N-Go.

Y axis: Frequency (# of persons)
X axis: Income

My income data consists of individual discrete values. How can I group it into bins of $1-$4999, $5000-$9999, ... so that the frequencies are added up within each bin?

What code do I need to change the graph after plotting it with Graph-N-Go?


Desired Graph:
|# of person
|
|
|
|
|
|______________________________
1-4999 5000-9999 10000-14999

Data Given:

ID Income
1 3000
2 2000
3 6000
4 11000
5 12000
. ..
. ..
. ..
5000 ..
SAS Super FREQ
Posts: 8,818


Hi:
Behind the scenes, Graph n Go is building SAS/Graph code. To see the code that's being built, right-mouse click in the histogram chart, and select Export. When the Export window pops up, select "Source file". From the source file window, you can either preview the code or save the code to an external file. I always save the code to an external file, so I have the statements that correspond to a histogram that I like.

I do not know how to change the bins within Graph n Go; I believe that Graph n Go sets the bins or categories based on whether the variable is numeric or character. The bins determine the number of bars; in regular SAS/Graph syntax, these values are called "midpoint" values. For numeric variables, I believe Graph n Go picks the midpoints automatically from the range of values; for character variables, I think it draws one bar for each distinct value of the variable.
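
For reference, here is a minimal sketch of how midpoints can be listed explicitly in SAS/Graph syntax for a numeric variable. It assumes a data set called ORIGFILE with a numeric INCOME variable and incomes below 15000; adjust the list for your own range. With evenly spaced midpoints, each bar should cover the range running halfway to the neighboring midpoints, so 2500, 7500 and 12500 give roughly 0-5000, 5000-10000 and 10000-15000. It is worth checking how a value sitting exactly on a boundary (such as 5000) is assigned before relying on this.
[pre]
** numeric chart variable with explicit midpoints (sketch);
proc gchart data=origfile;
   title 'Income in 5000-wide bins';
   vbar income / type=freq
                 midpoints=2500 to 12500 by 5000;
run;
quit;
[/pre]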

However, the general way to set categories for a numeric variable is to create a user-defined format that maps each range of values to a text label, which then serves as the "group" for the numeric variable. So, for example, if you have this format:
[pre]
** create a category format;
proc format;
   value incfmt low-<15000   = 'Under 15K'
                15000-<30000 = '15K to 30K'
                30000-high   = 'Above 30K';
run;
[/pre]

Then you can use the format to set categories for what you want to graph. I would probably do this by creating a new character variable called INCHIST inside a data step. To make the new character variable, I would use the PUT function, so the new variable's value would be the result of applying the format to the INCOME variable's value.
[pre]
data testhist(keep=id income inchist);
set origfile;
inchist = put(income,incfmt.);
output;
** will use inchist for SAS/Graph;
run;
[/pre]

Now, work.testhist has ID and income and the new character variable called INCHIST. You could go into Graph n Go and redo your Histogram for the new dataset and you will now have just 3 bins for this version of the data.
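
Before charting, a quick sanity check on the new variable can help. This is only a sketch using the WORK.TESTHIST data set created above: PROC FREQ should list one row per INCHIST category with its count, and those counts should match the bar heights in the chart.
[pre]
** count observations in each income category;
proc freq data=testhist;
   tables inchist / nocum;
run;
[/pre]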

But, I might not go back into Graph n Go, if I had already saved the statements for the histogram. I would probably use the new dataset "work.testhist" for SAS/Graph and PROC GCHART, as shown below:
[pre]
ods html path='c:\temp' (url=none)
gpath='c:\temp' (url=none)
file='inchist.html' style=statdoc;
goptions reset=all device=actximg;

axis1 style=1 width=1 minor=none
label=("Counts" height=12pt justify=right);

axis2 style=1 width=1 minor=none
label=("Income Category" height=12pt
justify=center);

proc gchart data=testhist;
title 'Simple Frequency';
vbar3d inchist / discrete freq
type=freq
midpoints='Under 15K' '15K to 30K' 'Above 30K'
raxis=axis1 maxis=axis2;
run;
quit;

proc gchart data=testhist;
title 'Cumulative Frequency';
vbar3d inchist / discrete cfreq
type=cfreq
midpoints='Under 15K' '15K to 30K' 'Above 30K'
raxis=axis1 maxis=axis2;
run;
quit;
ods html close;

[/pre]
Inside Graph n Go, your only choices for changing the statistic for the Histogram object are 'Frequency' and 'Percentage'. But, with SAS/Graph syntax, you can also get the cumulative frequency (CFREQ). (If you switched to a BAR CHART instead of a HISTOGRAM in Graph n Go, you could ask for the cumulative frequency.)

The two PROC GCHART steps above show the difference between asking for the FREQ statistic and the CFREQ statistic.
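
The same approach could be adapted to the 5000-wide bins in the original question. The sketch below is an illustration, not tested against your data: the format name INC5K, the data set HIST5K and the variable INCBIN are made-up names, and the '15000 and up' and 'Other' categories are assumptions about how you want to treat incomes outside the three bins shown in the desired graph. The MIDPOINTS= list matters here because character values would otherwise sort alphabetically, which puts '10000-14999' ahead of '5000-9999'.
[pre]
** 5000-wide bins (format name INC5K is made up for this sketch);
proc format;
   value inc5k 1-<5000      = '1-4999'
               5000-<10000  = '5000-9999'
               10000-<15000 = '10000-14999'
               15000-high   = '15000 and up'
               other        = 'Other';
run;

** INCBIN holds the bin label for each person;
data hist5k(keep=id income incbin);
   set origfile;
   incbin = put(income, inc5k.);
run;

** list the bins in MIDPOINTS= to control the bar order;
proc gchart data=hist5k;
   title 'Frequency by 5000-wide income bin';
   vbar incbin / type=freq freq
                 midpoints='1-4999' '5000-9999' '10000-14999' '15000 and up';
run;
quit;
[/pre]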

Hope this is what you were looking for. For more specific help with Graph n Go, you might contact SAS Technical Support. I actually prefer the EG Graph tasks over Graph n Go because I like making my selections in the Graph task window (where I feel I have a bit more control) instead of right-clicking and changing properties (as you do with Graph n Go).

Good luck,
cynthia