Hi, I am trying to plot a histogram with an overlaid density curve, with counts on the Y axis and time on the X axis. However, I cannot get the density curve to follow the histogram. Does the density curve use percentages by default? Can you please help me with this?
x | Y |
1 | 830 |
2 | 155 |
3 | 65 |
4 | 45 |
5 | 52 |
6 | 35 |
7 | 20 |
8 | 15 |
9 | 10 |
10 | 5 |
Accepted Solutions
Hi @mahi263 and welcome to the SAS Support Communities!
You need to specify the count variable in the FREQ= option of the DENSITY statement, as it wouldn't be used by default.
Example:
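/* Binned data: x is the time value, y is the count observed at that x */
data have;
  input x y;
  cards;
1 830
2 155
3 65
4 45
5 52
6 35
7 20
8 15
9 10
10 5
;

proc sgplot data=have;
  /* FREQ=y weights each x by its count; SCALE=COUNT puts counts on the Y axis */
  histogram x / freq=y scale=count binwidth=1;
  /* The density needs FREQ=y as well; C= sets the kernel bandwidth */
  density x / freq=y type=kernel(c=1.7);
  xaxis values=(1 to 10) offsetmin=0.06 offsetmax=0.06;
run;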
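To see what the FREQ= option changes, here is a minimal sketch that overlays two kernel densities on the same histogram, assuming the HAVE data set from the example above. The first DENSITY statement omits FREQ=, so each of the ten rows counts as a single observation; the second uses FREQ=y. The legend labels and the dashed line pattern are my own illustrative choices.

```sas
proc sgplot data=have;
  histogram x / freq=y scale=count binwidth=1;
  /* No FREQ=: every row is one observation, so the counts in y are ignored */
  density x / type=kernel legendlabel="Kernel without FREQ"
              lineattrs=(pattern=dash);
  /* FREQ=y: each x value is weighted by its count */
  density x / freq=y type=kernel(c=1.7) legendlabel="Kernel with FREQ=y";
  xaxis values=(1 to 10) offsetmin=0.06 offsetmax=0.06;
run;
```

Comparing the two curves against the bars should make it clear which one reflects the counts.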
Thanks a lot, FreelanceReinh, this really helps. Can you please explain how the value 1.7 in the kernel(c=1.7) option is chosen? Can you guide me on this?
If you try a few different values for that bandwidth, some greater than 1.7, some less than 1.7, you'll get the impression that some are too small (e.g., produce peaks around the integer values 1, 2, 3, ..., as if x were a discrete variable) and some are too large (i.e., deviate considerably from parts of the histogram). That's how I ended up with 1.7. But your knowledge of the data, the subject matter and the scientific literature may suggest a different bandwidth. Or even a different type of density: With a bit more programming effort you can overlay an arbitrary density curve (e.g., exponential, gamma, etc.) on a histogram. See Rick Wicklin's blog post "How to overlay a custom density curve on a histogram in SAS" for details.
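The trial-and-error comparison can be done in a single plot. The sketch below assumes the HAVE data set from the earlier example; the values 0.5 and 4 are arbitrary picks on either side of 1.7, and the line patterns and legend labels are only there to tell the curves apart.

```sas
proc sgplot data=have;
  histogram x / freq=y scale=count binwidth=1;
  /* Smaller C=: expect a wigglier curve with peaks near the integer x values */
  density x / freq=y type=kernel(c=0.5) legendlabel="c=0.5"
              lineattrs=(pattern=dot);
  /* The value used in the accepted solution */
  density x / freq=y type=kernel(c=1.7) legendlabel="c=1.7"
              lineattrs=(pattern=solid);
  /* Larger C=: expect a smoother curve that drifts away from the bars */
  density x / freq=y type=kernel(c=4) legendlabel="c=4"
              lineattrs=(pattern=dash);
  xaxis values=(1 to 10) offsetmin=0.06 offsetmax=0.06;
run;
```

Adjust the C= values and rerun until the curve matches what you know about the data.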
Thanks a lot for the resolution. I'll go through the documentation.