distribution assumption using proc sgplot
05-04-2017 11:25 AM

I am using PROC SGPLOT to plot the CDF (SERIES statement) and the PDF (DENSITY statement) for a dataset.

Does it always require choosing a distribution assumption, e.g., type=normal?

Is it possible to plot the CDF and PDF without an underlying distribution assumption? If so, how can I achieve that?

Many thanks!

Accepted Solutions

Solution

05-12-2017
08:27 PM

Posted in reply to Jonate_H

05-05-2017 08:55 AM

For continuous distributions, the easiest way is to use PROC UNIVARIATE to create the CDF and PDF plots. The HISTOGRAM statement draws a histogram and, with the KERNEL option, fits and overlays a nonparametric kernel density estimate. The CDFPLOT statement displays the empirical CDF. Here is an example:

```
proc univariate data=sashelp.cars;
var mpg_highway;
histogram mpg_highway / kernel; /* nonparametric density estimate */
cdfplot mpg_highway;
ods select Histogram CDFPlot;
run;
```

You can also fit and overlay parametric distributions. PROC UNIVARIATE supports about 20 common distributions. Here is an example of fitting a lognormal distribution (by maximum likelihood estimation) to the same data:

```
proc univariate data=sashelp.cars;
var mpg_highway;
histogram mpg_highway / lognormal; /* overlay PDF */
cdfplot mpg_highway / lognormal; /* overlay CDF */
ods select Histogram CDFPlot;
run;
```
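
Since the original question was about PROC SGPLOT, here is a rough sketch of how the same two nonparametric plots might be produced there. It is not part of the accepted answer. The data set names `cars_sorted` and `ecdf` and the variable `ecdf` are made up for illustration, and the empirical CDF is computed by hand as the cumulative proportion of sorted observations:

```sas
/* Kernel density estimate: no parametric family is assumed */
proc sgplot data=sashelp.cars;
   histogram mpg_highway;
   density mpg_highway / type=kernel;
run;

/* Empirical CDF: sort the nonmissing values first */
proc sort data=sashelp.cars(keep=mpg_highway
                            where=(mpg_highway is not missing))
          out=cars_sorted;          /* hypothetical data set name */
   by mpg_highway;
run;

data ecdf;                          /* hypothetical data set name */
   set cars_sorted nobs=n;
   by mpg_highway;
   /* keep one point per distinct value so that ties are handled */
   if last.mpg_highway then do;
      ecdf = _n_ / n;               /* proportion of values <= current value */
      output;
   end;
run;

proc sgplot data=ecdf;
   step x=mpg_highway y=ecdf;
   yaxis label='Cumulative proportion' min=0 max=1;
   xaxis label='MPG (Highway)';
run;
```

The DENSITY statement with TYPE=KERNEL plays the same role as the KERNEL option in PROC UNIVARIATE, and the STEP statement draws the step function that CDFPLOT would otherwise produce.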

All Replies

Posted in reply to Jonate_H

05-04-2017 01:39 PM

Not sure if this is what you're asking, but you can plot theoretical densities and distribution functions in PROC SGPLOT like this:

```
data normal;
do x = -3 to 3 by 0.01;
y_pdf = pdf('normal',x);
y_cdf = cdf('normal',x);
output;
end;
run;
title 'Normal Distribution';
proc sgplot data = normal;
band x = x upper = y_pdf lower = 0 / legendlabel = 'Density';
series x = x y = y_cdf / legendlabel = 'CDF';
keylegend / location = inside position = topleft across = 1;
yaxis label = 'Density/Probability';
xaxis label = 'x';
run;
title;
```

Posted in reply to Jonate_H

05-12-2017 02:54 PM

Thank you all!

By the way, how can I specify a t distribution? With only a few observations, I would like to avoid assuming a normal distribution.

Posted in reply to Jonate_H

05-12-2017 03:25 PM

Typically data are not distributed as t. The t distribution arises as the sampling distribution of statistics. See the article "Why doesn't PROC UNIVARIATE support certain common distributions?"
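
If the goal is only to see what a t density looks like next to the normal density, the DATA step approach from the earlier reply can be adapted. The sketch below is illustrative only: the 5 degrees of freedom and the names `t_vs_normal`, `y_t`, and `y_normal` are arbitrary assumptions, and the curves are theoretical densities, not fits to any data.

```sas
data t_vs_normal;                   /* hypothetical data set name */
   df = 5;                          /* assumed degrees of freedom */
   do x = -4 to 4 by 0.01;
      y_t      = pdf('t', x, df);   /* Student t density */
      y_normal = pdf('normal', x);  /* standard normal density */
      output;
   end;
run;

title 't Density (5 df) Versus Standard Normal Density';
proc sgplot data=t_vs_normal;
   series x=x y=y_t      / legendlabel='t, 5 df';
   series x=x y=y_normal / legendlabel='Standard normal';
   yaxis label='Density';
   xaxis label='x';
run;
title;
```

The heavier tails of the t curve are what matter when a statistic is computed from a small sample, which is the sampling-distribution role described above.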