Hello,
I need to determine which distribution (normal, lognormal, uniform, Weibull, triangular, etc.) best fits a quantitative variable, along with the parameters required to define it.
Is there a procedure that can test all of these distributions together?
Thank you
Use PROC UNIVARIATE to create Q-Q plots. https://documentation.sas.com/doc/en/pgmsascdc/9.4_3.4/procstat/procstat_univariate_syntax30.htm
Or you can use P-P plots, or Probability Plots, also in PROC UNIVARIATE.
Paige Miller
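As a minimal sketch of the PROC UNIVARIATE suggestion above: the data set sashelp.heart and the variable Weight are assumptions chosen for illustration (they are not from the original question), and the list of candidate distributions is only an example. The HISTOGRAM statement fits each listed distribution and prints parameter estimates and goodness-of-fit tests; the QQPLOT statement draws a normal Q-Q plot with a reference line.

```sas
/* Illustrative only: sashelp.heart and Weight stand in for your own data. */
proc univariate data=sashelp.heart;
   var weight;
   /* Fit several candidate distributions; each one gets its own
      parameter-estimate and goodness-of-fit tables. */
   histogram weight / normal lognormal weibull gamma;
   /* Normal Q-Q plot; MU=EST and SIGMA=EST add a reference line
      based on the estimated parameters. */
   qqplot weight / normal(mu=est sigma=est);
run;
```

Points that fall close to the reference line in the Q-Q plot suggest the distribution is a reasonable fit.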
Thank you!
Does PROC FREQ do the same job for the distribution of a qualitative variable, or are there other procedures to consider?
Thank you
@HS2 wrote:
Does PROC FREQ do the same job for the distribution of a qualitative variable, or are there other procedures to consider?
Yes, PROC FREQ computes the distribution and provides a chi-square test of whether the proportions at each level are equal; it can also test whether the proportions at each level equal values that you specify. https://documentation.sas.com/doc/en/pgmsascdc/9.4_3.4/procstat/procstat_freq_details08.htm#procstat...
Paige Miller
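A short sketch of both tests described above, assuming sashelp.heart with its Smoking_Status and Sex variables as stand-in data (not from the original question). The hypothesized proportions in TESTP= are made up for illustration.

```sas
/* Illustrative only: variables and hypothesized proportions are assumptions. */
proc freq data=sashelp.heart;
   /* One-way table with a chi-square test of equal proportions */
   tables smoking_status / chisq;
   /* Chi-square test against specified proportions, listed in the
      order of the levels in the table (Female, then Male) */
   tables sex / chisq testp=(0.5 0.5);
run;
```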
@HS2 wrote:
Thank you!
Does PROC FREQ do the same job for the distribution of a qualitative variable, or are there other procedures to consider?
Thank you
If your qualitative data is character, you may need to clean up spellings or build custom formats to standardize the values used by PROC FREQ. The values "ABC", "Abc", "AbC", "aBC" (and every other combination of upper and lower case) would each be a different level of the variable in PROC FREQ.
Here is an example from one project I worked on. We were conducting a survey involving business names, and we gave our data collection team a list of likely answers along with what should be entered for each. Result: 18 spellings for "IBM". My favorite was "I>B>M>".
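One possible way to standardize such values before running PROC FREQ is sketched below. The data set work.survey and its values are hypothetical, made up to mirror the "IBM" story. The cleanup rule (keep only letters, then uppercase) is an assumption that suits this example; it would damage names such as "AT&T" or "3M", so real data usually needs a lookup table or custom format instead.

```sas
/* Hypothetical survey responses with inconsistent spellings */
data work.survey;
   length company $ 12;
   input company $;
   datalines;
IBM
Ibm
ibm
I>B>M>
;
run;

data work.survey_clean;
   set work.survey;
   /* 'k' keeps (rather than removes) the listed characters and
      'a' adds all alphabetic characters to the list, so only
      letters survive; UPCASE then removes case differences. */
   company_std = upcase(compress(company, , 'ka'));
run;

/* All four raw spellings now fall into a single level */
proc freq data=work.survey_clean;
   tables company_std;
run;
```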
It sounds like you want to do what some people call the "shotgun method," which is to fit a bunch of standard distributions and have the software tell you how well each model fits. You can do this by using PROC SEVERITY in SAS/ETS software. The Getting Started example in the doc shows how to carry out the analysis by using
DIST _PREDEFINED_; /* test 10 predefined distributions */
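For context, a complete call might look like the sketch below. It follows the pattern of the Getting Started example, but the data set (sashelp.heart), the variable (Weight), and the choice of CRIT=AICC as the selection criterion are assumptions for illustration. PROC SEVERITY requires a SAS/ETS license, and some candidate distributions may fail to converge on a given data set.

```sas
/* Illustrative only: sashelp.heart and Weight stand in for your own data. */
proc severity data=sashelp.heart crit=aicc;
   where weight > 0;      /* loss models assume positive values */
   loss weight;           /* the variable whose distribution is fitted */
   dist _predefined_;     /* fit the predefined candidate distributions */
run;
```

The model selection table in the output ranks the fitted distributions by the chosen criterion, and the parameter estimates for each distribution are reported as well.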
Hello, Yes, this is exactly what I wanted.
It worked, thank you!
/* Or try PROC GENMOD.
   Each fit reports AIC and BIC, which you can compare
   to see which distribution fits the data best. */

/* Intercept-only model: normal */
proc genmod data=sashelp.heart;
   model weight = / dist=normal;
run;

/* Intercept-only model: gamma */
proc genmod data=sashelp.heart;
   model weight = / dist=gamma;
run;

/* Intercept-only model: Tweedie */
proc genmod data=sashelp.heart;
   model weight = / dist=tweedie;
run;

/* Intercept-only model: Poisson */
proc genmod data=sashelp.heart;
   model weight = / dist=poisson;
run;
Usage Note 48914: Testing the fit of a discrete distribution
https://support.sas.com/kb/48/914.html
Usage Note 23135: Testing fit of continuous and discrete distributions to observed data
https://support.sas.com/kb/23/135.html
Base SAS Procedures Guide: Statistical Procedures
The UNIVARIATE Procedure
Goodness-of-Fit Tests
https://go.documentation.sas.com/doc/en/pgmsascdc/9.4_3.5/procstat/procstat_univariate_details53.htm
Some distributions are not offered in the SAS procedures.
For those, you can always use PROC OPTMODEL (SAS Optimization) to estimate the parameters.
You need to specify the formula for the distribution and its log-likelihood function, then maximize the log-likelihood to obtain the maximum likelihood estimates of the parameters for a sample of observed data points.
Good luck,
Koen
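A rough sketch of the PROC OPTMODEL approach described above. The lognormal distribution is used only because its estimates can be checked against PROC UNIVARIATE; for a distribution that SAS does not support you would substitute your own log-likelihood. The data set work.sample, the variable x, and the starting values are assumptions for illustration.

```sas
/* Illustrative sample: positive values of Weight from sashelp.heart */
data work.sample;
   set sashelp.heart(keep=weight);
   where weight > 0;
   rename weight = x;
run;

proc optmodel;
   set OBS;
   num x {OBS};
   /* Index the observations by row number and read the data values */
   read data work.sample into OBS=[_N_] x;

   /* Parameters to estimate, with rough starting values */
   var mu init 5;
   var sigma >= 1e-6 init 0.2;

   /* Lognormal log-likelihood; the constant term -0.5*log(2*pi)
      is dropped because it does not change the maximizer. */
   max LogLik = sum {i in OBS}
      ( -log(x[i]) - log(sigma)
        - (log(x[i]) - mu)**2 / (2 * sigma**2) );

   solve with nlp;
   print mu sigma;
quit;
```

Note that this returns point estimates only; standard errors would have to be computed separately.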
For distributions that are not supported in SAS, I recommend using PROC NLMIXED for maximum likelihood estimation. An advantage of PROC NLMIXED is that it provides estimates of standard errors, confidence intervals, and p-values for hypothesis tests of the form Param=0. For an example and discussion, see
Two ways to compute maximum likelihood estimates in SAS - The DO Loop
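A minimal sketch of the PROC NLMIXED approach, in the spirit of the linked article but not copied from it. The lognormal log-likelihood is written out by hand to show where a custom distribution would go; the data set (sashelp.heart), the variable (Weight), and the starting values are assumptions for illustration.

```sas
/* Illustrative only: sashelp.heart and Weight stand in for your own data. */
proc nlmixed data=sashelp.heart(where=(weight > 0));
   parms mu=5 sigma=0.2;      /* rough starting values */
   bounds sigma > 0;
   /* Log-likelihood contribution of one observation (lognormal) */
   ll = -log(weight) - log(sigma) - 0.5*log(2*constant('pi'))
        - (log(weight) - mu)**2 / (2*sigma**2);
   model weight ~ general(ll);
run;
```

The Parameter Estimates table reports the estimates together with standard errors, confidence limits, and p-values.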