I want to simulate (generate) a list of random numbers. I want to specify the number of observations, the underlying distribution (normal or lognormal), the mean, and the standard deviation, and get as output a list of random numbers from that distribution.
I also want to draw a probability plot of these numbers against the distribution that they were drawn from.
How can I do this in SAS?
This is covered in Chapter 2 of Simulating Data with SAS. In the DATA step, use the RAND function within a loop to generate the data. Use the PROBPLOT statement in the UNIVARIATE procedure to construct the probability plot:
data Random;
   call streaminit(12345);      /* optional: set a seed for reproducible results */
   N = 100;                     /* sample size */
   family = "Normal";           /* or "Lognormal" */
   mean = 8;
   stdDev = 3;
   do i = 1 to N;
      x = rand(family, mean, stdDev);
      output;
   end;
run;

/* draw probability plot */
proc univariate data=Random;
   var x;
   probplot x / square normal;  /* or lognormal; optionally specify population parameters */
run;
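One caveat for the lognormal case: as far as I know, the lognormal parameters in SAS describe the underlying normal distribution (the log scale), not the mean and standard deviation of the data themselves. If the mean and standard deviation you want to specify are on the data scale, a minimal sketch is to convert them first and exponentiate a normal draw. The data set name RandomLN, the variable names m, s, mu, and sigma, and the seed are made up for illustration:

```sas
/* Sketch: lognormal sample with a target mean and std dev on the DATA scale. */
/* Names (RandomLN, m, s, mu, sigma) and the seed are illustrative only.      */
data RandomLN;
   call streaminit(12345);
   N = 100;                       /* sample size */
   m = 8;                         /* desired mean of x */
   s = 3;                         /* desired standard deviation of x */
   sigma = sqrt(log(1 + (s/m)**2));   /* log-scale std dev */
   mu    = log(m) - sigma**2 / 2;     /* log-scale mean */
   do i = 1 to N;
      x = exp(rand("Normal", mu, sigma));  /* exp of a normal draw is lognormal */
      output;
   end;
run;
```

Because it uses only the normal family of RAND, this should not depend on how your SAS release parameterizes the lognormal family.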
This book is essential for what you want to do. Absolutely recommended!
This might be a good place to start:
http://support.sas.com/documentation/cdl/en/lrdict/64316/HTML/default/viewer.htm#a000202908.htm
The example in the "Details" section is pretty close to one of your objectives.
Thanks. How come the probplot does not represent a bell curve but more like a 45 degree straight line?
The probplot isn't plotting the distribution of x, but a probability plot, similar to a Q-Q plot. This is from the doc:
The PROBPLOT statement creates a probability plot, which compares ordered variable values with the percentiles of a specified theoretical distribution. If the data distribution matches the theoretical distribution, the points on the plot form a linear pattern. Consequently, you can use a probability plot to determine how well a theoretical distribution models a set of measurements.
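To see why the points fall near a straight line, you can build a similar plot by hand. The following is only a sketch, not what PROC UNIVARIATE does internally: it assumes the Random data set and the variable x from the earlier reply, and the output data set QQ and the variable q are names made up for illustration. PROC RANK with NORMAL=BLOM converts the rank of each observation into an approximate standard normal quantile, and plotting the data against those quantiles gives a roughly linear pattern when the data are normal:

```sas
/* Sketch: a normal probability (Q-Q) style plot built by hand. */
/* Assumes data set Random with variable x; QQ and q are illustrative names. */
proc rank data=Random normal=blom out=QQ;
   var x;
   ranks q;                /* q = approximate standard normal quantile for each rank */
run;

proc sgplot data=QQ;
   scatter x=q y=x;        /* near-linear pattern if x is normally distributed */
   xaxis label="Standard normal quantile";
   yaxis label="Value of x";
run;
```

For normal data, the slope of the line is roughly the standard deviation and the intercept is roughly the mean.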
Thanks. How do I get a plot of the distribution? I am looking for something like xyplot(x, p);
proc univariate distribution plot
Base SAS(R) 9.2 Procedures Guide: Statistical Procedures, Third Edition
You can get a histogram of the simulated sample by using the HISTOGRAM statement in PROC UNIVARIATE. To overlay a line plot of the distribution, specify the distribution name and any known parameters after a slash (/):
histogram x / normal; /* or lognormal; optionally specify population parameters */
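Put together, that might look like the sketch below. It assumes the Random data set from the earlier reply and the population values used there (mean 8, standard deviation 3). The second part is one possible way to get an xyplot(x, p) style curve of the theoretical density itself, using the PDF function; the data set name Density, the variable p, and the grid of x values are assumptions for illustration:

```sas
/* Histogram of the sample with the known normal density overlaid */
proc univariate data=Random;
   var x;
   histogram x / normal(mu=8 sigma=3);   /* omit mu= and sigma= to use estimates */
run;

/* Sketch: line plot of the theoretical density (Density and p are illustrative names) */
data Density;
   mean = 8;
   stdDev = 3;
   do x = mean - 4*stdDev to mean + 4*stdDev by 0.1;
      p = pdf("Normal", x, mean, stdDev);   /* density at x */
      output;
   end;
run;

proc sgplot data=Density;
   series x=x y=p;
run;
```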