Solved: Re: Rand function

sas_user_1001 · Posted 08-21-2024 04:38 PM

I am using the rand function to pull simulated values from a distribution. Right now, I have it set as rand('Normal', Mu, Sigma); however, I was wondering if there was a way to specify a distribution and input its mean, std. dev., skew, and kurtosis. This way the distribution from which I simulate the data resembles its empirical distribution. Thanks.

PaigeMiller · Posted 08-21-2024 05:50 PM

There are no random number functions that allow you to specify the mean, variance, skewness, and kurtosis.

You could sample from the actual distribution to get something that resembles the distribution with whatever its mean, variance, skew and kurtosis are.
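One way to read "sample from the actual distribution" is to resample the observed values with replacement (a bootstrap-style draw). The following is only a sketch under that assumption: the data set my_data and variable x_var are the names used elsewhere in this thread, the stand-in data are made up, and the sample sizes and seed are arbitrary.

```sas
/* Hypothetical stand-in for the real data; replace with your own data set */
data my_data;
   call streaminit(1);
   do i = 1 to 200;
      x_var = rand("Lognormal", 0, 0.4);
      output;
   end;
   drop i;
run;

/* Draw 1,000 values with replacement from the observed values of x_var */
proc surveyselect data=my_data out=Resample
                  method=urs    /* unrestricted random sampling (with replacement) */
                  n=1000        /* number of draws */
                  outhits       /* write one row per draw */
                  seed=12345 noprint;
run;
```

The resampled values match the sample moments only approximately, and a resample never contains values that were not observed in the original data.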

--
Paige Miller

Ksharp · Posted 08-21-2024 10:32 PM

Check out @Rick_SAS's blog post:
https://blogs.sas.com/content/iml/2024/04/15/beta-skewness-kurtosis.html

Rick_SAS · Posted 08-22-2024 06:47 AM

Yes, there are several ways to do this. Some researchers use the moment-ratio diagram to find a distribution that is close to the (skewness, kurtosis) value, then sample from that distribution. See "The moment-ratio diagram" - The DO Loop (sas.com). A chapter of the book Simulating Data with SAS shows how to implement this idea by using SAS 9.4.

If you have SAS Viya, you can use PROC SIMSYSTEM, which enables you to specify the moments and will output the simulated samples.

If you have actual data, I suggest you model the data by using the Johnson distribution, which is conceptually the easiest flexible system. You can use PROC UNIVARIATE to perform the fit. If your data are bounded (for example, test scores that are between 0 and 100), use the Johnson SB distribution. See "The Johnson SB distribution" - The DO Loop (sas.com).

If your data are unbounded, use the Johnson SU system. See "The Johnson SU distribution" - The DO Loop (sas.com).

There is also an algorithm for deciding between the SB and SU family; see "The Johnson system: Which distribution should you choose to model data?" - The DO Loop (sas.com). After you have decided on a system and fit the parameters to the data, you can use the DATA step programs in those articles to produce random samples from the model.
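As a rough illustration of that last step, here is a minimal sketch that is not taken from the linked articles. It assumes the usual Johnson parameterization, in which a standard normal variate z is related to x by z = gamma + delta*log((x-theta)/(theta+sigma-x)) for SB and by z = gamma + delta*arsinh((x-theta)/sigma) for SU, and it inverts those relations. The four parameter values are hypothetical; in practice they would come from the PROC UNIVARIATE fit.

```sas
/* Hypothetical parameter values, chosen only for illustration */
%let theta = 0;      /* threshold (lower bound for SB) */
%let sigma = 100;    /* scale (SB support is theta to theta+sigma) */
%let gamma = 0.5;    /* shape */
%let delta = 1.2;    /* shape */

data JohnsonSim(keep=x_SB x_SU);
   call streaminit(12345);
   do i = 1 to 1000;
      z = rand("Normal", 0, 1);                 /* standard normal variate */
      u = (z - &gamma) / &delta;
      x_SB = &theta + &sigma / (1 + exp(-u));   /* bounded: inverse of the SB transform */
      x_SU = &theta + &sigma * sinh(u);         /* unbounded: inverse of the SU transform */
      output;
   end;
run;
```

Both columns are built from the same z only to keep the sketch short; in a real simulation you would generate just the family that you fitted.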

sas_user_1001 · Posted 08-22-2024 06:17 PM

Wonderful--thanks for the information!

sas_user_1001 · Posted 09-17-2024 05:58 PM

As a follow-up, once you have the location, scale, and shape parameters of the distribution from using proc univariate, is there a way to assign them to a variable name in the same way we can do it with SAS summary statistics (e.g., max = name_for_max)? Or do I have to assign the value manually in the code (e.g., theta = 1.12345)? Ideally, I would like the code to read: theta = name_for_theta, and call up the variable name when needed using &name_for_theta.

Thanks for your assistance here, it has helped immensely!

Rick_SAS · Posted 09-18-2024 04:22 AM

"once you have the location, scale, and shape parameters of the distribution from using proc univariate, is there a way to assign them to a variable name in the same way we can do it with SAS summary statistics (e.g., max = name_for_max)? Or do I have to assign the value manually in the code (e.g., theta = 1.12345)?"

What "code" are you referring to? Show us your syntax.

The doc for the HISTOGRAM statement in PROC UNIVARIATE specifies that the syntax for the options to the parametric density keywords is of the form

THETA=value-list

SIGMA=value-list

so the option requires specifying a list of values, not a variable name.

You could use CALL SYMPUTX in a DATA _NULL_ step to assign values in a data set to macro variables. But that trick doesn't work if you are using BY-group processing.
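For the BY-group case, one possible workaround (a sketch, not something prescribed in this thread) is to skip macro variables and keep the estimates in a data set, reshaped to one row per group. The PE data set below is a hypothetical stand-in for an ODS OUTPUT ParameterEstimates table: the BY variable grp and all the values are made up, and the layout assumes Symbol and Estimate columns.

```sas
/* Hypothetical stand-in for a BY-group ParameterEstimates table */
data PE;
   length grp $1 Symbol $8;
   input grp $ Symbol $ Estimate;
   datalines;
A Theta 0
A Sigma 100
A Gamma 0.5
A Delta 1.2
B Theta 0
B Sigma 100
B Gamma -0.3
B Delta 0.8
;

/* Reshape to one row per group with columns Theta, Sigma, Gamma, Delta */
proc transpose data=PE out=PE_wide(drop=_name_);
   by grp;
   id Symbol;
   var Estimate;
run;

/* The parameters are now ordinary variables. As an example, compute the
   median of each group's fitted Johnson SB distribution (the value at z=0). */
data SB_median;
   set PE_wide;
   median = Theta + Sigma / (1 + exp(Gamma/Delta));
run;
```

A real ParameterEstimates table may also contain rows with a blank Symbol value, which you would need to filter out before the transpose.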

sas_user_1001 · Posted 09-18-2024 02:35 PM

Sorry, I should have included the code instead of having you guess at what I'm doing...

proc univariate data = my_data;
   var x_var;
   histogram x_var / SB(theta = &x_var_Min_rounddn sigma = &x_var_Max_roundup fitmethod = moments)
      endpoints = (&x_var_Min_rounddn to &x_var_Max_roundup by 1);

   output out = temp_file_1
      mean = x_var_Mean
      std = x_var_Std
      delta = dist_delta    /* hoped-for keyword (bolded in the original post) */
      gamma = dist_gamma;   /* hoped-for keyword (bolded in the original post) */
run;

data _null_;
   set temp_file_1;
   call symput('x_var_Mean', x_var_Mean);
   call symput('x_var_Std', x_var_Std);
   call symput('dist_delta', dist_delta);
   call symput('dist_gamma', dist_gamma);
run;

What I was hoping to do was capture these parameter values I have bolded so I can call them up later as &dist_delta and &dist_gamma. I don't think this is possible, but you know way more than I do about SAS.

Rick_SAS · Posted 09-18-2024 03:06 PM

Yes.

Assuming the method of moments converges, PROC UNIVARIATE will create the ParameterEstimates table. You can write any SAS table to a data set by using the ODS OUTPUT statement. See "ODS OUTPUT: Store any statistic created by any SAS procedure" - The DO Loop.


/* Johnson SB(threshold=theta, scale=sigma, shape=gamma, shape=delta) */
data SB(keep= X);
call streaminit(1);
do i = 1 to 1000;
   x = rand("Lognormal", 0, 0.4);
   output;
end;
run; 

 /* set bins: https://blogs.sas.com/content/iml/2014/08/25/bins-for-histograms.html */
proc univariate data=SB;
   histogram X / SB(theta=0 sigma=145 fitmethod=moments);
   ods output ParameterEstimates = PE;
   output out = temp_file_1  mean = x_var_Mean std = x_var_Std;
run;


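/* Copy the summary statistics into macro variables.
   CALL SYMPUTX trims blanks and avoids numeric-to-character conversion notes. */
data _null_;
   set temp_file_1;
   call symputx('x_var_Mean', x_var_Mean);
   call symputx('x_var_Std', x_var_Std);
run;

/* Copy the fitted shape parameters from the ParameterEstimates table */
data _null_;
   set PE;
   if symbol='Delta' then
      call symputx('dist_delta', Estimate);
   if symbol='Gamma' then
      call symputx('dist_gamma', Estimate);
run;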

%put &=x_var_Mean;
%put &=x_var_Std;
%put &=dist_delta;
%put &=dist_gamma;

sas_user_1001 · Posted 09-18-2024 04:08 PM

Oh, this is a wonderful solution! Thanks so much for the help!
