SR79
Fluorite | Level 6

Hi, 

 

I need to simulate data from a lognormal distribution with a known mean m and a known coefficient of variation CV. I first wrote the code below (plus a DO loop to generate values for 20 subjects), following "Simulate lognormal data with specified mean and variance" on The DO Loop (sas.com):

m = 0.5;
CV=1.20;
v=sqrt(log((CV)**2)+1); 
mu = log(m**2/(sqrt(v + m**2)));
sigma = sqrt(log((sqrt(v + m**2))**2/m**2));
x = rand('normal', mu, sigma);
y = exp(x);

 

Then I used the lognormal option of the RAND function:

m = 0.5;
CV=1.20;
v=sqrt(log((CV)**2)+1); 
x = rand('lognormal', m, v);

 

Does anyone have a recommendation on which approach is preferable?

 

Thanks!!

2 REPLIES 2
sbxkoenk
SAS Super FREQ

Hello,

 

I think both approaches are OK.

Plot the distributions on top of each other (overlay graph) to confirm you simulated correctly twice.

 

But do you need 2 approaches? Or do you encounter a problem with one of the approaches?

The coefficient of variation (CV) is the ratio of the standard deviation to the mean, so you can calculate the variance (the square of the standard deviation) from m and CV and then reuse the code from Rick Wicklin's blog post.

Good luck,
Koen

FreelanceReinh
Jade | Level 19

Hi @SR79,


@SR79 wrote:

m = 0.5;
CV=1.20;
v=sqrt(log((CV)**2)+1); 
mu = log(m**2/(sqrt(v + m**2)));
sigma = sqrt(log((sqrt(v + m**2))**2/m**2));
x = rand('normal', mu, sigma);
y = exp(x);

 

Then I used the lognormal option of the RAND function:

m = 0.5;
CV=1.20;
v=sqrt(log((CV)**2)+1); 
x = rand('lognormal', m, v);


I think neither of the two approaches is correct.

 

Your formulas for mu and sigma are equivalent to those in Rick Wicklin's blog article. But there, v is the variance of the lognormal distribution. Where does your formula for v come from? I think it should be

v=(m*CV)**2;

Your formula for v resembles (but is not equivalent to) the expression for sigma in terms of CV:

sigma = sqrt(log(CV**2 + 1));
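For reference, here is a short sketch of where that expression comes from. It assumes the usual parameterization in which log(Y) is normal with mean mu and standard deviation sigma; the symbol Y is introduced here only for illustration.

```latex
\begin{align*}
\mathrm{E}[Y] &= m = e^{\mu + \sigma^2/2}, \\
\mathrm{Var}[Y] &= v = \left(e^{\sigma^2} - 1\right) e^{2\mu + \sigma^2}
                 = \left(e^{\sigma^2} - 1\right) m^2, \\
\mathrm{CV}^2 &= \frac{v}{m^2} = e^{\sigma^2} - 1
\quad\Longrightarrow\quad
\sigma = \sqrt{\log\left(\mathrm{CV}^2 + 1\right)}.
\end{align*}
```

With m = 0.5 and CV = 1.20 this gives v = (0.5 * 1.20)**2 = 0.36 and sigma of roughly 0.944, whereas sqrt(log(CV**2) + 1) evaluates to roughly 1.168, so where the +1 sits relative to the logarithm matters.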

Your second approach passes the mean of the lognormal distribution as the first parameter of the RAND('lognormal', ...) function. But according to the documentation, the first parameter is the mean of the normal distribution obtained by applying the logarithm to the lognormally distributed random variable.
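To illustrate the size of the effect, here is a rough calculation. It assumes that the two parameters of RAND('lognormal', a, b) are the mean and the standard deviation of log(Y), and it plugs in the values from the second approach (a = m = 0.5 and b = sqrt(log(CV**2) + 1) with CV = 1.20):

```latex
\begin{align*}
\mathrm{E}[Y] &= \exp\!\left(a + \tfrac{b^2}{2}\right)
             = \exp\!\left(0.5 + \frac{\log(1.2^2) + 1}{2}\right) \\
            &= \exp\!\left(1 + \log 1.2\right)
             = 1.2\,e \approx 3.26 .
\end{align*}
```

Under that assumption the simulated values would average about 3.26 rather than the intended 0.5.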

 

I suggest that you use the above formula for sigma and

mu = log(m/sqrt(CV**2 + 1));

Then you can define

y=rand('lognormal', mu, sigma);

Equivalently, you could go via the normal distribution, as you did in your first approach. However, SAS appears to do just that internally when the RAND function is called with the lognormal distribution, so calling it directly saves the explicit transformation y=exp(x). In my SAS 9.4M5 session (using the same random seed, of course), both routes produced identical random number streams, and PROC UNIVARIATE confirmed that the empirical mean and coefficient of variation were close to the specified values 0.5 and 1.20, respectively, when the sample size was large enough.
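Putting the pieces together, a minimal sketch of the whole simulation might look like the following. The data set name sim, the variable name subject and the seed are arbitrary choices made for illustration, and the parameterized RAND('lognormal', mu, sigma) call is assumed to be available as it was in the SAS 9.4M5 session described above.

```sas
/* Sketch: lognormal draws with target mean m and coefficient of variation CV */
data sim;
  call streaminit(20250101);         /* arbitrary seed, for reproducibility   */
  m  = 0.5;                          /* target mean of y                      */
  CV = 1.20;                         /* target coefficient of variation of y  */
  sigma = sqrt(log(CV**2 + 1));      /* standard deviation of log(y)          */
  mu    = log(m / sqrt(CV**2 + 1));  /* mean of log(y)                        */
  do subject = 1 to 20;
    y = rand('lognormal', mu, sigma);
    output;
  end;
  keep subject y;
run;

/* Summary check; PROC MEANS reports CV as a percentage of the mean */
proc means data=sim n mean std cv;
  var y;
run;
```

With only 20 subjects the sample mean and CV can differ noticeably from 0.5 and 1.20. Raising the upper bound of the DO loop (for example to 100000) is one way to check that the parameters are right before going back to the real sample size.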
