Simulating lognormal data

SR79 · Posted 06-05-2022 09:29 AM

Hi,

I need to simulate data from a lognormal distribution with a known mean m and coefficient of variation CV. I first wrote the code below (inside a DO loop that generates values for 20 subjects), following "Simulate lognormal data with specified mean and variance" on The DO Loop (sas.com):

m = 0.5;
CV=1.20;
v=sqrt(log((CV)**2)+1);
mu = log(m**2/(sqrt(v + m**2)));
sigma = sqrt(log((sqrt(v + m**2))**2/m**2));
x = rand('normal', mu, sigma);
y = exp(x);

Then I used the lognormal option in the RAND function:

m = 0.5;
CV=1.20;
v=sqrt(log((CV)**2)+1);
x = rand('lognormal', m, v);

Does anyone have a recommendation on which approach is preferable?

Thanks!!

sbxkoenk · Posted 06-05-2022 11:47 AM

Hello,

I think both approaches are OK.

Plot the distributions on top of each other (overlay graph) to confirm you simulated correctly twice.

But do you need 2 approaches? Or do you encounter a problem with one of the approaches?

The coefficient of variation (CV) is the ratio of the standard deviation to the mean. Knowing that, you can calculate the variance (the square of the standard deviation) and then copy and paste the code from Rick Wicklin's blog.

Good luck,
Koen

FreelanceReinh · Posted 06-06-2022 03:22 PM

Hi @SR79,

@SR79 wrote:

m = 0.5;
CV=1.20;
v=sqrt(log((CV)**2)+1);
mu = log(m**2/(sqrt(v + m**2)));
sigma = sqrt(log((sqrt(v + m**2))**2/m**2));
x = rand('normal', mu, sigma);
y = exp(x);

Then I used the lognormal option in the RAND function:

m = 0.5;
CV=1.20;
v=sqrt(log((CV)**2)+1);
x = rand('lognormal', m, v);

I think neither of the two approaches is correct.

Your formulas for mu and sigma are equivalent to those in Rick Wicklin's blog article. But there, v is the variance of the lognormal distribution. Where does your formula for v come from? I think it should be

v=(m*CV)**2;

Your formula for v resembles (but is not equivalent to) the expression for sigma in terms of CV:

sigma = sqrt(log(CV**2 + 1));

Your second approach uses the mean of the lognormal distribution as the first parameter of the RAND('lognormal', ...) function. But according to the documentation, the first parameter is the mean of the normal distribution obtained by taking the logarithm of the lognormally distributed random variable.

I suggest that you use the above formula for sigma and

mu = log(m/sqrt(CV**2 + 1));

Then you can define

y=rand('lognormal', mu, sigma);
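For reference, the two formulas above can be recovered from the standard moments of the lognormal distribution. This is a short derivation sketch, where Y denotes the lognormal variable and log is the natural logarithm; the numeric values are rounded for m = 0.5 and CV = 1.20:

```latex
\begin{align*}
E[Y] &= \exp\!\left(\mu + \tfrac{\sigma^2}{2}\right) = m, \\
\operatorname{Var}(Y) &= \left(e^{\sigma^2} - 1\right) e^{2\mu + \sigma^2} = (m \cdot CV)^2, \\
CV^2 &= \frac{\operatorname{Var}(Y)}{E[Y]^2} = e^{\sigma^2} - 1, \\
\sigma &= \sqrt{\log\!\left(CV^2 + 1\right)} \approx 0.944, \\
\mu &= \log m - \tfrac{\sigma^2}{2} = \log\frac{m}{\sqrt{CV^2 + 1}} \approx -1.139 .
\end{align*}
```

Note that CV depends on sigma only, which is why sigma can be written in terms of CV alone, while mu needs both m and CV.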

Equivalently, you could go via the normal distribution (as you did in your first approach), but SAS appears to do just that internally when the lognormal distribution is used in the RAND function, so using 'lognormal' saves you the explicit transformation y=exp(x). With the same random seed, my SAS 9.4M5 produced exactly identical random number streams either way, and PROC UNIVARIATE confirmed that the empirical mean and coefficient of variation were close to the specified values 0.5 and 1.20, respectively, when the sample size was large enough.
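A minimal sketch of such a check might look like the following. The data set name sim, the seed 12345, and the sample size of 100000 are arbitrary choices for illustration, and the code assumes a SAS release in which RAND('lognormal', ...) accepts the mu and sigma parameters (as in the 9.4M5 run described above). PROC MEANS is used here instead of PROC UNIVARIATE only to keep the output short.

```sas
/* Sketch: simulate lognormal data with target mean m and CV,  */
/* then compare the empirical mean and CV with the targets.    */
data sim;
   call streaminit(12345);              /* arbitrary seed (assumption)        */
   m  = 0.5;                            /* target mean of the lognormal       */
   CV = 1.20;                           /* target coefficient of variation    */
   sigma = sqrt(log(CV**2 + 1));        /* std dev of log(Y)                  */
   mu    = log(m / sqrt(CV**2 + 1));    /* mean of log(Y)                     */
   do subject = 1 to 100000;            /* large n for the check; use 20 for  */
      y = rand('lognormal', mu, sigma); /* the actual study simulation        */
      output;
   end;
   keep subject y;
run;

/* PROC MEANS reports CV as a percentage (100*std/mean) */
proc means data=sim n mean std cv;
   var y;
run;
```

With a sample this large, the reported mean should be close to 0.5 and the reported CV close to 120 (percent). With only 20 subjects, considerable deviation from the targets is to be expected.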
