Quartz | Level 8

## Simulating claims data using Poisson and Lognormal distributions

Hi,

I am trying to simulate claim amounts and claim numbers over 10,000 simulations. I can figure out how to determine the number of claims but I want to then use that information to simulate the amount of each claim.

The code below simulates the number of claims for 10,000 observations. However, I want to go a step further and simulate claim amounts for each simulation. For example, if the Poisson generates 212 for the first observation, I want to produce 212 claim amounts from a lognormal distribution across the columns. My lognormal parameters are mu = 9.1 and shape = 1.5.

```
data poisson(keep = x);
    call streaminit(4321);
    lambda = 212;
    do i = 1 to 10000;
        x = rand("Poisson", lambda);
        output;
    end;
run;
```

The resulting dataset would be 10,000 rows by approximately 300 columns.

I would also like to impose caps on the severity but I think I can do this.

Thanks

Opal | Level 21

## Re: Simulating claims data using Poisson and Lognormal distributions

A long data format will be much simpler to use for almost any data manipulation or analysis. Start with this:

```
data poisson(keep = simNo claimNo amount);
    call streaminit(4321);                      /* fix the seed for reproducibility */
    lambda = 212;
    mu = 9.1;
    shape = 1.5;
    do simNo = 1 to 10000;
        nbClaims = rand("Poisson", lambda);     /* number of claims in this simulation */
        do claimNo = 1 to nbClaims;
            amount = rand("lognormal", mu, shape);
            output;                             /* one row per claim */
        end;
    end;
run;
```
PG
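
As a quick check on the long data set, the number of claims in each simulation can be recovered by counting rows within each simNo. This is a minimal sketch, assuming the `poisson` data set created above; the names `claimCounts` and `nClaims` are illustrative only.

```sas
/* Count the rows (claims) in each simulation.
   A simulation with zero claims writes no rows, so it would not appear here. */
proc sql;
    create table claimCounts as
    select simNo, count(*) as nClaims
    from poisson
    group by simNo;
quit;

/* For a Poisson count, the mean and variance should both be close to lambda = 212 */
proc means data=claimCounts n mean var min max;
    var nClaims;
run;
```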
Quartz | Level 8

## Re: Simulating claims data using Poisson and Lognormal distributions

Many thanks for this, PG. Is there an easy way to have the simulations as rows and the claims for each simulation as columns, so that the final dataset would be 10,000 by 250 (or thereabouts)?

Opal | Level 21

## Re: Simulating claims data using Poisson and Lognormal distributions

Sure. But in the long run, a wide data structure like the one you propose is the wrong way to go. Anyway, this is how to get it:

```
data poisson(keep = simNo nbClaims claimNo amount);
    call streaminit(4321);
    lambda = 212;
    mu = 9.1;
    shape = 1.5;
    do simNo = 1 to 10000;
        nbClaims = rand("Poisson", lambda);
        do claimNo = 1 to nbClaims;
            amount = rand("lognormal", mu, shape);
            output;
        end;
    end;
run;

/* Long to wide: one row per simulation, one column (claim_1, claim_2, ...) per claim */
proc transpose data=poisson out=claims(drop=_name_) prefix=claim_;
    by simNo nbClaims;
    id claimNo;
    var amount;
run;
```
PG
Quartz | Level 8

## Re: Simulating claims data using Poisson and Lognormal distributions

Thanks PG. Why is that approach the wrong way to go? Just to give some background on what I'm doing: I'm trying to simulate large claims data for a policy. The policy has an expected number of claims of 212 per year, and I'm trying to determine the average cost of claims in excess of €1m. Once I simulate the above, I will impose a condition on the amount like "if amount < 1000000 then net = 0; else net = amount - 1000000;". I will then sum the net amount over all claims in each simulation and average that total over the 10,000 simulations. This will give me the expected claims cost in excess of €1m based on the assumptions I've used.
SAS Super FREQ

## Re: Simulating claims data using Poisson and Lognormal distributions

The long data format is preferred because it enables you to efficiently process each simulated sample by using BY-group processing in procedures such as PROC MEANS or in the DATA step. See "Simulation in SAS: The slow way or the BY way"
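
As a rough illustration of BY-group processing on the long data, the excess-over-€1m calculation described earlier in the thread could be sketched as follows. It assumes the `poisson` data set from the earlier replies, which is already in simNo order; the names `netLong`, `netBySim`, and `netTotal` are made up for this example.

```sas
/* Excess of each claim over the 1m threshold (0 when the claim is below it) */
data netLong;
    set poisson;
    net = max(amount - 1e6, 0);
run;

/* One row per simulation: total excess cost.
   The rows were written in simNo order, so no sort is needed before BY. */
proc means data=netLong noprint;
    by simNo;
    var net;
    output out=netBySim(drop=_type_ _freq_) sum=netTotal;
run;

/* Average of the per-simulation totals over all simulations */
proc means data=netBySim n mean std;
    var netTotal;
run;
```

A single PROC MEANS call with a BY statement handles all 10,000 samples at once, which is the point of keeping the data long.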

Opal | Level 21

## Re: Simulating claims data using Poisson and Lognormal distributions

See how it could be done with a long data format:

```
data poisson(keep = simNo nbClaims claimNo amount);
    call streaminit(4321);
    lambda = 212;
    mu = 9.1;
    shape = 1.5;
    do simNo = 1 to 10000;
        nbClaims = rand("Poisson", lambda);
        do claimNo = 1 to nbClaims;
            amount = rand("lognormal", mu, shape);
            output;
        end;
    end;
run;

/* Per simulation: mean claim size and total cost in excess of 1m */
proc sql;
    create table exClaims as
    select
        simNo,
        mean(amount) as meanClaim,
        sum( case when amount > 1e6 then amount - 1e6 else 0 end ) as netClaims
    from poisson
    group by simNo;
quit;

/* Distribution of the per-simulation results across the 10,000 simulations */
proc univariate data=exClaims;
    var meanClaim netClaims;
    histogram;
    format meanClaim netClaims e7.2;
run;
```
PG
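
The simulated mean of netClaims can be compared with a closed-form value. The sketch below assumes that `rand("lognormal", mu, shape)` treats mu and shape as the mean and standard deviation of log(amount); if your SAS release parameterizes it differently, the comparison will not hold. Under that assumption, the expected excess of one claim over a threshold d is exp(mu + sigma²/2)·Φ((mu + sigma² − ln d)/sigma) − d·Φ((mu − ln d)/sigma), and the expected total per simulation is lambda times that. The variable names are illustrative.

```sas
data _null_;
    lambda = 212;
    mu     = 9.1;
    sigma  = 1.5;       /* the "shape" parameter in the thread */
    d      = 1e6;       /* excess threshold */

    z1 = (mu + sigma**2 - log(d)) / sigma;
    z2 = (mu - log(d)) / sigma;

    /* Expected excess per claim, E[max(X - d, 0)], for a lognormal X */
    excessPerClaim = exp(mu + sigma**2 / 2) * cdf("Normal", z1)
                     - d * cdf("Normal", z2);

    /* Expected total excess per simulation for a Poisson number of claims */
    expectedNet = lambda * excessPerClaim;

    put excessPerClaim= expectedNet=;
run;
```

Because only a small fraction of claims exceed the threshold, the simulated average can still differ noticeably from this value with 10,000 simulations.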
Quartz | Level 8

## Re: Simulating claims data using Poisson and Lognormal distributions

Many thanks for your help, PG.

Quartz | Level 8

## Re: Simulating claims data using Poisson and Lognormal distributions

Thanks Rick, that makes sense.