## Sample size effect on variance - create data

Good day, my friends:

Based on laboratory research, 3 observations were collected at each of 1, 12, 24, 36, and 72 hours:

``````
data have;
input hour repetition concentration;
cards;
1         1       11.21
1         2       12.15
1         3       9.91
12       1       11.21
12       2       10.28
12       3       11.82
24       1       10.28
24       2       10.28
24       3       12.61
36       1       11.21
36       2       11.21
36       3       11.71
72       1       10.28
72       2       11.21
72       3       12.73
;
``````

As you can see, the mean and standard deviation can be computed for each hour. For teaching purposes I need to increase the sample size to 10, 50, 100, 1000, and 10,000 repetitions, with each sample size in a different data set. So I need to create new data sets of random data generated from the mean and standard deviation of each hour in the original data set.

Thank you very much.

## Re: Sample size effect on variance - create data

If I understand your question correctly, you're looking for something like the following:

``````
data have;
input hour repetition concentration;
cards;
1         1       11.21
1         2       12.15
1         3       9.91
12       1       11.21
12       2       10.28
12       3       11.82
24       1       10.28
24       2       10.28
24       3       12.61
36       1       11.21
36       2       11.21
36       3       11.71
72       1       10.28
72       2       11.21
72       3       12.73
;

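/* Per-hour mean (p1) and standard deviation (p2) of the observed data */
proc means data=have noprint nway;
    class hour;
    var concentration;
    output out=stats mean=p1 std=p2;
run;

/* Number of simulated repetitions per hour */
%let n_records = 1000;

/* Draw n_records normal values per hour from that hour's mean and std */
data simulated;
    call streaminit(24);                   /* fixed seed, so results are reproducible */
    set stats;
    do rep = 1 to &n_records;
        conc_sim = rand('normal', p1, p2);
        conc_sim = round(conc_sim, 0.01);  /* same 2-decimal precision as the original data */
        output;
    end;
run;

/* Check: compare the simulated mean/std (p1_sim, p2_sim) with the targets (p1, p2) */
proc means data=simulated noprint nway;
    class hour p1 p2;
    var conc_sim;
    output out=check mean=p1_sim std=p2_sim;
run;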

``````
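The question asked for a separate data set for each sample size (10, 50, 100, 1000, and 10,000). One way to get that is to wrap the simulation step in a macro loop. This is only a sketch: the macro name `simulate_sizes`, its parameters, and the output names `simulated_<n>`, `check_<n>`, and `check_all` are made up for illustration, and it assumes the `stats` data set from the code above already exists.

```sas
/* Hypothetical macro: one simulated data set and one check data set per sample size. */
/* Assumes STATS (hour, p1, p2) was created by the PROC MEANS step above.             */
%macro simulate_sizes(sizes=10 50 100 1000 10000, seed=24);
    %local i n;
    %do i = 1 %to %sysfunc(countw(&sizes, %str( )));
        %let n = %scan(&sizes, &i, %str( ));

        /* n normal draws per hour, using that hour's mean and std */
        data simulated_&n;
            call streaminit(&seed);
            set stats;
            do rep = 1 to &n;
                conc_sim = round(rand('normal', p1, p2), 0.01);
                output;
            end;
        run;

        /* simulated mean/std per hour, next to the targets p1 and p2 */
        proc means data=simulated_&n noprint nway;
            class hour p1 p2;
            var conc_sim;
            output out=check_&n mean=p1_sim std=p2_sim;
        run;
    %end;
%mend simulate_sizes;

%simulate_sizes()

/* Stack the check data sets; _FREQ_ holds the sample size of each row. */
/* Assumes no other WORK data sets have names starting with CHECK_.     */
data check_all;
    set check_: ;
    se_mean = p2_sim / sqrt(_freq_);   /* standard error of the simulated mean */
run;
```

Because every data set is seeded with the same value, the smaller samples are likely to share their first draws with the larger ones. If independent samples are preferred, a different seed per size (for example `%eval(&seed + &i)`) should work as well.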

I used the older default RNG (Mersenne Twister), though. You can look into the newer ones and see if you want to use one of those instead:

https://blogs.sas.com/content/iml/2018/01/29/random-number-generators-sas.html
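As a rough sketch of what switching generators might look like: in SAS 9.4M5 and later, `call streaminit` accepts an optional RNG name as its first argument. The example below picks `'PCG'`; the data set name `simulated_pcg` is made up, and the step assumes the `stats` data set and the `n_records` macro variable from the code above.

```sas
data simulated_pcg;
    /* 'PCG' selects the PCG generator; assumes SAS 9.4M5 or later */
    call streaminit('PCG', 24);
    set stats;
    do rep = 1 to &n_records;
        conc_sim = round(rand('normal', p1, p2), 0.01);
        output;
    end;
run;
```

The rest of the program stays the same; only the stream of random numbers changes.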

## Re: Sample size effect on variance - create data

Do you know what distribution the data should follow? Because it's a concentration, it likely cannot go below zero, so a normal distribution may not be appropriate. Once you know the distribution, you can use the mean/std to simulate your data.

If you don't have enough data to determine the distribution, sometimes literature in your field will have the type of distribution and you can assume that.
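For illustration only, here is how the same mean/std could drive a distribution that stays above zero. The lognormal choice is an assumption made for this sketch, not something the data establish. The parameters are matched by the method of moments: if $m$ and $s$ are the target mean and standard deviation, then $\sigma^2 = \log(1 + s^2/m^2)$ and $\mu = \log(m) - \sigma^2/2$. The data set name `simulated_lognormal` and the variables `mu` and `sigma2` are made up, and the step assumes the `stats` data set (with `p1` and `p2`) from the code earlier in the thread.

```sas
data simulated_lognormal;
    call streaminit(24);
    set stats;
    /* method of moments: lognormal with mean p1 and std p2 */
    sigma2 = log(1 + (p2 / p1)**2);
    mu     = log(p1) - sigma2 / 2;
    do rep = 1 to 1000;
        /* exp of a normal draw is lognormal, so conc_sim is always > 0 */
        conc_sim = exp(rand('normal', mu, sqrt(sigma2)));
        output;
    end;
run;
```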

## Re: Sample size effect on variance - create data

The data we need to simulate are normally distributed.

Thank you.

