Sample size effect on variance - create data

Frequent Contributor
Posts: 115

Good day my friends:

 

In a laboratory study, 3 observations were collected at each of 1, 12, 24, 36 and 72 hours:

 

data have;
input hour repetition concentration;
cards;
1         1       11.21
1         2       12.15
1         3       9.91
12       1       11.21
12       2       10.28
12       3       11.82
24       1       10.28
24       2       10.28
24       3       12.61
36       1       11.21
36       2       11.21
36       3       11.71
72       1       10.28
72       2       11.21
72       3       12.73
;

 

As you can see, the mean and standard deviation can be obtained for each hour. For teaching purposes I need to increase the sample size to 10, 50, 100, 1000 and 10 000 repetitions, with each sample size in a different data set. Thus, I need to create new data sets of random data using the mean and standard deviation from the original data set.

 

Thank you very much.


All Replies
Super User
Posts: 22,823

Re: Sample size effect on variance - create data

Posted in reply to jonatan_velarde

Do you know what distribution the data should follow? Because it's a concentration, it likely cannot go below zero, so a normal distribution may not be appropriate. Once you know the distribution, you can use the mean/std to simulate your data.

 

If you don't have enough data to determine the distribution, the literature in your field will sometimes state the type of distribution, and you can assume that.
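To illustrate how a mean and standard deviation could drive a simulation that never goes below zero, here is a minimal sketch that assumes a lognormal distribution. That choice is only an assumption for illustration; nothing in this thread establishes that it fits these data. It matches moments on the log scale, so the simulated values have the same theoretical mean and standard deviation as each hour in the HAVE data set from the question. The names stats_ln, simulated_ln, m, s, mu and sigma2, the seed 24 and the size 1000 are all arbitrary.

```sas
/* Sketch only: assumes a lognormal model, which the thread does not confirm. */
/* Per-hour mean (m) and standard deviation (s) from the HAVE data set.       */
proc means data=have noprint nway;
    class hour;
    var concentration;
    output out=stats_ln mean=m std=s;
run;

data simulated_ln;
    call streaminit(24);      /* arbitrary seed, for reproducibility */
    set stats_ln;
    /* Moment matching: pick mu and sigma2 on the log scale so that */
    /* exp(N(mu, sqrt(sigma2))) has mean m and standard deviation s */
    sigma2 = log(1 + (s / m)**2);
    mu     = log(m) - sigma2 / 2;
    do rep = 1 to 1000;       /* arbitrary illustrative sample size */
        conc_sim = exp(rand('normal', mu, sqrt(sigma2)));
        output;
    end;
run;
```

Because each value is the exponential of a normal draw, conc_sim is always positive, which is the property a plain normal distribution cannot guarantee.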

 

 

Frequent Contributor
Posts: 115

Re: Sample size effect on variance - create data

Thank you for the answer.

The data we need to simulate are normally distributed.

Thank you.
Solution
‎03-01-2018 11:44 AM
Super User
Posts: 22,823

Re: Sample size effect on variance - create data

Posted in reply to jonatan_velarde

Then, if I understand your question correctly, you're looking for something like the following:

 

data have;
input hour repetition concentration;
cards;
1         1       11.21 
1         2       12.15 
1         3       9.91 
12       1       11.21 
12       2       10.28 
12       3       11.82 
24       1       10.28 
24       2       10.28 
24       3       12.61 
36       1       11.21 
36       2       11.21 
36       3       11.71 
72       1       10.28 
72       2       11.21 
72       3       12.73
;

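/* Mean (p1) and standard deviation (p2) of concentration for each hour */
proc means data=have noprint nway;
    class hour;
    var concentration;
    output out=stats mean=p1 std=p2;
run;

/* Number of simulated records per hour */
%let n_records = 1000;

data simulated;
    call streaminit(24);    /* fixed seed so the results are reproducible */
    set stats;
    do rep = 1 to &n_records;
        /* Draw from a normal distribution with that hour's mean and std */
        conc_sim = rand('normal', p1, p2);
        conc_sim = round(conc_sim, 0.01);
        output;
    end;
run;

/* Compare the simulated mean and std with the original values */
proc means data=simulated noprint nway;
    class hour p1 p2;
    var conc_sim;
    output out=check mean=p1_sim std=p2_sim;
run;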


I used the old RNG, though; you can look into the new ones and see if you want to use one of those instead:

https://blogs.sas.com/content/iml/2018/01/29/random-number-generators-sas.html
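Since the question asked for a separate data set for each of the sample sizes 10, 50, 100, 1000 and 10 000, one way to extend the code above is to wrap the simulation in a macro loop. This is a rough sketch, not the only approach: it assumes the STATS data set (with p1 and p2) has already been created by the PROC MEANS step, and the macro name simulate_sizes, its parameters, and the sim_/check_ data set names are made up for illustration.

```sas
/* Sketch only: macro, parameter and data set names are illustrative. */
%macro simulate_sizes(sizes=10 50 100 1000 10000, seed=24);
    %local i n;
    %do i = 1 %to %sysfunc(countw(&sizes, %str( )));
        %let n = %scan(&sizes, &i, %str( ));

        /* One data set per sample size: sim_10, sim_50, ... */
        data sim_&n;
            call streaminit(&seed);
            set stats;                  /* per-hour mean (p1) and std (p2) */
            do rep = 1 to &n;
                conc_sim = round(rand('normal', p1, p2), 0.01);
                output;
            end;
        run;

        /* Simulated mean and std next to the originals, per hour */
        proc means data=sim_&n noprint nway;
            class hour p1 p2;
            var conc_sim;
            output out=check_&n mean=p1_sim std=p2_sim;
        run;
    %end;
%mend simulate_sizes;

%simulate_sizes()
```

Comparing check_10 through check_10000 should show p1_sim and p2_sim settling closer to p1 and p2 as the sample size grows, since the standard error of a simulated mean is roughly p2 divided by the square root of the sample size. Every data set here starts from the same seed; pass a different seed= value if the samples should not share their initial draws.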

 

 
