Good day, my friends:
In a laboratory experiment, 3 observations were collected at each of 1, 12, 24, 36 and 72 hours:
data have;
input hour repetition concentration;
cards;
1 1 11.21
1 2 12.15
1 3 9.91
12 1 11.21
12 2 10.28
12 3 11.82
24 1 10.28
24 2 10.28
24 3 12.61
36 1 11.21
36 2 11.21
36 3 11.71
72 1 10.28
72 2 11.21
72 3 12.73
;
As you can see, the mean and standard deviation can be computed for each hour. For teaching purposes I need to increase the sample size to 10, 50, 100, 1000 and 10 000 repetitions, with each sample size in a different data set. So I need to create new data sets of random data that use the mean and standard deviation from the original data set.
Thank you very much.
If I understand your question correctly, you're looking for something like the following:
data have;
input hour repetition concentration;
cards;
1 1 11.21
1 2 12.15
1 3 9.91
12 1 11.21
12 2 10.28
12 3 11.82
24 1 10.28
24 2 10.28
24 3 12.61
36 1 11.21
36 2 11.21
36 3 11.71
72 1 10.28
72 2 11.21
72 3 12.73
;
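/* Mean (p1) and standard deviation (p2) of concentration for each hour */
proc means data=have noprint nway;
  class hour;
  var concentration;
  output out=stats mean=p1 std=p2;
run;
/* Number of simulated observations per hour */
%let n_records = 1000;
data simulated;
  call streaminit(24);  /* fixed seed so the results are reproducible */
  set stats;            /* one row per hour */
  do rep = 1 to &n_records;
    /* normal draw with that hour's mean and std, rounded to 2 decimals */
    conc_sim = round(rand('normal', p1, p2), 0.01);
    output;
  end;
run;
/* Check: the simulated mean and std should be close to p1 and p2 */
proc means data=simulated noprint nway;
  class hour p1 p2;
  var conc_sim;
  output out=check mean=p1_sim std=p2_sim;
run;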
I used the old RNG here. You can look into the new ones and see if you want to use one of those instead:
https://blogs.sas.com/content/iml/2018/01/29/random-number-generators-sas.html
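The question asks for several sample sizes, each in its own data set. One way to get there is to wrap the simulation step in a macro loop. This is only a sketch: the macro name simulate_sizes, its parameters and the output names simulated_10, simulated_50 and so on are made up for illustration, and it assumes the stats data set from the PROC MEANS step above already exists.

```sas
/* Hypothetical helper: one simulated data set per sample size */
%macro simulate_sizes(sizes=10 50 100 1000 10000, seed=24);
  %local i n;
  %do i = 1 %to %sysfunc(countw(&sizes, %str( )));
    %let n = %scan(&sizes, &i, %str( ));
    data simulated_&n;
      call streaminit(&seed);  /* same seed in every data set */
      set stats;               /* p1 = mean, p2 = std for each hour */
      do rep = 1 to &n;
        conc_sim = round(rand('normal', p1, p2), 0.01);
        output;
      end;
    run;
  %end;
%mend simulate_sizes;

%simulate_sizes()
```

Because every data set starts from the same seed, the smaller samples begin with the same draws as the larger ones for the first hour. If you would rather have unrelated samples, pass a different seed for each size.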
Do you know what distribution the data should follow? Because it's a concentration it likely cannot go below zero, so a normal distribution may not be appropriate. Once you know the distribution, you can use the mean and standard deviation to simulate your data.
If you don't have enough data to determine the distribution, the literature in your field will sometimes state which distribution is typical, and you can assume that one.
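As an illustration of that last step, here is a rough sketch that assumes a lognormal distribution. That choice is purely an assumption for the example; nothing in the posted data confirms it. It reuses the stats data set (p1 = mean, p2 = standard deviation per hour) from the code posted in this thread, converts each mean and standard deviation to the log scale, and exponentiates a normal draw so the simulated values are always positive. The names simulated_lognormal, mu and sigma2 are made up for the example.

```sas
/* Sketch only: the lognormal choice is an assumption, not a finding */
data simulated_lognormal;
  call streaminit(24);
  set stats;                          /* one row per hour */
  sigma2 = log(1 + (p2 / p1)**2);     /* variance on the log scale */
  mu     = log(p1) - sigma2 / 2;      /* mean on the log scale */
  do rep = 1 to 1000;
    /* exp of a normal draw is lognormal, so conc_sim is always > 0 */
    conc_sim = exp(rand('normal', mu, sqrt(sigma2)));
    output;
  end;
run;
```

With these two conversions the simulated values have mean p1 and standard deviation p2 on the original scale, which you can check with the same PROC MEANS step used for the normal version.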