I want to generate numbers randomly from a csv table of 40,000 number. I've attached an sample of the table of numbers "sample.csv".
I want to then use the numbers generated to produce 100 iterations of the formula:
iteration(i) = sum(random number (j)) from j = 1 to x
So, the first cell in the table below will generate 15 numbers from the table sample.csv and then sum them together and output to the cell, the second iteration will do the same but it will be a new set of 15 random numbers from the table sample.csv, and so on.
In the second row, x is 144 so each iteration will sum 144 random numbers from the sample.csv table.
Here is the code to import the csv:
proc import datafile="C:\Users\mbrophy\Downloads\sample.csv"
out=mydata dbms=csv replace;
getnames=no;
run;
Hi @brophymj
Thank you for sharing your documents.
I have gone through the code and I have corrected the below elements:
- keep no_of_spin in the macro %null_table
- put out_&i._&j. in the macro %append
- update the SET statement in the last step to be sure that table will be append in the right order
It should work now.
Let me know
Best,
/*Create macro-variables*/
data _null_;
set analysis2_ind;
call symputx ("nb_obs"||left(_n_),no_of_spins); /* store each value of x in a macrovariable*/
call symputx ("nb_row",_n_); /* store the total number of rows in the dataset "have"*/
run;
/*Generate random numbers from sample data and create a table per draw with the sum:
name of the dataset = out_<sample size>_<iteration number> */
%macro random;
%do i=1 %to &nb_row;
%do j=1 %to 5;
proc surveyselect data=sample method=srs n=&&nb_obs&i out=out_&i._&j.;
run;
proc means data=out_&i._&j. sum;
output out=out_&i._&j. sum=it&j.;
run;
%end;
%end;
%mend;
%random;
/*Initialize empty tables */
%macro null_table;
%do i=1 %to &nb_row;
data analysis2_ind_&i.;
set analysis2_ind (keep=no_of_spins);
run;
%end;
%mend;
%null_table
/*Creation of one table per draw by retrieving all iterations for one sample size */
%macro append;
%do i=1 %to &nb_row;
%do j=1 %to 5;
data analysis2_ind_&i. ;
set analysis2_ind_&i. ;
set out_&i._&j. (drop=_type_ _freq_); /*<=== put out_&i._&j. and not out_&&nb_obs&i.._&j.*/
run;
%end;
%end;
%mend;
%append;
/*Append all draws */
data analysis2_compil (drop=no_of_spins);
set analysis2_ind_1-analysis2_ind_&nb_row;
run;
data want;
set analysis2_ind;
set analysis2_compil;
run;
Please attach the csv file, and post the code used to read it into SAS.
HI Kurt, I've done that now.
HI @brophymj
You can try the below code. Let me know if this works. Hope this help!
data have;
input x @@;
cards;
15 144 3 4 66 2 4 5 5
;
run;
data sample;/* <== put the csv file */
input r @@;
cards;
-2 -2 -2 -2 -2 -6 -2 -2 -2 -2
;
run;
/*Create macro-variables*/
data _null_;
set have;
call symputx ("nb_obs"||left(_n_),x); /* store each value of x in a macrovariable*/
call symputx ("nb_row",_n_); /* store the total number of rows in the dataset "have"*/
run;
/*Generate random numbers from sample data and create a table per draw with the sum:
name of the dataset = out_<sample size>_<iteration number> */
%macro random;
%do i=1 %to &nb_row;
%do j=1 %to 100;
proc surveyselect data=sample method=srs n=&&nb_obs&i out=out_&&nb_obs&i.._&j.;
run;
proc means data=out_&&nb_obs&i.._&j. sum;
output out=out_&&nb_obs&i.._&j. sum=it&j.;
run;
%end;
%end;
%mend;
%random;
/*Initialize empty tables */
%macro null_table;
%do i=1 %to &nb_row;
data have_&i.;
set have;
run;
%end;
%mend;
%null_table
/*Creation of one table per draw by retrieving all iterations for one sample size */
%macro append;
%do i=1 %to &nb_row;
%do j=1 %to 100;
data have_&i. ;
set have_&i. ;
set out_&&nb_obs&i.._&j. (drop=_type_ _freq_);
run;
%end;
%end;
%mend;
%append;
/*Append all draws */
data have_compil (drop=x);
set have_:;
run;
data want;
set have;
set have_compil;
run;
Thanks ed_sas_member
That works really well. The only small thing is that the random numbers generated are the same rows which have the same x. So, in the dataset "have", I'm getting the same number for the last two rows (where x = 5). Any ideas what's causing this? Thanks
Hi @brophymj
Thank you for your feedback!
I think you can replace out_&&nb_obs&i.._&j. by out_&i._&j. in the code as follows. In fact, in the previous code, the name of the datasets created for each draw was dependent on the x variable. So when you had multiple similar values for x, the datasets where erased.
Best,
data have;
input x @@;
cards;
15 144 3 4 66 2 4 5 5
;
run;
data sample;/* <== put the csv file */
input r @@;
cards;
-2 -2 -2 -2 -2 -6 -2 -2 -2 -2
;
run;
/*Create macro-variables*/
data _null_;
set have;
call symputx ("nb_obs"||left(_n_),x); /* store each value of x in a macrovariable*/
call symputx ("nb_row",_n_); /* store the total number of rows in the dataset "have"*/
run;
/*Generate random numbers from sample data and create a table per draw with the sum:
name of the dataset = out_<sample size>_<iteration number> */
%macro random;
%do i=1 %to &nb_row;
%do j=1 %to 100;
proc surveyselect data=sample method=srs n=&&nb_obs&i out=out_&i._&j.;
run;
proc means data=out_&i._&j. sum;
output out=out_&i._&j. sum=it&j.;
run;
%end;
%end;
%mend;
%random;
/*Initialize empty tables */
%macro null_table;
%do i=1 %to &nb_row;
data have_&i.;
set have;
run;
%end;
%mend;
%null_table
/*Creation of one table per draw by retrieving all iterations for one sample size */
%macro append;
%do i=1 %to &nb_row;
%do j=1 %to 100;
data have_&i. ;
set have_&i. ;
set out_&i._&j. (drop=_type_ _freq_);
run;
%end;
%end;
%mend;
%append;
/*Append all draws */
data have_compil (drop=x);
set have_:;
run;
data want;
set have;
set have_compil;
run;
Thanks Ed_sas_member
It's not working for me and it's probably because i didn't show you the files i'm using. I've attached the sas code and two input files. When i run it, the first row is repeated for all 18 observations. It seems it's almost there but I'm missing something.
Using the files you've attached here an approach which doesn't require macro coding. I've stored the data from windist.csv in a variable called number. If you call it the same and you call the source tables work.analysis2_ind and work.windist then below code should run for you without any changes required.
%let n_iter=100;
data want(drop=_:);
if _n_=1 then
do;
/* define hash for number list */
dcl hash numbers();
numbers.defineKey('_n');
numbers.defineData('_number');
numbers.defineDone();
/* load hash with number list */
do while(not last);
set windist(keep=number rename=(number=_number)) end=last;
_n+1;
numbers.add();
/* keep total sum in retained variable
used later for exception handling
*/
_tot_sum_num=sum(_tot_sum_num,_number);
retain _tot_sum_num;
end;
/* define hash to hold _n of already selected items */
dcl hash selected();
selected.defineKey('_n');
selected.defineDone();
end;
/* read master table */
set analysis2_ind;
array sum_num_ {&n_iter} 8;
/* &n_iter iterations */
do _iterInd=1 to dim(sum_num_);
selected.clear();
/* pick no_of_spins items from hash randomly without replacement */
if no_of_spins>numbers.num_items then
do;
/* exception handling */
put 'More items requested than available in number list ';
put 'All items will get selected for summarization';
sum_num_[_iterInd]=_tot_sum_num;
continue;
end;
do _spinInd=1 to no_of_spins;
/* generate random item number not yet used */
do while(1);
_n=RAND('INTEGER',numbers.num_items);
if selected.check() ne 0 then
do;
selected.add();
leave;
end;
end;
/* read new item from hash and sum */
numbers.find();
sum_num_[_iterInd]=sum(sum_num_[_iterInd], _number);
end;
end;
run;
proc print data=want;
var no_of_spins sum_num_:;
run;
I wasn't really sure where without replacement applies for randomly picking a number from your list. In the code above it's per cell summation but the same number could get picked for the sum in the next cell. Is that what you're after?
Hi @brophymj
Thank you for sharing your documents.
I have gone through the code and I have corrected the below elements:
- keep no_of_spin in the macro %null_table
- put out_&i._&j. in the macro %append
- update the SET statement in the last step to be sure that table will be append in the right order
It should work now.
Let me know
Best,
/*Create macro-variables*/
data _null_;
set analysis2_ind;
call symputx ("nb_obs"||left(_n_),no_of_spins); /* store each value of x in a macrovariable*/
call symputx ("nb_row",_n_); /* store the total number of rows in the dataset "have"*/
run;
/*Generate random numbers from sample data and create a table per draw with the sum:
name of the dataset = out_<sample size>_<iteration number> */
%macro random;
%do i=1 %to &nb_row;
%do j=1 %to 5;
proc surveyselect data=sample method=srs n=&&nb_obs&i out=out_&i._&j.;
run;
proc means data=out_&i._&j. sum;
output out=out_&i._&j. sum=it&j.;
run;
%end;
%end;
%mend;
%random;
/*Initialize empty tables */
%macro null_table;
%do i=1 %to &nb_row;
data analysis2_ind_&i.;
set analysis2_ind (keep=no_of_spins);
run;
%end;
%mend;
%null_table
/*Creation of one table per draw by retrieving all iterations for one sample size */
%macro append;
%do i=1 %to &nb_row;
%do j=1 %to 5;
data analysis2_ind_&i. ;
set analysis2_ind_&i. ;
set out_&i._&j. (drop=_type_ _freq_); /*<=== put out_&i._&j. and not out_&&nb_obs&i.._&j.*/
run;
%end;
%end;
%mend;
%append;
/*Append all draws */
data analysis2_compil (drop=no_of_spins);
set analysis2_ind_1-analysis2_ind_&nb_row;
run;
data want;
set analysis2_ind;
set analysis2_compil;
run;
Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.
SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.
Find more tutorials on the SAS Users YouTube channel.