Hi all,
I am a new user of SAS who has only worked with R previously. My goal is to generate many sets of large simulations. I have written some codes for running one set of simulations, which are shown below. My questions are:
1. What is the best way to repeat these codes for x number of times automatically, and
2. how to specify a range of inputs and outputs for each time the codes are excuted? (e.g. importing files named 'design_matrix_1.csv', 'design_matrix_2.csv', ...., 'design_matrix_10.csv', and exporting the results as 'result1_1.csv', ..., 'result1_10.csv', respectively)
One way I can think of is to create a macro, and manually specify the input and output files for each scenario.
%let N = 10000;
* import design matrix
PROC IMPORT DATAFILE='/folders/myfolders/design_matrix_1.csv' replace
OUT=design;
GETNAMES=YES;
RUN;
* extract information from design matrix
proc sql noprint;
select max(Clus) into : nclus from design;
quit;
* generate N copies of the design matrix
data design_rep;
do i = 1 to &N;
do j = 1 to n;
set design nobs=n point=j;
output;
end;
end;
stop;
run;
* generate random numbers to simulate datasets
data sim;
set design_rep ;
by i Clus;
retain clus_int;
if first.Clus then do;
clus_int = rand('normal', 0, sigma_clus_lp);
end;
error = rand("Normal", 0, &sigma);
y = Tx* &Tx_effect + period + clus_int + error;
run;
ods exclude all;
ods noresults;
* analyze data
PROC glimmix data=sim;
by i;
CLASS Clus ;
MODEL y = Tx Period /S DDFM=KR;
RANDOM intercept / subject = Clus ;
NLOPTIONS TECHNIQUE=DBLDOG;
ods output Tests3 = result1 ParameterEstimates = result2;
RUN;
ods exclude none;
ods results;
PROC EXPORT
DATA=result1
replace
OUTFILE="/folders/myfolders/result_1.csv"
DBMS=CSV;
PROC EXPORT
DATA=result2
replace
OUTFILE="/folders/myfolders/result_2.csv"
DBMS=CSV;
data reject (keep = Reject);
set result1;
where Effect = 'Tx';
Reject = (ProbF < 0.05);
run;
proc summary data = reject;
var Reject;
output out = power(drop = _FREQ_ _TYPE_ ) mean = ;
run;
PROC PRINT DATA=power;
run;
Many thanks!
Macros, Call Execute or DOBSUBL are the usual options.
https://stats.idre.ucla.edu/sas/seminars/sas-macros-introduction/
This method is pretty robust and helps prevent errors and makes it much easier to debug your code. Obviously biased, because I wrote it 🙂 https://github.com/statgeek/SAS-Tutorials/blob/master/Turning%20a%20program%20into%20a%20macro.md
Macro Appendix with examples:
@AnOrangeTree wrote:
Hi all,
I am a new user of SAS who has only worked with R previously. My goal is to generate many sets of large simulations. I have written some codes for running one set of simulations, which are shown below. My questions are:
1. What is the best way to repeat these codes for x number of times automatically, and
2. how to specify a range of inputs and outputs for each time the codes are excuted? (e.g. importing files named '
design_matrix_1.csv', 'design_matrix_2.csv', ...., 'design_matrix_10.csv', and exporting the results as 'result1_1.csv', ..., 'result1_10.csv', respectively)
One way I can think of is to create a macro, and manually specify the input and output files for each scenario.
%let N = 10000; * import design matrix PROC IMPORT DATAFILE='/folders/myfolders/design_matrix_1.csv' replace OUT=design; GETNAMES=YES; RUN; * extract information from design matrix proc sql noprint; select max(Clus) into : nclus from design; quit; * generate N copies of the design matrix data design_rep; do i = 1 to &N; do j = 1 to n; set design nobs=n point=j; output; end; end; stop; run; * generate random numbers to simulate datasets data sim; set design_rep ; by i Clus; retain clus_int; if first.Clus then do; clus_int = rand('normal', 0, sigma_clus_lp); end; error = rand("Normal", 0, &sigma); y = Tx* &Tx_effect + period + clus_int + error; run; ods exclude all; ods noresults; * analyze data PROC glimmix data=sim; by i; CLASS Clus ; MODEL y = Tx Period /S DDFM=KR; RANDOM intercept / subject = Clus ; NLOPTIONS TECHNIQUE=DBLDOG; ods output Tests3 = result1 ParameterEstimates = result2; RUN; ods exclude none; ods results; PROC EXPORT DATA=result1 replace OUTFILE="/folders/myfolders/result_1.csv" DBMS=CSV; PROC EXPORT DATA=result2 replace OUTFILE="/folders/myfolders/result_2.csv" DBMS=CSV; data reject (keep = Reject); set result1; where Effect = 'Tx'; Reject = (ProbF < 0.05); run; proc summary data = reject; var Reject; output out = power(drop = _FREQ_ _TYPE_ ) mean = ; run; PROC PRINT DATA=power; run;
Many thanks!
Thanks for sharing these resources! They're really helpful.
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.