<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Store AIC values from a Monte Carlo Simulation in Statistical Procedures</title>
    <link>https://communities.sas.com/t5/Statistical-Procedures/Store-AIC-values-from-a-Monte-Carlo-Simulation/m-p/788912#M38668</link>
    <description>&lt;P&gt;I have the following simulated data which I create:&amp;nbsp;&lt;BR /&gt;&lt;BR /&gt;&lt;/P&gt;
&lt;PRE&gt;%macro monteCarloSimulation();

	%let covariates=300; /* Number of covariates (independent variables) */

	%do mcno=1 %to 100;   /* Number of simulated datasets = 100 */
		data logit_data;
		drop i j;
		array x{&amp;amp;covariates.} x1-x&amp;amp;covariates.;
		do i=1 to 1000;
		do j=1 to &amp;amp;covariates.;
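		x{j}=ranuni(&amp;amp;mcno.);  /* seed varies per iteration; a fixed seed of 1 restarts the same stream in every DATA step */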
		end;
		linpred=2+10*x17-8*x5+3*x2+7*x6-5*x3-12*x30+11*x130-12*x200+rand("NORMal");
		prob = exp(linpred)/ (1 + exp(linpred));
		y = (prob &amp;gt; 0.5);
		output;
		end;
		drop prob linpred;
		run;

		/* Here I would like to run stepwise forward regression
		and stepwise backward regression and store the corresponding AIC 
		values to produce the table referenced below. 
		This should be done for each table that I produce in the simulation.
		Note that 100 simulated tables are produced. */


	%end;

%mend monteCarloSimulation;

%monteCarloSimulation() &lt;/PRE&gt;
&lt;P&gt;From that simulated data, I would like for each simulated dataset to calculate:&amp;nbsp;&lt;BR /&gt;- AIC from a stepwise forward regression.&amp;nbsp;&lt;BR /&gt;- AIC from a stepwise backward regression.&amp;nbsp;&lt;/P&gt;
&lt;P&gt;- If possible (I will read up on this later) AIC from a Lasso regression.&amp;nbsp;&lt;BR /&gt;&lt;BR /&gt;And then finally store the AIC values in a table of the format:&amp;nbsp;&lt;BR /&gt;&lt;BR /&gt;&lt;/P&gt;
&lt;TABLE width="629"&gt;
&lt;TBODY&gt;
&lt;TR&gt;
&lt;TD width="96"&gt;&amp;nbsp;&lt;/TD&gt;
&lt;TD width="230"&gt;AIC_Forward_Stepwise_Regression&lt;/TD&gt;
&lt;TD width="239"&gt;AIC_Backward_Stepwise_Regression&lt;/TD&gt;
&lt;TD width="64"&gt;AIC_Lasso&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TD&gt;Simulation1&amp;nbsp;&lt;/TD&gt;
&lt;TD&gt;&amp;nbsp;&lt;/TD&gt;
&lt;TD&gt;&amp;nbsp;&lt;/TD&gt;
&lt;TD&gt;&amp;nbsp;&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TD&gt;Simulation2&lt;/TD&gt;
&lt;TD&gt;&amp;nbsp;&lt;/TD&gt;
&lt;TD&gt;&amp;nbsp;&lt;/TD&gt;
&lt;TD&gt;&amp;nbsp;&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TD&gt;.&lt;/TD&gt;
&lt;TD&gt;&amp;nbsp;&lt;/TD&gt;
&lt;TD&gt;&amp;nbsp;&lt;/TD&gt;
&lt;TD&gt;&amp;nbsp;&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TD&gt;.&lt;/TD&gt;
&lt;TD&gt;&amp;nbsp;&lt;/TD&gt;
&lt;TD&gt;&amp;nbsp;&lt;/TD&gt;
&lt;TD&gt;&amp;nbsp;&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TD&gt;.&lt;/TD&gt;
&lt;TD&gt;&amp;nbsp;&lt;/TD&gt;
&lt;TD&gt;&amp;nbsp;&lt;/TD&gt;
&lt;TD&gt;&amp;nbsp;&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TD&gt;Simulation100&lt;/TD&gt;
&lt;TD&gt;&amp;nbsp;&lt;/TD&gt;
&lt;TD&gt;&amp;nbsp;&lt;/TD&gt;
&lt;TD&gt;&amp;nbsp;&lt;/TD&gt;
&lt;/TR&gt;
&lt;/TBODY&gt;
&lt;/TABLE&gt;
&lt;P&gt;&lt;BR /&gt;Ideally, I would also like to finally produce some summary statistics for evaluating which model-selection scheme performs best:&amp;nbsp;&lt;BR /&gt;&lt;BR /&gt;&lt;/P&gt;
&lt;TABLE width="629"&gt;
&lt;TBODY&gt;
&lt;TR&gt;
&lt;TD width="96"&gt;&amp;nbsp;&lt;/TD&gt;
&lt;TD width="230"&gt;Forward_Stepwise_Regression&lt;/TD&gt;
&lt;TD width="239"&gt;Backward_Stepwise_Regression&lt;/TD&gt;
&lt;TD width="64"&gt;Lasso&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TD&gt;Mean AIC&lt;/TD&gt;
&lt;TD&gt;&amp;nbsp;&lt;/TD&gt;
&lt;TD&gt;&amp;nbsp;&lt;/TD&gt;
&lt;TD&gt;&amp;nbsp;&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TD&gt;STD&amp;nbsp;&lt;/TD&gt;
&lt;TD&gt;&amp;nbsp;&lt;/TD&gt;
&lt;TD&gt;&amp;nbsp;&lt;/TD&gt;
&lt;TD&gt;&amp;nbsp;&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TD&gt;Median AIC&lt;/TD&gt;
&lt;TD&gt;&amp;nbsp;&lt;/TD&gt;
&lt;TD&gt;&amp;nbsp;&lt;/TD&gt;
&lt;TD&gt;&amp;nbsp;&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TD&gt;25% quantile&lt;/TD&gt;
&lt;TD&gt;&amp;nbsp;&lt;/TD&gt;
&lt;TD&gt;&amp;nbsp;&lt;/TD&gt;
&lt;TD&gt;&amp;nbsp;&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TD&gt;75% quantile&amp;nbsp;&lt;/TD&gt;
&lt;TD&gt;&amp;nbsp;&lt;/TD&gt;
&lt;TD&gt;&amp;nbsp;&lt;/TD&gt;
&lt;TD&gt;&amp;nbsp;&lt;/TD&gt;
&lt;/TR&gt;
&lt;/TBODY&gt;
&lt;/TABLE&gt;
&lt;P&gt;&lt;BR /&gt;This would be easy to do in other programming languages, and I guess in SAS as well, but I am not used to doing statistical analysis in SAS (yet).&amp;nbsp;&lt;BR /&gt;&lt;BR /&gt;All help appreciated.&amp;nbsp;&lt;/P&gt;</description>
    <pubDate>Fri, 07 Jan 2022 16:21:48 GMT</pubDate>
    <dc:creator>SasStatistics</dc:creator>
    <dc:date>2022-01-07T16:21:48Z</dc:date>
    <item>
      <title>Store AIC values from a Monte Carlo Simulation</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/Store-AIC-values-from-a-Monte-Carlo-Simulation/m-p/788912#M38668</link>
      <description>&lt;P&gt;I have the following simulated data which I create:&amp;nbsp;&lt;BR /&gt;&lt;BR /&gt;&lt;/P&gt;
&lt;PRE&gt;%macro monteCarloSimulation();

	%let covariates=300; /* Number of covariates (independent variables) */

	%do mcno=1 %to 100;   /* Number of simulated datasets = 100 */
		data logit_data;
		drop i j;
		array x{&amp;amp;covariates.} x1-x&amp;amp;covariates.;
		do i=1 to 1000;
		do j=1 to &amp;amp;covariates.;
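		x{j}=ranuni(&amp;amp;mcno.);  /* seed varies per iteration; a fixed seed of 1 restarts the same stream in every DATA step */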
		end;
		linpred=2+10*x17-8*x5+3*x2+7*x6-5*x3-12*x30+11*x130-12*x200+rand("NORMal");
		prob = exp(linpred)/ (1 + exp(linpred));
		y = (prob &amp;gt; 0.5);
		output;
		end;
		drop prob linpred;
		run;

		/* Here I would like to run stepwise forward regression
		and stepwise backward regression and store the corresponding AIC 
		values to produce the table referenced below. 
		This should be done for each table that I produce in the simulation.
		Note that 100 simulated tables are produced. */


	%end;

%mend monteCarloSimulation;

%monteCarloSimulation() &lt;/PRE&gt;
&lt;P&gt;From that simulated data, I would like for each simulated dataset to calculate:&amp;nbsp;&lt;BR /&gt;- AIC from a stepwise forward regression.&amp;nbsp;&lt;BR /&gt;- AIC from a stepwise backward regression.&amp;nbsp;&lt;/P&gt;
&lt;P&gt;- If possible (I will read up on this later) AIC from a Lasso regression.&amp;nbsp;&lt;BR /&gt;&lt;BR /&gt;And then finally store the AIC values in a table of the format:&amp;nbsp;&lt;BR /&gt;&lt;BR /&gt;&lt;/P&gt;
&lt;TABLE width="629"&gt;
&lt;TBODY&gt;
&lt;TR&gt;
&lt;TD width="96"&gt;&amp;nbsp;&lt;/TD&gt;
&lt;TD width="230"&gt;AIC_Forward_Stepwise_Regression&lt;/TD&gt;
&lt;TD width="239"&gt;AIC_Backward_Stepwise_Regression&lt;/TD&gt;
&lt;TD width="64"&gt;AIC_Lasso&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TD&gt;Simulation1&amp;nbsp;&lt;/TD&gt;
&lt;TD&gt;&amp;nbsp;&lt;/TD&gt;
&lt;TD&gt;&amp;nbsp;&lt;/TD&gt;
&lt;TD&gt;&amp;nbsp;&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TD&gt;Simulation2&lt;/TD&gt;
&lt;TD&gt;&amp;nbsp;&lt;/TD&gt;
&lt;TD&gt;&amp;nbsp;&lt;/TD&gt;
&lt;TD&gt;&amp;nbsp;&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TD&gt;.&lt;/TD&gt;
&lt;TD&gt;&amp;nbsp;&lt;/TD&gt;
&lt;TD&gt;&amp;nbsp;&lt;/TD&gt;
&lt;TD&gt;&amp;nbsp;&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TD&gt;.&lt;/TD&gt;
&lt;TD&gt;&amp;nbsp;&lt;/TD&gt;
&lt;TD&gt;&amp;nbsp;&lt;/TD&gt;
&lt;TD&gt;&amp;nbsp;&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TD&gt;.&lt;/TD&gt;
&lt;TD&gt;&amp;nbsp;&lt;/TD&gt;
&lt;TD&gt;&amp;nbsp;&lt;/TD&gt;
&lt;TD&gt;&amp;nbsp;&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TD&gt;Simulation100&lt;/TD&gt;
&lt;TD&gt;&amp;nbsp;&lt;/TD&gt;
&lt;TD&gt;&amp;nbsp;&lt;/TD&gt;
&lt;TD&gt;&amp;nbsp;&lt;/TD&gt;
&lt;/TR&gt;
&lt;/TBODY&gt;
&lt;/TABLE&gt;
&lt;P&gt;&lt;BR /&gt;Ideally, I would also like to finally produce some summary statistics for evaluating which model-selection scheme performs best:&amp;nbsp;&lt;BR /&gt;&lt;BR /&gt;&lt;/P&gt;
&lt;TABLE width="629"&gt;
&lt;TBODY&gt;
&lt;TR&gt;
&lt;TD width="96"&gt;&amp;nbsp;&lt;/TD&gt;
&lt;TD width="230"&gt;Forward_Stepwise_Regression&lt;/TD&gt;
&lt;TD width="239"&gt;Backward_Stepwise_Regression&lt;/TD&gt;
&lt;TD width="64"&gt;Lasso&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TD&gt;Mean AIC&lt;/TD&gt;
&lt;TD&gt;&amp;nbsp;&lt;/TD&gt;
&lt;TD&gt;&amp;nbsp;&lt;/TD&gt;
&lt;TD&gt;&amp;nbsp;&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TD&gt;STD&amp;nbsp;&lt;/TD&gt;
&lt;TD&gt;&amp;nbsp;&lt;/TD&gt;
&lt;TD&gt;&amp;nbsp;&lt;/TD&gt;
&lt;TD&gt;&amp;nbsp;&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TD&gt;Median AIC&lt;/TD&gt;
&lt;TD&gt;&amp;nbsp;&lt;/TD&gt;
&lt;TD&gt;&amp;nbsp;&lt;/TD&gt;
&lt;TD&gt;&amp;nbsp;&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TD&gt;25% quantile&lt;/TD&gt;
&lt;TD&gt;&amp;nbsp;&lt;/TD&gt;
&lt;TD&gt;&amp;nbsp;&lt;/TD&gt;
&lt;TD&gt;&amp;nbsp;&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TD&gt;75% quantile&amp;nbsp;&lt;/TD&gt;
&lt;TD&gt;&amp;nbsp;&lt;/TD&gt;
&lt;TD&gt;&amp;nbsp;&lt;/TD&gt;
&lt;TD&gt;&amp;nbsp;&lt;/TD&gt;
&lt;/TR&gt;
&lt;/TBODY&gt;
&lt;/TABLE&gt;
&lt;P&gt;&lt;BR /&gt;This would be easy to do in other programming languages, and I guess in SAS as well, but I am not used to doing statistical analysis in SAS (yet).&amp;nbsp;&lt;BR /&gt;&lt;BR /&gt;All help appreciated.&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Fri, 07 Jan 2022 16:21:48 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/Store-AIC-values-from-a-Monte-Carlo-Simulation/m-p/788912#M38668</guid>
      <dc:creator>SasStatistics</dc:creator>
      <dc:date>2022-01-07T16:21:48Z</dc:date>
    </item>
    <item>
      <title>Re: Store AIC values from a Monte Carlo Simulation</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/Store-AIC-values-from-a-Monte-Carlo-Simulation/m-p/788917#M38669</link>
      <description>&lt;P&gt;It's not clear to me which part of this process you are struggling with. Is the problem running the regressions, storing the AIC values, creating the final table, or something else?&lt;/P&gt;</description>
      <pubDate>Fri, 07 Jan 2022 16:28:00 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/Store-AIC-values-from-a-Monte-Carlo-Simulation/m-p/788917#M38669</guid>
      <dc:creator>PaigeMiller</dc:creator>
      <dc:date>2022-01-07T16:28:00Z</dc:date>
    </item>
    <item>
      <title>Re: Store AIC values from a Monte Carlo Simulation</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/Store-AIC-values-from-a-Monte-Carlo-Simulation/m-p/788918#M38670</link>
      <description>1. Running the regressions. &lt;BR /&gt;2. Storing the AIC values. &lt;BR /&gt;3. Creating the final table. &lt;BR /&gt;&lt;BR /&gt;I have very little experience with this in SAS.</description>
      <pubDate>Fri, 07 Jan 2022 16:30:54 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/Store-AIC-values-from-a-Monte-Carlo-Simulation/m-p/788918#M38670</guid>
      <dc:creator>SasStatistics</dc:creator>
      <dc:date>2022-01-07T16:30:54Z</dc:date>
    </item>
    <item>
      <title>Re: Store AIC values from a Monte Carlo Simulation</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/Store-AIC-values-from-a-Monte-Carlo-Simulation/m-p/788919#M38671</link>
      <description>&lt;P&gt;Step 1 in any macro-writing process is to write working code for one iteration, with no macros and no macro variables. That's where you start. Show us code that does stepwise regression on one iteration.&lt;/P&gt;</description>
      <pubDate>Fri, 07 Jan 2022 16:40:36 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/Store-AIC-values-from-a-Monte-Carlo-Simulation/m-p/788919#M38671</guid>
      <dc:creator>PaigeMiller</dc:creator>
      <dc:date>2022-01-07T16:40:36Z</dc:date>
    </item>
    <item>
      <title>Re: Store AIC values from a Monte Carlo Simulation</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/Store-AIC-values-from-a-Monte-Carlo-Simulation/m-p/788932#M38672</link>
      <description>&lt;P&gt;In addition to my above comments,&amp;nbsp;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/13684"&gt;@Rick_SAS&lt;/a&gt;&amp;nbsp;has written blog posts about performing &lt;A href="https://blogs.sas.com/content/iml/2017/02/13/run-1000-regressions.html" target="_self"&gt;thousands of regressions&lt;/A&gt; without any macros. It's highly likely that this approach could be adapted to your Monte Carlo case, again with no macros needed. He may even have written a similar post on Monte Carlo simulations; either way, I'm sure there is no need for macros here.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Taking a further step back: as I understand it, the primary reason people run Monte Carlo simulations is to estimate the variability of estimators that have no closed-form formula for that variability. In your case, you seem to be doing a Monte Carlo simulation with 300 covariates that are uncorrelated with each other. This corresponds to exactly zero real-world data sets: you will never find a real-world data set where the covariates are all uncorrelated (or even only slightly correlated). Every real-world data set I know of has some correlations that are not close to zero, and some that are close to (or exactly equal to) ±1. So I question the value of such a Monte Carlo study; a more valuable study would use covariates with many correlations that are not near zero and possibly some that are near ±1. My advice is not to do this particular Monte Carlo study as you have set it up, unless it is a homework assignment.&lt;/P&gt;</description>
      <pubDate>Fri, 07 Jan 2022 17:46:02 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/Store-AIC-values-from-a-Monte-Carlo-Simulation/m-p/788932#M38672</guid>
      <dc:creator>PaigeMiller</dc:creator>
      <dc:date>2022-01-07T17:46:02Z</dc:date>
    </item>
    <item>
      <title>Re: Store AIC values from a Monte Carlo Simulation</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/Store-AIC-values-from-a-Monte-Carlo-Simulation/m-p/788938#M38673</link>
      <description>&lt;P&gt;The basic outline for this kind of simulation follows:&lt;/P&gt;
&lt;P&gt;1. If you know how to use the DATA step to simulate one sample of size N from a logistic model, then put a DO loop around the outside so that you generate B samples, each of size N.&lt;/P&gt;
&lt;P&gt;For an example of a linear model, see &lt;A href="https://blogs.sas.com/content/iml/2017/02/01/simulate-samples-linear-regression.html" target="_self"&gt;"Simulate many samples from a linear regression model."&lt;/A&gt;&amp;nbsp;&lt;A href="https://blogs.sas.com/content/iml/2014/06/25/simulate-logistic-data.html" target="_self"&gt;For a logistic model, see the ideas in this post&lt;/A&gt;, although the actual simulation in that post uses PROC IML.&lt;/P&gt;
&lt;P&gt;2. &lt;A href="https://blogs.sas.com/content/iml/2013/05/24/turn-off-ods-for-simulations.html" target="_self"&gt;Turn off ODS&lt;/A&gt; and &lt;A href="https://blogs.sas.com/content/iml/2012/07/18/simulation-in-sas-the-slow-way-or-the-by-way.html" target="_self"&gt;use a BY-group analysis to analyze all B samples&lt;/A&gt; by using one call to a procedure.&lt;/P&gt;
&lt;P&gt;3. Use PROC MEANS or UNIVARIATE to analyze the distribution of the statistic (such as AIC) that you are studying.&lt;/P&gt;
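&lt;P&gt;As a rough, untested sketch of steps 2 and 3: the code below assumes the B samples are stacked in one data set, here called SimLogit, with an index variable SampleID, a binary response y, and covariates x1-x300. Those names, and the data set names FitFwd, FitBwd, AIC_Fwd, AIC_Bwd, and AIC_All, are made up for illustration. Note that SELECTION= in PROC LOGISTIC adds or removes effects based on significance levels, not AIC, so the value kept here is the AIC of the final selected model. The sketch also assumes that the last AIC row of the FitStatistics table within each BY group belongs to that final model.&lt;/P&gt;
&lt;PRE&gt;ods exclude all;                         /* suppress printed output; ODS OUTPUT still works */

ods output FitStatistics=FitFwd;         /* fit statistics for forward selection */
proc logistic data=SimLogit;
   by SampleID;                          /* one analysis per simulated sample */
   model y(event='1') = x1-x300 / selection=forward;
run;

ods output FitStatistics=FitBwd;         /* fit statistics for backward elimination */
proc logistic data=SimLogit;
   by SampleID;
   model y(event='1') = x1-x300 / selection=backward;
run;

ods exclude none;                        /* restore printed output */

/* Keep the last AIC row per sample (assumed to be the final model) */
data AIC_Fwd(keep=SampleID AIC_Forward);
   set FitFwd(where=(Criterion='AIC'));
   by SampleID;
   if last.SampleID;
   AIC_Forward = InterceptAndCovariates;
run;

data AIC_Bwd(keep=SampleID AIC_Backward);
   set FitBwd(where=(Criterion='AIC'));
   by SampleID;
   if last.SampleID;
   AIC_Backward = InterceptAndCovariates;
run;

/* One row per sample, one AIC column per selection method */
data AIC_All;
   merge AIC_Fwd AIC_Bwd;
   by SampleID;
run;

/* Step 3: Monte Carlo summary of the AIC values */
proc means data=AIC_All mean std median q1 q3;
   var AIC_Forward AIC_Backward;
run;&lt;/PRE&gt;
&lt;P&gt;The AIC_All data set has the layout of the first table in the question, and the PROC MEANS output covers the mean, standard deviation, median, and quartiles asked for in the second table.&lt;/P&gt;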
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I would like to point out that your simulation from a logistic model is not correct. You put the "randomness" in the wrong location. Instead of&lt;/P&gt;
&lt;PRE&gt;linpred = &amp;lt;linear combination&amp;gt; + rand("NORMal");
prob = exp(linpred)/ (1 + exp(linpred));
y = (prob &amp;gt; 0.5);&lt;/PRE&gt;
&lt;P&gt;the correct formulation is&lt;/P&gt;
&lt;PRE class="text"&gt;linpred = &amp;lt;linear combination&amp;gt;;     /* 2. linear model */
mu = logistic(linpred);             /* 3. transform by inverse logit */
y = rand("Bernoulli", mu);          /* 4. Simulate binary response */&lt;/PRE&gt;
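&lt;P&gt;Combined with the outer DO loop from step 1 of the outline, a minimal sketch of the whole simulation might look like the following. The macro variable names N, B, and p, the seed 12345, the data set name SimLogit, and the index variable SampleID are illustrative choices; the coefficients are copied from the original post.&lt;/P&gt;
&lt;PRE&gt;%let N = 1000;      /* observations per simulated sample */
%let B = 100;       /* number of simulated samples */
%let p = 300;       /* number of covariates */

data SimLogit;
   call streaminit(12345);               /* arbitrary seed, for reproducibility */
   array x{&amp;amp;p} x1-x&amp;amp;p;
   do SampleID = 1 to &amp;amp;B;            /* outer loop: one pass per sample */
      do i = 1 to &amp;amp;N;
         do j = 1 to &amp;amp;p;
            x{j} = rand("Uniform");      /* independent U(0,1) covariates */
         end;
         /* linear predictor with no added noise */
         eta = 2 + 10*x17 - 8*x5 + 3*x2 + 7*x6 - 5*x3
                 - 12*x30 + 11*x130 - 12*x200;
         mu = logistic(eta);             /* inverse logit gives P(y = 1) */
         y = rand("Bernoulli", mu);      /* randomness enters here */
         output;
      end;
   end;
   drop i j eta mu;
run;&lt;/PRE&gt;
&lt;P&gt;Because every sample is stored in one data set and indexed by SampleID, the result can be passed directly to a BY-group analysis with no macro loop.&lt;/P&gt;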
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Fri, 07 Jan 2022 18:21:57 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/Store-AIC-values-from-a-Monte-Carlo-Simulation/m-p/788938#M38673</guid>
      <dc:creator>Rick_SAS</dc:creator>
      <dc:date>2022-01-07T18:21:57Z</dc:date>
    </item>
  </channel>
</rss>

