<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Dataset structure for regression in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/Dataset-structure-for-regression/m-p/74407#M16030</link>
    <description>I would try to avoid 1000 data steps followed by 1000 calls to PROC REG.&lt;BR /&gt;
&lt;BR /&gt;
A data step view seems like a good choice and with a bit of help from the macro language to write it you can run one data step and one call to PROC REG.&lt;BR /&gt;
&lt;BR /&gt;
[pre]&lt;BR /&gt;
data A;&lt;BR /&gt;
   set sashelp.class;&lt;BR /&gt;
   run;&lt;BR /&gt;
&lt;BR /&gt;
*** You do not need this, as you already have 1000 simulated data sets;&lt;BR /&gt;
%macro simdata(arg);&lt;BR /&gt;
   data &lt;BR /&gt;
   %do i = 1 %to &amp;amp;arg;&lt;BR /&gt;
      A&amp;amp;i&lt;BR /&gt;
      %end;&lt;BR /&gt;
   ;&lt;BR /&gt;
   set sashelp.class(obs=5);&lt;BR /&gt;
   run;&lt;BR /&gt;
   %mend simdata;&lt;BR /&gt;
options mprint=1;&lt;BR /&gt;
&lt;BR /&gt;
%simdata(1000);&lt;BR /&gt;
&lt;BR /&gt;
*** Combine each SIM data with a copy of A;&lt;BR /&gt;
%macro combine(arg);&lt;BR /&gt;
   data all / view=all;&lt;BR /&gt;
      set &lt;BR /&gt;
         %do i = 1 %to &amp;amp;arg;&lt;BR /&gt;
            A&amp;amp;i A&lt;BR /&gt;
            %end;&lt;BR /&gt;
         indsname=indsname open=defer;&lt;BR /&gt;
&lt;BR /&gt;
   retain simgroup;&lt;BR /&gt;
   from = indsname;&lt;BR /&gt;
   if indsname ne 'WORK.A' then simgroup = indsname;&lt;BR /&gt;
   run;&lt;BR /&gt;
   %mend combine;&lt;BR /&gt;
%combine(1000);&lt;BR /&gt;
&lt;BR /&gt;
proc print data=all(obs=100);&lt;BR /&gt;
   run;&lt;BR /&gt;
&lt;BR /&gt;
&lt;BR /&gt;
proc reg data=all noprint outest=est;&lt;BR /&gt;
   by NOTSORTED simgroup;&lt;BR /&gt;
   model weight = age height;&lt;BR /&gt;
   run;&lt;BR /&gt;
   quit;&lt;BR /&gt;
[/pre]</description>
    <pubDate>Fri, 24 Sep 2010 03:57:42 GMT</pubDate>
    <dc:creator>data_null__</dc:creator>
    <dc:date>2010-09-24T03:57:42Z</dc:date>
    <item>
      <title>Dataset structure for regression</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Dataset-structure-for-regression/m-p/74405#M16028</link>
      <description>I have a dataset block, I call it A. &lt;BR /&gt;
A is the actual data, which is large.&lt;BR /&gt;
&lt;BR /&gt;
Then I have A_1, A_2, A_3, ..., A_n, which are simulated, smaller blocks of data that have the same number of columns as A but far fewer rows than A (n=1000, say).&lt;BR /&gt;
&lt;BR /&gt;
I want to regress using the dataset that has A_1 appended to A.&lt;BR /&gt;
Then, repeat with  A_2 appended to A, until A_n appended to A.&lt;BR /&gt;
&lt;BR /&gt;
Is there a good way to do this as efficiently as possible, without looping over data steps that append each of A_1, ..., A_n to A and then running the regression each time?&lt;BR /&gt;
&lt;BR /&gt;
Thank you.</description>
      <pubDate>Thu, 23 Sep 2010 20:22:07 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Dataset-structure-for-regression/m-p/74405#M16028</guid>
      <dc:creator>SAS_user_n</dc:creator>
      <dc:date>2010-09-23T20:22:07Z</dc:date>
    </item>
    <item>
      <title>Re: Dataset structure for regression</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Dataset-structure-for-regression/m-p/74406#M16029</link>
      <description>Not that I know of (using SAS for 27 years). I would just write a macro loop to append the data for each case and do any analysis. Something like:&lt;BR /&gt;
[pre]&lt;BR /&gt;
%macro loopy;&lt;BR /&gt;
    %do i=1 %to 1000;&lt;BR /&gt;
        data Aplus;&lt;BR /&gt;
            set A A_&amp;amp;i;&lt;BR /&gt;
        run;&lt;BR /&gt;
        proc reg data=Aplus;&lt;BR /&gt;
            *--- etc. ----;&lt;BR /&gt;
        run;&lt;BR /&gt;
    %end;&lt;BR /&gt;
%mend;&lt;BR /&gt;
%loopy&lt;BR /&gt;
[/pre]</description>
      <pubDate>Thu, 23 Sep 2010 22:34:42 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Dataset-structure-for-regression/m-p/74406#M16029</guid>
      <dc:creator>WaltSmith</dc:creator>
      <dc:date>2010-09-23T22:34:42Z</dc:date>
    </item>
    <item>
      <title>Re: Dataset structure for regression</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Dataset-structure-for-regression/m-p/74407#M16030</link>
      <description>I would try to avoid 1000 data steps followed by 1000 calls to PROC REG.&lt;BR /&gt;
&lt;BR /&gt;
A data step view seems like a good choice and with a bit of help from the macro language to write it you can run one data step and one call to PROC REG.&lt;BR /&gt;
&lt;BR /&gt;
[pre]&lt;BR /&gt;
data A;&lt;BR /&gt;
   set sashelp.class;&lt;BR /&gt;
   run;&lt;BR /&gt;
&lt;BR /&gt;
*** You do not need this, as you already have 1000 simulated data sets;&lt;BR /&gt;
%macro simdata(arg);&lt;BR /&gt;
   data &lt;BR /&gt;
   %do i = 1 %to &amp;amp;arg;&lt;BR /&gt;
      A&amp;amp;i&lt;BR /&gt;
      %end;&lt;BR /&gt;
   ;&lt;BR /&gt;
   set sashelp.class(obs=5);&lt;BR /&gt;
   run;&lt;BR /&gt;
   %mend simdata;&lt;BR /&gt;
options mprint=1;&lt;BR /&gt;
&lt;BR /&gt;
%simdata(1000);&lt;BR /&gt;
&lt;BR /&gt;
*** Combine each SIM data with a copy of A;&lt;BR /&gt;
%macro combine(arg);&lt;BR /&gt;
   data all / view=all;&lt;BR /&gt;
      set &lt;BR /&gt;
         %do i = 1 %to &amp;amp;arg;&lt;BR /&gt;
            A&amp;amp;i A&lt;BR /&gt;
            %end;&lt;BR /&gt;
         indsname=indsname open=defer;&lt;BR /&gt;
&lt;BR /&gt;
   retain simgroup;&lt;BR /&gt;
   from = indsname;&lt;BR /&gt;
   if indsname ne 'WORK.A' then simgroup = indsname;&lt;BR /&gt;
   run;&lt;BR /&gt;
   %mend combine;&lt;BR /&gt;
%combine(1000);&lt;BR /&gt;
&lt;BR /&gt;
proc print data=all(obs=100);&lt;BR /&gt;
   run;&lt;BR /&gt;
&lt;BR /&gt;
&lt;BR /&gt;
proc reg data=all noprint outest=est;&lt;BR /&gt;
   by NOTSORTED simgroup;&lt;BR /&gt;
   model weight = age height;&lt;BR /&gt;
   run;&lt;BR /&gt;
   quit;&lt;BR /&gt;
[/pre]</description>
      <pubDate>Fri, 24 Sep 2010 03:57:42 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Dataset-structure-for-regression/m-p/74407#M16030</guid>
      <dc:creator>data_null__</dc:creator>
      <dc:date>2010-09-24T03:57:42Z</dc:date>
    </item>
    <item>
      <title>Re: Dataset structure for regression</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Dataset-structure-for-regression/m-p/74408#M16031</link>
      <description>I absolutely agree with data_null__: I would want to avoid 1000 data steps and 1000 PROC REGs. There is almost always a better solution by restructuring the problem, one way being the one suggested. However, there are rare times when you just gotta bite the bullet and muscle through the 1000 or more steps.</description>
      <pubDate>Fri, 24 Sep 2010 17:11:46 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Dataset-structure-for-regression/m-p/74408#M16031</guid>
      <dc:creator>WaltSmith</dc:creator>
      <dc:date>2010-09-24T17:11:46Z</dc:date>
    </item>
    <item>
      <title>Re: Dataset structure for regression</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Dataset-structure-for-regression/m-p/74409#M16032</link>
      <description>I agree too.  &lt;BR /&gt;
&lt;BR /&gt;
I think the program will scale OK: using OPEN=DEFER should allow the data step to concatenate the data sets efficiently.&lt;BR /&gt;
&lt;BR /&gt;
For my 1000 simulated data sets, I was surprised that I could create all of them in a single step. If they had more variables it might be a problem.&lt;BR /&gt;
&lt;BR /&gt;
Even if you could not concatenate 1000 data sets in one step, the problem could probably be broken up into smaller groups of, say, 100 data sets.&lt;BR /&gt;
&lt;BR /&gt;
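A minimal, untested sketch of that grouping idea, assuming the same WORK.A and A1-A1000 names and the weight/age/height model from the earlier example. The macro names COMBINE_GROUP and RUN_GROUPS and the ALL_G and EST_G prefixes are hypothetical, and the EST_G: name-prefix list assumes no other WORK data sets start with EST_G:&lt;BR /&gt;
&lt;BR /&gt;
[pre]&lt;BR /&gt;
*** One view and one PROC REG per group of simulated data sets;&lt;BR /&gt;
%macro combine_group(first, last);&lt;BR /&gt;
   %local i;&lt;BR /&gt;
   data all_g&amp;amp;first / view=all_g&amp;amp;first;&lt;BR /&gt;
      set &lt;BR /&gt;
         %do i = &amp;amp;first %to &amp;amp;last;&lt;BR /&gt;
            A&amp;amp;i A&lt;BR /&gt;
            %end;&lt;BR /&gt;
         indsname=indsname open=defer;&lt;BR /&gt;
      retain simgroup;&lt;BR /&gt;
      if indsname ne 'WORK.A' then simgroup = indsname;&lt;BR /&gt;
      run;&lt;BR /&gt;
   proc reg data=all_g&amp;amp;first noprint outest=est_g&amp;amp;first;&lt;BR /&gt;
      by NOTSORTED simgroup;&lt;BR /&gt;
      model weight = age height;&lt;BR /&gt;
      run;&lt;BR /&gt;
      quit;&lt;BR /&gt;
   %mend combine_group;&lt;BR /&gt;
&lt;BR /&gt;
*** Step through the simulations SIZE at a time, then stack the estimates;&lt;BR /&gt;
%macro run_groups(n, size);&lt;BR /&gt;
   %local start last;&lt;BR /&gt;
   %do start = 1 %to &amp;amp;n %by &amp;amp;size;&lt;BR /&gt;
      %let last = %sysfunc(min(&amp;amp;n, %eval(&amp;amp;start + &amp;amp;size - 1)));&lt;BR /&gt;
      %combine_group(&amp;amp;start, &amp;amp;last)&lt;BR /&gt;
      %end;&lt;BR /&gt;
   data est;&lt;BR /&gt;
      set est_g:;&lt;BR /&gt;
      run;&lt;BR /&gt;
   %mend run_groups;&lt;BR /&gt;
%run_groups(1000, 100);&lt;BR /&gt;
[/pre]&lt;BR /&gt;
&lt;BR /&gt;
This would trade one PROC REG call for ten, which is still far fewer than 1000. The rows of EST follow the name order of the EST_G data sets rather than simulation order, so sort by SIMGROUP if the order matters.&lt;BR /&gt;
&lt;BR /&gt;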
I'd like to see the OP's program that created the 1000 simulated data sets.</description>
      <pubDate>Fri, 24 Sep 2010 17:46:16 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Dataset-structure-for-regression/m-p/74409#M16032</guid>
      <dc:creator>data_null__</dc:creator>
      <dc:date>2010-09-24T17:46:16Z</dc:date>
    </item>
  </channel>
</rss>

