<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: How to programmatically create multiple subsets based on conditions? in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/How-to-programmatically-create-multiple-subsets-based-on/m-p/560663#M156841</link>
    <description>The macro %dothis was very cool, thank you! I will keep in mind not to make the subsets unless it's necessary to do so. Thanks for the suggestions and reminders!</description>
    <pubDate>Tue, 21 May 2019 20:44:53 GMT</pubDate>
    <dc:creator>aaronh</dc:creator>
    <dc:date>2019-05-21T20:44:53Z</dc:date>
    <item>
      <title>How to programmatically create multiple subsets based on conditions?</title>
      <link>https://communities.sas.com/t5/SAS-Programming/How-to-programmatically-create-multiple-subsets-based-on/m-p/560615#M156813</link>
      <description>&lt;P&gt;(My post here may seem verbose, but I hope that it will at least be clearly stated with the help of the context, the example dataset, and my codes.)&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;CONTEXT: I have a very large dataset that contains a date variable, and I am trying to come up with a way to subset this large dataset BY the year of this date variable (i.e. so that one subset contains ALL rows for 2018, and another contains ALL rows for 2019. So on and so forth).&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;IDEA: Instead of using an &lt;EM&gt;if ... then output&lt;/EM&gt; for each of the years in my data, I have attempted (and failed) to use a do loop and macro variables. (I'd also welcome other approaches that will get me the same results.)&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Just to illustrate my idea, assume that we have an imaginary table called 'test' that contains the birth year of a few kids:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;&lt;STRONG&gt;&lt;FONT face="courier new,courier"&gt;data&lt;/FONT&gt;&lt;/STRONG&gt;&lt;FONT face="courier new,courier"&gt; test;&lt;/FONT&gt;
&lt;FONT face="courier new,courier"&gt;&lt;FONT color="#3366FF"&gt;input&lt;/FONT&gt; yr;
&lt;FONT color="#3366FF"&gt;datalines&lt;/FONT&gt;;&lt;/FONT&gt;
&lt;FONT face="courier new,courier"&gt;2019
2019
2018
2018
2018
2017
2016
2016
2015
2015
;&lt;/FONT&gt;&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;The table above contains years from 2015 to 2019 (spanning 5 years). And I've written the following codes in an attempt to programmatically to create 5 subsets by the birth year:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;&lt;FONT face="courier new,courier"&gt;&lt;FONT color="#008000"&gt;** To save the max and min year as macro var **;&lt;/FONT&gt;
&lt;STRONG&gt;proc sql&lt;/STRONG&gt; &lt;FONT color="#3366FF"&gt;noprint&lt;/FONT&gt;;
  &lt;FONT color="#3366FF"&gt;select&lt;/FONT&gt; min(yr), max(yr)
  &lt;FONT color="#3366FF"&gt;into&lt;/FONT&gt; :min_yr, 
       :max_yr
  &lt;FONT color="#3366FF"&gt;from&lt;/FONT&gt; test;
&lt;STRONG&gt;quit;&lt;/STRONG&gt;

&lt;FONT color="#3366FF"&gt;%let&lt;/FONT&gt; int = &lt;FONT color="#3366FF"&gt;%eval&lt;/FONT&gt;(&amp;amp;max_yr - &amp;amp;min_yr); &lt;FONT color="#008000"&gt;**int denotes the spanning interval**;&lt;/FONT&gt;&lt;/FONT&gt;

&lt;FONT face="courier new,courier"&gt;&lt;FONT color="#008000"&gt;** Subset using do loop **;&lt;/FONT&gt;
&lt;STRONG&gt;data&lt;/STRONG&gt; &lt;FONT color="#3366FF"&gt;_null_&lt;/FONT&gt;;
  &lt;FONT color="#3366FF"&gt;set&lt;/FONT&gt; test;
  &lt;FONT color="#3366FF"&gt;do&lt;/FONT&gt; i=&lt;FONT color="#339966"&gt;&lt;STRONG&gt;0&lt;/STRONG&gt;&lt;/FONT&gt; to &amp;amp;&lt;FONT color="#339966"&gt;int.&lt;/FONT&gt;;
    &lt;FONT color="#3366FF"&gt;call&lt;/FONT&gt; symput(&lt;FONT color="#993366"&gt;'iter'&lt;/FONT&gt;, i);
    &lt;FONT color="#3366FF"&gt;if&lt;/FONT&gt; yr = &lt;FONT color="#3366FF"&gt;%eval&lt;/FONT&gt;(&amp;amp;&lt;FONT color="#339966"&gt;min_yr.&lt;/FONT&gt; + i) &lt;FONT color="#3366FF"&gt;then output&lt;/FONT&gt; &lt;FONT color="#993366"&gt;"data&amp;amp;iter."&lt;/FONT&gt;;
  &lt;FONT color="#3366FF"&gt;end;&lt;/FONT&gt;
&lt;STRONG&gt;run;&lt;/STRONG&gt;&lt;/FONT&gt;&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;Unfortunately, my codes did not work, because inside the %eval( ) function, the "i" was not assigned a value. And SAS log had an error message that reads: &lt;FONT color="#993366"&gt;A character operand was found in the %eval function ... where a numeric operand is required.&lt;/FONT&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Sorry for making this so lengthy, but I'd really appreciate some help.&lt;/P&gt;</description>
      <pubDate>Tue, 21 May 2019 18:37:19 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/How-to-programmatically-create-multiple-subsets-based-on/m-p/560615#M156813</guid>
      <dc:creator>aaronh</dc:creator>
      <dc:date>2019-05-21T18:37:19Z</dc:date>
    </item>
    <item>
      <title>Re: How to programmatically create multiple subsets based on conditions?</title>
      <link>https://communities.sas.com/t5/SAS-Programming/How-to-programmatically-create-multiple-subsets-based-on/m-p/560616#M156814</link>
      <description>&lt;PRE&gt;if yr = %eval(&amp;amp;min_yr. + i) then output "data&amp;amp;iter.";&lt;/PRE&gt;
&lt;P&gt;The %eval function needs macro variables or numbers to work properly. It doesn't know what to do with the data step variable&lt;FONT face="courier new,courier"&gt; i&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;When trying to get macros like this to work, you first need to have a data step that works without macros and without macro variables. Then, you should be able to substitute the macro variables and macro code to get it to work. Then, also, you could see that putting a name of a data set inside double quotes, as you did above, is not necessary, and will not work.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;So, this works without macro variables and without macros (and I have saved myself some typing by assuming the only years are 2015 2016 and 2017).&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data data2015 data2016 data2017;
  set test;
  if yr = 2015 then output data2015;
  if yr = 2016 then output data2016;
  if yr = 2017 then output data2017;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;So now you can turn this into a macro by placing macro loops in the places where the years are shown above. The macro processor really just replaces text upon program execution, and so you want a macro variable (let's call it &amp;amp;y) to have values (in the loop) going from 2015 to 2017.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;%macro dothis;
     data %do y=&amp;amp;min_yr %to &amp;amp;max_yr; data&amp;amp;y %end; ;
         set test;
         %do y=&amp;amp;min_yr %to &amp;amp;max_yr;
              if yr=&amp;amp;y then output data&amp;amp;y;
          %end;
       run;
%mend;
%dothis&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Finally, as a general comment, but usually an important concept that probably applies here ... it is inefficient to split up data sets into yearly data sets. If you need analyses by year, and you keep the entire data set whole (instead of splitting it up), you can get yearly analyses by using the BY command in your SAS code. If I were you, I would see if this works satisfactorily before considering splitting the data up into data sets by year.&amp;nbsp;In other words, my advice to you is: &lt;STRONG&gt;DON'T DO THIS&lt;/STRONG&gt; unless you are convinced that there's no way to get what you want with a single large data set.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Tue, 21 May 2019 19:27:14 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/How-to-programmatically-create-multiple-subsets-based-on/m-p/560616#M156814</guid>
      <dc:creator>PaigeMiller</dc:creator>
      <dc:date>2019-05-21T19:27:14Z</dc:date>
    </item>
    <item>
      <title>Re: How to programmatically create multiple subsets based on conditions?</title>
      <link>https://communities.sas.com/t5/SAS-Programming/How-to-programmatically-create-multiple-subsets-based-on/m-p/560619#M156816</link>
      <description>&lt;P&gt;You are close, but as you saw you cannot use mix data step and macro logic in that way.&lt;/P&gt;
&lt;P&gt;It is easier to just have the data step generate the code.&amp;nbsp; One way to generate code from a data step is to use CALL EXECUTE().&lt;/P&gt;
&lt;P&gt;So first get the range of years.&amp;nbsp; Do you want all years from min to max, even those without any records?&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;proc summary data=test ;
  var yr ;
  output out=range min=min_yr max=max_yr;
run;
data _null_;
   set range;
   call execute('data');
   do yr=min_yr to max_yr ;
      call execute(' '||cats('data',yr));
   end;
   call execute(';set test;');
   do yr=min_yr to max_yr;
      call execute(catx(' ','if yr=',yr,'then output',cats('data',yr),';'));
    end;
   call execute('run;');
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;Or just the years that exist?&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;proc summary data=test nway;
  class yr ;
  output out=range;
run;
data _null_;
   call execute('data');
   do until(eof1);
     set range end=eof1;
     call execute(' '||cats('data',yr));
   end;
   call execute(';set test;');
   do until(eof2);
     set range end=eof2;
     call execute(catx(' ','if yr=',yr,'then output',cats('data',yr),';'));
   end;
   call execute('run;');
run;&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Tue, 21 May 2019 18:57:06 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/How-to-programmatically-create-multiple-subsets-based-on/m-p/560619#M156816</guid>
      <dc:creator>Tom</dc:creator>
      <dc:date>2019-05-21T18:57:06Z</dc:date>
    </item>
    <item>
      <title>Re: How to programmatically create multiple subsets based on conditions?</title>
      <link>https://communities.sas.com/t5/SAS-Programming/How-to-programmatically-create-multiple-subsets-based-on/m-p/560621#M156818</link>
      <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/274317"&gt;@aaronh&lt;/a&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Index your yr column and that will do an implicit sort and then hash is a breeze&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;
data test;
input yr;
datalines;
2019
2019
2018
2018
2018
2017
2016
2016
2015
2015
;


proc sql ;                                                              
  create index  yr on test (yr) ;                      
quit ;                                                                  
                                                                        
data _null_ ;                                                           
  if _n_ = 1 then do ;                                                  
    dcl hash h () ;                                                     
    h.definekey  ("_n_") ;                                              
    h.definedata ('yr') ;
    h.definedone () ;                                                   
  end ;                                                                 
  do _n_ = 1 by 1 until (last.yr) ;                            
    set test ;                                                          
    by yr ;                                                    
    h.add() ;                                                           
  end ;                                                                 
  h.output (dataset: catx ("_", "year", yr)) ;          
  h.clear() ;                                                           
run ;   &lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Tue, 21 May 2019 19:07:01 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/How-to-programmatically-create-multiple-subsets-based-on/m-p/560621#M156818</guid>
      <dc:creator>novinosrin</dc:creator>
      <dc:date>2019-05-21T19:07:01Z</dc:date>
    </item>
    <item>
      <title>Re: How to programmatically create multiple subsets based on conditions?</title>
      <link>https://communities.sas.com/t5/SAS-Programming/How-to-programmatically-create-multiple-subsets-based-on/m-p/560625#M156819</link>
      <description>&lt;P&gt;If your date variable is an actual SAS date variable for a very large number of purposes you can create groups directly as you describe without adding any variable just use the correct format:&lt;/P&gt;
&lt;PRE&gt;data example;
   do date = '01Jan1960'd to '18Jul2020'd;
   output;
   end;
run;

proc freq data=example;
   tables date;
   format date year4.;
run;&lt;/PRE&gt;
&lt;P&gt;or&lt;/P&gt;
&lt;PRE&gt;proc freq data=example;
   tables date;
   format date yyq6.;
run;&lt;/PRE&gt;
&lt;P&gt;or&lt;/P&gt;
&lt;PRE&gt;proc freq data=example;
   tables date;
   format date yymon.;
run;&lt;/PRE&gt;
&lt;P&gt;Groups created by a format applied to a SAS date variable are honored by most analysis and graphing procedures as well as reporting and summarizing such as Proc Freq , Tabulate, Report.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Or leave the data together and only subset as needed:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;proc freq data=example;
   where year(date) &amp;gt; 2010;
   tables date;
   format date yymon.;
run;&lt;/PRE&gt;</description>
      <pubDate>Tue, 21 May 2019 19:34:38 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/How-to-programmatically-create-multiple-subsets-based-on/m-p/560625#M156819</guid>
      <dc:creator>ballardw</dc:creator>
      <dc:date>2019-05-21T19:34:38Z</dc:date>
    </item>
    <item>
      <title>Re: How to programmatically create multiple subsets based on conditions?</title>
      <link>https://communities.sas.com/t5/SAS-Programming/How-to-programmatically-create-multiple-subsets-based-on/m-p/560632#M156824</link>
      <description>Thank you so much, Paige! The only reason I was trying to use a do loop here, instead of the if then output, is because I will likely use the same procedure again, sometimes without knowing exactly what years are present in my data. But I can always use the data step if then output if I have no better ways to do this. Thanks again!</description>
      <pubDate>Tue, 21 May 2019 19:52:22 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/How-to-programmatically-create-multiple-subsets-based-on/m-p/560632#M156824</guid>
      <dc:creator>aaronh</dc:creator>
      <dc:date>2019-05-21T19:52:22Z</dc:date>
    </item>
    <item>
      <title>Re: How to programmatically create multiple subsets based on conditions?</title>
      <link>https://communities.sas.com/t5/SAS-Programming/How-to-programmatically-create-multiple-subsets-based-on/m-p/560640#M156828</link>
      <description>&lt;BLOCKQUOTE&gt;&lt;HR /&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/274317"&gt;@aaronh&lt;/a&gt;&amp;nbsp;wrote:&lt;BR /&gt;Thank you so much, Paige! The only reason I was trying to use a do loop here, instead of the if then output, is because I will likely use the same procedure again, sometimes without knowing exactly what years are present in my data. But I can always use the data step if then output if I have no better ways to do this. Thanks again!&lt;HR /&gt;&lt;/BLOCKQUOTE&gt;
&lt;P&gt;Except I also advised you to NOT break up the data set into data sets by year, except as a last resort. I think what you are trying to do is unnecessary extra work that only slows things down in the long run, IN MOST CASES.&lt;/P&gt;</description>
      <pubDate>Tue, 21 May 2019 20:14:02 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/How-to-programmatically-create-multiple-subsets-based-on/m-p/560640#M156828</guid>
      <dc:creator>PaigeMiller</dc:creator>
      <dc:date>2019-05-21T20:14:02Z</dc:date>
    </item>
    <item>
      <title>Re: How to programmatically create multiple subsets based on conditions?</title>
      <link>https://communities.sas.com/t5/SAS-Programming/How-to-programmatically-create-multiple-subsets-based-on/m-p/560649#M156833</link>
      <description>Why are you trying to split it up? If you're exporting, using ODS EXCEL with BY group processing can put each data set to a unique worksheet automatically. Or there are many other options. Usually the only reason to split it up, it's too big for SAS UE or a statistical proc that relies on memory, or for proc optmodel. Otherwise, there's almost always a better approach.</description>
      <pubDate>Tue, 21 May 2019 20:24:44 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/How-to-programmatically-create-multiple-subsets-based-on/m-p/560649#M156833</guid>
      <dc:creator>Reeza</dc:creator>
      <dc:date>2019-05-21T20:24:44Z</dc:date>
    </item>
    <item>
      <title>Re: How to programmatically create multiple subsets based on conditions?</title>
      <link>https://communities.sas.com/t5/SAS-Programming/How-to-programmatically-create-multiple-subsets-based-on/m-p/560663#M156841</link>
      <description>The macro %dothis was very cool, thank you! I will keep in mind not to make the subsets unless it's necessary to do so. Thanks for the suggestions and reminders!</description>
      <pubDate>Tue, 21 May 2019 20:44:53 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/How-to-programmatically-create-multiple-subsets-based-on/m-p/560663#M156841</guid>
      <dc:creator>aaronh</dc:creator>
      <dc:date>2019-05-21T20:44:53Z</dc:date>
    </item>
    <item>
      <title>Re: How to programmatically create multiple subsets based on conditions?</title>
      <link>https://communities.sas.com/t5/SAS-Programming/How-to-programmatically-create-multiple-subsets-based-on/m-p/560667#M156843</link>
      <description>p.s. thanks for providing the thought process behind how you arrived at the solution. It really helped me understand why you did these.</description>
      <pubDate>Tue, 21 May 2019 21:01:19 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/How-to-programmatically-create-multiple-subsets-based-on/m-p/560667#M156843</guid>
      <dc:creator>aaronh</dc:creator>
      <dc:date>2019-05-21T21:01:19Z</dc:date>
    </item>
  </channel>
</rss>

