<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic How do I split a dataset into smaller data set by variables in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/How-do-I-split-a-dataset-into-smaller-data-set-by-variables/m-p/331169#M74389</link>
    <description>&lt;P&gt;I've been asked to help someone split a dataset into smaller datasets by groups of variables.&amp;nbsp; The input dataset is 1 million records and 12,000 variables.&amp;nbsp; All they want to do is run a proc compare between 2 versions of this dataset, but it is too large for this to work.&amp;nbsp; Is there a way to split this into about 12 datasets with about 1000 variables in each?&amp;nbsp;&lt;/P&gt;</description>
    <pubDate>Thu, 09 Feb 2017 14:38:12 GMT</pubDate>
    <dc:creator>Barbara_66</dc:creator>
    <dc:date>2017-02-09T14:38:12Z</dc:date>
    <item>
      <title>How do I split a dataset into smaller data set by variables</title>
      <link>https://communities.sas.com/t5/SAS-Programming/How-do-I-split-a-dataset-into-smaller-data-set-by-variables/m-p/331169#M74389</link>
      <description>&lt;P&gt;I've been asked to help someone split a dataset into smaller datasets by groups of variables.&amp;nbsp; The input dataset is 1 million records and 12,000 variables.&amp;nbsp; All they want to do is run a proc compare between 2 versions of this dataset, but it is too large for this to work.&amp;nbsp; Is there a way to split this into about 12 datasets with about 1000 variables in each?&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Thu, 09 Feb 2017 14:38:12 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/How-do-I-split-a-dataset-into-smaller-data-set-by-variables/m-p/331169#M74389</guid>
      <dc:creator>Barbara_66</dc:creator>
      <dc:date>2017-02-09T14:38:12Z</dc:date>
    </item>
    <item>
      <title>Re: How do I split a dataset into smaller data set by variables</title>
      <link>https://communities.sas.com/t5/SAS-Programming/How-do-I-split-a-dataset-into-smaller-data-set-by-variables/m-p/331174#M74390</link>
      <description>&lt;P&gt;It's not going to be any quicker splitting the file up; in fact, taking into account the time to split the dataset, read/write, and loop over the datasets, it's most likely to take longer. &amp;nbsp;Why do you have a dataset with 12000 variables? &amp;nbsp;100 variables is way too many for a dataset. &amp;nbsp;Consider approaching that with an eye to re-modelling the data.&lt;/P&gt;</description>
      <pubDate>Thu, 09 Feb 2017 14:51:01 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/How-do-I-split-a-dataset-into-smaller-data-set-by-variables/m-p/331174#M74390</guid>
      <dc:creator>RW9</dc:creator>
      <dc:date>2017-02-09T14:51:01Z</dc:date>
    </item>
    <item>
      <title>Re: How do I split a dataset into smaller data set by variables</title>
      <link>https://communities.sas.com/t5/SAS-Programming/How-do-I-split-a-dataset-into-smaller-data-set-by-variables/m-p/331182#M74392</link>
      <description>It's not my data. It won't be quicker to split it up, but I can't even sort this dataset because I run out of workspace.&lt;BR /&gt;</description>
      <pubDate>Thu, 09 Feb 2017 15:08:27 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/How-do-I-split-a-dataset-into-smaller-data-set-by-variables/m-p/331182#M74392</guid>
      <dc:creator>Barbara_66</dc:creator>
      <dc:date>2017-02-09T15:08:27Z</dc:date>
    </item>
    <item>
      <title>Re: How do I split a dataset into smaller data set by variables</title>
      <link>https://communities.sas.com/t5/SAS-Programming/How-do-I-split-a-dataset-into-smaller-data-set-by-variables/m-p/331187#M74397</link>
      <description>&lt;P&gt;&lt;A href="http://blogs.sas.com/content/sasdummy/2015/01/26/how-to-split-one-data-set-into-many/" target="_blank"&gt;http://blogs.sas.com/content/sasdummy/2015/01/26/how-to-split-one-data-set-into-many/&lt;/A&gt;&lt;/P&gt;</description>
      <pubDate>Thu, 09 Feb 2017 15:21:24 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/How-do-I-split-a-dataset-into-smaller-data-set-by-variables/m-p/331187#M74397</guid>
      <dc:creator>FriedEgg</dc:creator>
      <dc:date>2017-02-09T15:21:24Z</dc:date>
    </item>
    <item>
      <title>Re: How do I split a dataset into smaller data set by variables</title>
      <link>https://communities.sas.com/t5/SAS-Programming/How-do-I-split-a-dataset-into-smaller-data-set-by-variables/m-p/331195#M74401</link>
      <description>This example splits the data into groups by observations, but I want to maintain all the observations and create datasets that only have 10% of the variables.</description>
      <pubDate>Thu, 09 Feb 2017 15:33:16 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/How-do-I-split-a-dataset-into-smaller-data-set-by-variables/m-p/331195#M74401</guid>
      <dc:creator>Barbara_66</dc:creator>
      <dc:date>2017-02-09T15:33:16Z</dc:date>
    </item>
    <item>
      <title>Re: How do I split a dataset into smaller data set by variables</title>
      <link>https://communities.sas.com/t5/SAS-Programming/How-do-I-split-a-dataset-into-smaller-data-set-by-variables/m-p/331198#M74402</link>
      <description>Sorry, I missed that.&lt;BR /&gt;&lt;BR /&gt;You should sort using tagsort if you are running out of space.&lt;BR /&gt;&lt;BR /&gt;&lt;A href="https://support.sas.com/documentation/cdl/en/proc/61895/HTML/default/viewer.htm#a000146878.htm" target="_blank"&gt;https://support.sas.com/documentation/cdl/en/proc/61895/HTML/default/viewer.htm#a000146878.htm&lt;/A&gt;</description>
      <pubDate>Thu, 09 Feb 2017 15:40:08 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/How-do-I-split-a-dataset-into-smaller-data-set-by-variables/m-p/331198#M74402</guid>
      <dc:creator>FriedEgg</dc:creator>
      <dc:date>2017-02-09T15:40:08Z</dc:date>
    </item>
    <item>
      <title>Re: How do I split a dataset into smaller data set by variables</title>
      <link>https://communities.sas.com/t5/SAS-Programming/How-do-I-split-a-dataset-into-smaller-data-set-by-variables/m-p/331200#M74403</link>
      <description>&lt;BLOCKQUOTE&gt;&lt;HR /&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/128219"&gt;@Barbara_66&lt;/a&gt; wrote:&lt;BR /&gt;
&lt;P&gt;I've been asked to help someone split a dataset into smaller datasets by groups of variables.&amp;nbsp; The input dataset is 1 million records and 12,000 variables.&amp;nbsp; All they want to do is run a proc compare between 2 versions of this dataset, but it is too large for this to work.&amp;nbsp; Is there a way to split this into about 12 datasets with about 1000 variables in each?&amp;nbsp;&lt;/P&gt;
&lt;HR /&gt;&lt;/BLOCKQUOTE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Note: I hope the two sets are sorted by the same variables before you start.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;If there are specific variables of interest, you can reduce the problem by bringing in only the variables of concern (KEEP= dataset option) and/or using the VAR and WITH statements, or by restricting the observations with the FIRSTOBS=, OBS= or WHERE= dataset options.&lt;/P&gt;
&lt;P&gt;Some examples:&lt;/P&gt;
&lt;PRE&gt;proc compare base   =dataset1 (where=(groupvar='3'))
             compare=dataset2 (where=(groupvar='3'))
;
   var basevar1 basevar2;
   with comparevar1 comparevar2;
run;

proc compare base   =dataset1 (where=(groupvar='3'))
             compare=dataset2 (where=(groupvar='3'))
;
run;

proc compare base   =dataset1 (firstobs=10000 obs=20000)
             compare=dataset2 (firstobs=10000 obs=20000)
;
run;

proc compare base   =dataset1 (keep=var1 var2 var3 var4)
             compare=dataset2 (keep=var1 var2 var3 var4)
;
run;

&lt;/PRE&gt;
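&lt;P&gt;The KEEP= idea can probably be looped over blocks of variables without physically splitting anything. A rough sketch, assuming both datasets are in WORK, share the same variable names, and use ordinary SAS names; the macro name compare_chunks and its parameters are made up for illustration:&lt;/P&gt;
&lt;PRE&gt;/* Hypothetical helper: compare two WORK datasets one block of variables at a time */
%macro compare_chunks(base=dataset1, comp=dataset2, chunk=1000);
   %local nvars lo hi keeplist;

   /* Count the variables in the base dataset */
   proc sql noprint;
      select count(*) into :nvars trimmed
      from dictionary.columns
      where libname='WORK' and memname="%upcase(&amp;amp;base)";
   quit;

   %do lo=1 %to &amp;amp;nvars %by &amp;amp;chunk;
      %let hi=%eval(&amp;amp;lo + &amp;amp;chunk - 1);

      /* Build the KEEP= list for the variables in positions lo..hi */
      proc sql noprint;
         select name into :keeplist separated by ' '
         from dictionary.columns
         where libname='WORK' and memname="%upcase(&amp;amp;base)"
           and varnum between &amp;amp;lo and &amp;amp;hi
         order by varnum;
      quit;

      proc compare base   =&amp;amp;base (keep=&amp;amp;keeplist)
                   compare=&amp;amp;comp (keep=&amp;amp;keeplist)
      ;
      run;
   %end;
%mend compare_chunks;

%compare_chunks(base=dataset1, comp=dataset2, chunk=1000)
&lt;/PRE&gt;
&lt;P&gt;If the comparison relies on key variables, they would need to be added to every KEEP= list as well.&lt;/P&gt;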
&lt;P&gt;Also, which output options are they using? No one is going to manually look through millions of lines of output.&lt;/P&gt;
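&lt;P&gt;For instance, something along these lines writes only the observations that differ to a dataset instead of the listing. This is a sketch: keyvar is a hypothetical ID variable (both datasets are assumed to be sorted by it) and diffs is just a placeholder name.&lt;/P&gt;
&lt;PRE&gt;proc compare base   =dataset1
             compare=dataset2
             out=diffs outnoequal outbase outcomp outdif
             noprint
;
   id keyvar;   /* hypothetical key variable */
run;

/* Read SYSINFO immediately after the step; 0 means no differences were found */
%put PROC COMPARE return code: &amp;amp;sysinfo;
&lt;/PRE&gt;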
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Thu, 09 Feb 2017 15:43:35 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/How-do-I-split-a-dataset-into-smaller-data-set-by-variables/m-p/331200#M74403</guid>
      <dc:creator>ballardw</dc:creator>
      <dc:date>2017-02-09T15:43:35Z</dc:date>
    </item>
    <item>
      <title>Re: How do I split a dataset into smaller data set by variables</title>
      <link>https://communities.sas.com/t5/SAS-Programming/How-do-I-split-a-dataset-into-smaller-data-set-by-variables/m-p/331210#M74407</link>
      <description>&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;%let lib=SASHELP;
%let data=CARS;
%let keys=MAKE MODEL;   /* key variables kept in every split */
%let splits=3;          /* number of output datasets */

/* Turn a space-delimited list into a quoted list, e.g. ("MAKE" "MODEL") */
%macro qlist(list,dlm=%str( ));
(%unquote(%str(%")%qsysfunc(tranwrd(&amp;amp;list,%str( ),%str("&amp;amp;dlm")))%str(%")))
%mend;

data _null_;
   call streaminit(23463);
   array v[&amp;amp;splits] $ 32767 _temporary_;

   /* Randomly assign each non-key variable to one of the splits */
   do until (done);
      set sashelp.vcolumn end=done;
      where libname="%upcase(&amp;amp;lib)" and
            memname="%upcase(&amp;amp;data)" and
            upcase(name) not in %qlist(&amp;amp;keys);

      i=(ceil(&amp;amp;splits * rand('uniform')));
      v[i] = catx(' ', v[i], name);
   end;

   /* Generate one data step per split: the keys plus that split's variables */
   do i=1 to &amp;amp;splits;
      call execute('data ' || cats("&amp;amp;data",i) || ';'
         || "set &amp;amp;lib..&amp;amp;data (keep=&amp;amp;keys " || strip(v[i]) || ');'
         || 'run;');
      call symputx(cats('keep',i),v[i]);
   end;
run;
&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Thu, 09 Feb 2017 16:05:22 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/How-do-I-split-a-dataset-into-smaller-data-set-by-variables/m-p/331210#M74407</guid>
      <dc:creator>FriedEgg</dc:creator>
      <dc:date>2017-02-09T16:05:22Z</dc:date>
    </item>
    <item>
      <title>Re: How do I split a dataset into smaller data set by variables</title>
      <link>https://communities.sas.com/t5/SAS-Programming/How-do-I-split-a-dataset-into-smaller-data-set-by-variables/m-p/331241#M74416</link>
      <description>&lt;BLOCKQUOTE&gt;&lt;HR /&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/128219"&gt;@Barbara_66&lt;/a&gt; wrote:&lt;BR /&gt;It's not my data. It won't be quicker to split it up, but I can't even sort this dataset because I run out of workspace.&lt;BR /&gt;&lt;HR /&gt;&lt;/BLOCKQUOTE&gt;
&lt;P&gt;Why do you think you need to sort the data?&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Thu, 09 Feb 2017 17:04:31 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/How-do-I-split-a-dataset-into-smaller-data-set-by-variables/m-p/331241#M74416</guid>
      <dc:creator>data_null__</dc:creator>
      <dc:date>2017-02-09T17:04:31Z</dc:date>
    </item>
    <item>
      <title>Re: How do I split a dataset into smaller data set by variables</title>
      <link>https://communities.sas.com/t5/SAS-Programming/How-do-I-split-a-dataset-into-smaller-data-set-by-variables/m-p/331259#M74427</link>
      <description>&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;* splitting sashelp.cars into 'make' datasets;

MAKE             Frequency
-----------------------------
Acura                   7
Audi                   19
BMW                    20
Buick                   9
Cadillac                8
...
Suzuki                  8
Toyota                 28
Volkswagen             15
Volvo                  12


%symdel make; * just in case;
data _null_;
   set sashelp.cars;
   by make;
   if first.make then do;
     call symputx('make',make);
     rc=dosubl('
        data &amp;amp;make;
           set sashelp.cars(where=(make="&amp;amp;make"));
        run;quit;
     ');
   end;
run;quit;

SYMBOLGEN:  Macro variable MAKE resolves to Acura
SYMBOLGEN:  Macro variable MAKE resolves to Acura
NOTE: There were 7 observations read from the data set SASHELP.CARS.
      WHERE make='Acura';
...
SYMBOLGEN:  Macro variable MAKE resolves to Volvo
SYMBOLGEN:  Macro variable MAKE resolves to Volvo
NOTE: There were 12 observations read from the data set SASHELP.CARS.
      WHERE make='Volvo';



&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Thu, 09 Feb 2017 17:49:37 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/How-do-I-split-a-dataset-into-smaller-data-set-by-variables/m-p/331259#M74427</guid>
      <dc:creator>rogerjdeangelis</dc:creator>
      <dc:date>2017-02-09T17:49:37Z</dc:date>
    </item>
    <item>
      <title>Re: How do I split a dataset into smaller data set by variables</title>
      <link>https://communities.sas.com/t5/SAS-Programming/How-do-I-split-a-dataset-into-smaller-data-set-by-variables/m-p/331263#M74430</link>
      <description>@rogerjdeangelis, this splits the data by observations (rows).  The user wanted a split by variables (columns).  I made the same mistake originally.  However, as many have already commented, splitting this file at all makes little sense, and there are numerous ways the OP can approach their actual problem without splitting.</description>
      <pubDate>Thu, 09 Feb 2017 17:57:19 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/How-do-I-split-a-dataset-into-smaller-data-set-by-variables/m-p/331263#M74430</guid>
      <dc:creator>FriedEgg</dc:creator>
      <dc:date>2017-02-09T17:57:19Z</dc:date>
    </item>
  </channel>
</rss>

