<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic RE: Splitting large data by variables in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/RE-Splitting-large-data-by-variables/m-p/164277#M300311</link>
    <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;&lt;SPAN lang="EN" style="color: #333333; line-height: 115%; font-family: 'Helvetica','sans-serif'; font-size: 10pt; mso-ansi-language: EN;"&gt;Hello friends, &lt;BR data-jive-statusinputadd="true" data-jive-truncation-flag="true" /&gt;&lt;BR /&gt;I need some help with data management. I have a large dataset of 17375 observations and 3997 variables. I wish to split this data into three sets of 17375 observations and 1333 variables each, while retaining all the observations and the unique identification code for future re-merging.&lt;BR data-jive-statusinputadd="true" data-jive-truncation-flag="true" /&gt;&lt;BR /&gt;I would like help developing the SAS code to do the splitting.&lt;BR data-jive-statusinputadd="true" data-jive-truncation-flag="true" /&gt;&lt;BR /&gt;Thanks in advance; I would appreciate your assistance.&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN lang="EN" style="color: #333333; line-height: 115%; font-family: 'Helvetica','sans-serif'; font-size: 10pt; mso-ansi-language: EN;"&gt;&lt;/SPAN&gt; &lt;/P&gt;&lt;P&gt;&lt;SPAN lang="EN" style="color: #333333; line-height: 115%; font-family: 'Helvetica','sans-serif'; font-size: 10pt; mso-ansi-language: EN;"&gt;Fred&lt;/SPAN&gt;&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
    <pubDate>Sun, 18 May 2014 17:05:35 GMT</pubDate>
    <dc:creator>thafu</dc:creator>
    <dc:date>2014-05-18T17:05:35Z</dc:date>
    <item>
      <title>RE: Splitting large data by variables</title>
      <link>https://communities.sas.com/t5/SAS-Programming/RE-Splitting-large-data-by-variables/m-p/164277#M300311</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;&lt;SPAN lang="EN" style="color: #333333; line-height: 115%; font-family: 'Helvetica','sans-serif'; font-size: 10pt; mso-ansi-language: EN;"&gt;Hello friends, &lt;BR data-jive-statusinputadd="true" data-jive-truncation-flag="true" /&gt;&lt;BR /&gt;I need some help with data management. I have a large dataset of 17375 observations and 3997 variables. I wish to split this data into three sets of 17375 observations and 1333 variables each, while retaining all the observations and the unique identification code for future re-merging.&lt;BR data-jive-statusinputadd="true" data-jive-truncation-flag="true" /&gt;&lt;BR /&gt;I would like help developing the SAS code to do the splitting.&lt;BR data-jive-statusinputadd="true" data-jive-truncation-flag="true" /&gt;&lt;BR /&gt;Thanks in advance; I would appreciate your assistance.&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN lang="EN" style="color: #333333; line-height: 115%; font-family: 'Helvetica','sans-serif'; font-size: 10pt; mso-ansi-language: EN;"&gt;&lt;/SPAN&gt; &lt;/P&gt;&lt;P&gt;&lt;SPAN lang="EN" style="color: #333333; line-height: 115%; font-family: 'Helvetica','sans-serif'; font-size: 10pt; mso-ansi-language: EN;"&gt;Fred&lt;/SPAN&gt;&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Sun, 18 May 2014 17:05:35 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/RE-Splitting-large-data-by-variables/m-p/164277#M300311</guid>
      <dc:creator>thafu</dc:creator>
      <dc:date>2014-05-18T17:05:35Z</dc:date>
    </item>
    <item>
      <title>Re: RE: Splitting large data by variables</title>
      <link>https://communities.sas.com/t5/SAS-Programming/RE-Splitting-large-data-by-variables/m-p/164278#M300312</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;&lt;SPAN style="font-size: 12pt;"&gt;&lt;SPAN lang="EN-CA" style="font-family: Arial, sans-serif;"&gt;3997 variables, that's a lot of variables indeed. But splitting them arbitrarily into three sets might not be the best strategy&lt;/SPAN&gt;. It might be better to organize your data differently and keep them in the same dataset. What are these variables?&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;PG&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Sun, 18 May 2014 18:58:30 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/RE-Splitting-large-data-by-variables/m-p/164278#M300312</guid>
      <dc:creator>PGStats</dc:creator>
      <dc:date>2014-05-18T18:58:30Z</dc:date>
    </item>
    <item>
      <title>Re: RE: Splitting large data by variables</title>
      <link>https://communities.sas.com/t5/SAS-Programming/RE-Splitting-large-data-by-variables/m-p/164279#M300313</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;A rough calculation: &lt;BR /&gt;17K observations is not much; 4K variables is. Most DBMS systems do not support that many columns. 17K * 4K * 8 bytes is about 550 MB, which is still not big. Unless long character variables are part of the dataset, you do not get to the 32-bit / 2 GB limit. As PGStats is asking what these variables are: what is the real reason for wanting to split the data up?&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Sun, 18 May 2014 19:28:16 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/RE-Splitting-large-data-by-variables/m-p/164279#M300313</guid>
      <dc:creator>jakarman</dc:creator>
      <dc:date>2014-05-18T19:28:16Z</dc:date>
    </item>
    <item>
      <title>Re: RE: Splitting large data by variables</title>
      <link>https://communities.sas.com/t5/SAS-Programming/RE-Splitting-large-data-by-variables/m-p/164280#M300314</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;I agree with &lt;A __default_attr="2746" __jive_macro_name="user" class="jive_macro jive_macro_user" href="https://communities.sas.com/"&gt;&lt;/A&gt;; having that many variables is inconvenient in many ways. Imagine writing programs that address all the variables by name. The only use case I've seen is data mining that needs the data stacked in variables/columns.&lt;/P&gt;&lt;P&gt;So without knowing your requirements, my guess is that you are better off transposing your data in some way.&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Mon, 19 May 2014 07:34:02 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/RE-Splitting-large-data-by-variables/m-p/164280#M300314</guid>
      <dc:creator>LinusH</dc:creator>
      <dc:date>2014-05-19T07:34:02Z</dc:date>
    </item>
    <item>
      <title>Re: RE: Splitting large data by variables</title>
      <link>https://communities.sas.com/t5/SAS-Programming/RE-Splitting-large-data-by-variables/m-p/164281#M300315</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Completely agree with all previous posts, just wanted to add that you could reduce large datasets into smaller ones using RDBMS theory.&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Mon, 19 May 2014 08:24:41 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/RE-Splitting-large-data-by-variables/m-p/164281#M300315</guid>
      <dc:creator>RW9</dc:creator>
      <dc:date>2014-05-19T08:24:41Z</dc:date>
    </item>
    <item>
      <title>Re: RE: Splitting large data by variables</title>
      <link>https://communities.sas.com/t5/SAS-Programming/RE-Splitting-large-data-by-variables/m-p/164282#M300316</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;DIV style="font-family: Courier New; font-size: 11pt;"&gt;&lt;SPAN style="color: #0000ff; background-color: #ffffff;"&gt;/* The input data set and key variable(s) to include in all data sets */&lt;BR /&gt;&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV style="font-family: Courier New; font-size: 11pt;"&gt;&lt;SPAN style="color: #0000ff; background-color: #ffffff;"&gt;%let&lt;/SPAN&gt;&lt;SPAN style="color: #000000; background-color: #ffffff;"&gt; data=sashelp.heart;&lt;BR /&gt;&lt;/SPAN&gt;&lt;SPAN style="color: #0000ff; background-color: #ffffff;"&gt;%let&lt;/SPAN&gt;&lt;SPAN style="color: #000000; background-color: #ffffff;"&gt; keys=ageatstart;&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV style="font-family: Courier New; font-size: 11pt;"&gt;&lt;SPAN style="color: #000000; background-color: #ffffff;"&gt;/* A list of variable names withOUT the KEYS*/&lt;BR /&gt;&lt;/SPAN&gt;&lt;SPAN style="color: #000080; background-color: #ffffff;"&gt;&lt;STRONG&gt;proc&lt;/STRONG&gt;&lt;/SPAN&gt; &lt;SPAN style="color: #000080; background-color: #ffffff;"&gt;&lt;STRONG&gt;contents&lt;/STRONG&gt;&lt;/SPAN&gt; &lt;SPAN style="color: #0000ff; background-color: #ffffff;"&gt;noprint&lt;/SPAN&gt; &lt;SPAN style="color: #0000ff; background-color: #ffffff;"&gt;out&lt;/SPAN&gt;&lt;SPAN style="color: #000000; background-color: #ffffff;"&gt;=sansid(&lt;/SPAN&gt;&lt;SPAN style="color: #0000ff; background-color: #ffffff;"&gt;keep&lt;/SPAN&gt;&lt;SPAN style="color: #000000; background-color: #ffffff;"&gt;=name &lt;/SPAN&gt;&lt;SPAN style="color: #0000ff; background-color: #ffffff;"&gt;varnum&lt;/SPAN&gt;&lt;SPAN style="color: #000000; background-color: #ffffff;"&gt;) &lt;/SPAN&gt;&lt;SPAN style="color: #0000ff; background-color: #ffffff;"&gt;data&lt;/SPAN&gt;&lt;SPAN style="color: #000000; background-color: #ffffff;"&gt;=&amp;amp;data(&lt;/SPAN&gt;&lt;SPAN style="color: #0000ff; background-color: #ffffff;"&gt;drop&lt;/SPAN&gt;&lt;SPAN style="color: 
#000000; background-color: #ffffff;"&gt;=&amp;amp;keys);&lt;BR /&gt;&lt;/SPAN&gt;&lt;SPAN style="color: #000000; background-color: #ffffff;"&gt;&amp;nbsp;&amp;nbsp; &lt;/SPAN&gt;&lt;SPAN style="color: #000080; background-color: #ffffff;"&gt;&lt;STRONG&gt;run&lt;/STRONG&gt;&lt;/SPAN&gt;;&lt;/DIV&gt;&lt;DIV style="font-family: Courier New; font-size: 11pt;"&gt;/* put them in varnum order */&lt;SPAN style="color: #000000; background-color: #ffffff;"&gt;&lt;BR /&gt;&lt;/SPAN&gt;&lt;SPAN style="color: #000080; background-color: #ffffff;"&gt;&lt;STRONG&gt;proc&lt;/STRONG&gt;&lt;/SPAN&gt; &lt;SPAN style="color: #000080; background-color: #ffffff;"&gt;&lt;STRONG&gt;sort&lt;/STRONG&gt;&lt;/SPAN&gt; &lt;SPAN style="color: #0000ff; background-color: #ffffff;"&gt;data&lt;/SPAN&gt;&lt;SPAN style="color: #000000; background-color: #ffffff;"&gt;=sansid;&lt;BR /&gt;&lt;/SPAN&gt;&lt;SPAN style="color: #000000; background-color: #ffffff;"&gt;&amp;nbsp;&amp;nbsp; &lt;/SPAN&gt;&lt;SPAN style="color: #0000ff; background-color: #ffffff;"&gt;by&lt;/SPAN&gt;&lt;SPAN style="color: #000000; background-color: #ffffff;"&gt; varnum;&lt;BR /&gt;&lt;/SPAN&gt;&lt;SPAN style="color: #000000; background-color: #ffffff;"&gt;&amp;nbsp;&amp;nbsp; &lt;/SPAN&gt;&lt;SPAN style="color: #000080; background-color: #ffffff;"&gt;&lt;STRONG&gt;run&lt;/STRONG&gt;&lt;/SPAN&gt;;&lt;/DIV&gt;&lt;DIV style="font-family: Courier New; font-size: 11pt;"&gt;/* create 3 approximately equal groups */&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;SPAN style="color: #000000; background-color: #ffffff;"&gt;&lt;BR /&gt;&lt;/SPAN&gt;&lt;SPAN style="color: #000080; background-color: #ffffff;"&gt;&lt;STRONG&gt;proc&lt;/STRONG&gt;&lt;/SPAN&gt; &lt;SPAN style="color: #000080; background-color: #ffffff;"&gt;&lt;STRONG&gt;rank&lt;/STRONG&gt;&lt;/SPAN&gt; &lt;SPAN style="color: #0000ff; background-color: #ffffff;"&gt;out&lt;/SPAN&gt;&lt;SPAN style="color: #000000; background-color: #ffffff;"&gt;=sansid &lt;/SPAN&gt;&lt;SPAN 
style="color: #0000ff; background-color: #ffffff;"&gt;groups&lt;/SPAN&gt;&lt;SPAN style="color: #000000; background-color: #ffffff;"&gt;=&lt;/SPAN&gt;&lt;SPAN style="color: #008080; background-color: #ffffff;"&gt;&lt;STRONG&gt;3&lt;/STRONG&gt;&lt;/SPAN&gt;; &lt;SPAN style="color: #000000; background-color: #ffffff;"&gt;&lt;BR /&gt;&amp;nbsp;&amp;nbsp; &lt;/SPAN&gt;&lt;SPAN style="color: #0000ff; background-color: #ffffff;"&gt;var&lt;/SPAN&gt;&lt;SPAN style="color: #000000; background-color: #ffffff;"&gt; varnum;&lt;BR /&gt;&lt;/SPAN&gt;&lt;SPAN style="color: #000000; background-color: #ffffff;"&gt;&amp;nbsp;&amp;nbsp; &lt;/SPAN&gt;&lt;SPAN style="color: #0000ff; background-color: #ffffff;"&gt;ranks&lt;/SPAN&gt;&lt;SPAN style="color: #000000; background-color: #ffffff;"&gt; group;&lt;BR /&gt;&lt;/SPAN&gt;&lt;SPAN style="color: #000000; background-color: #ffffff;"&gt;&amp;nbsp;&amp;nbsp; &lt;/SPAN&gt;&lt;SPAN style="color: #000080; background-color: #ffffff;"&gt;&lt;STRONG&gt;run&lt;/STRONG&gt;&lt;/SPAN&gt;;&lt;/DIV&gt;&lt;DIV style="font-family: Courier New; font-size: 11pt;"&gt;/* Generate data set name with keep= data set option with a */&lt;/DIV&gt;&lt;DIV style="font-family: Courier New; font-size: 11pt;"&gt;/* name range variable list from the first and last name in each GROUP*/&lt;/DIV&gt;&lt;DIV style="font-family: Courier New; font-size: 11pt;"&gt;/* write the generated code to a file*/&lt;SPAN style="color: #000000; background-color: #ffffff;"&gt;&lt;BR /&gt;&lt;/SPAN&gt;&lt;SPAN style="color: #0000ff; background-color: #ffffff;"&gt;filename&lt;/SPAN&gt;&lt;SPAN style="color: #000000; background-color: #ffffff;"&gt; codegen &lt;/SPAN&gt;&lt;SPAN style="color: #0000ff; background-color: #ffffff;"&gt;temp&lt;/SPAN&gt;; &lt;SPAN style="color: #000000; background-color: #ffffff;"&gt;&lt;BR /&gt;&lt;/SPAN&gt;&lt;SPAN style="color: #000080; background-color: #ffffff;"&gt;&lt;STRONG&gt;data&lt;/STRONG&gt;&lt;/SPAN&gt; &lt;SPAN style="color: #0000ff; 
background-color: #ffffff;"&gt;_null_&lt;/SPAN&gt;; &lt;SPAN style="color: #000000; background-color: #ffffff;"&gt;&lt;BR /&gt;&amp;nbsp;&amp;nbsp; &lt;/SPAN&gt;&lt;SPAN style="color: #0000ff; background-color: #ffffff;"&gt;file&lt;/SPAN&gt;&lt;SPAN style="color: #000000; background-color: #ffffff;"&gt; codegen;&lt;BR /&gt;&lt;/SPAN&gt;&lt;SPAN style="color: #000000; background-color: #ffffff;"&gt;&amp;nbsp;&amp;nbsp; &lt;/SPAN&gt;&lt;SPAN style="color: #0000ff; background-color: #ffffff;"&gt;set&lt;/SPAN&gt;&lt;SPAN style="color: #000000; background-color: #ffffff;"&gt; sansid;&lt;BR /&gt;&lt;/SPAN&gt;&lt;SPAN style="color: #000000; background-color: #ffffff;"&gt;&amp;nbsp;&amp;nbsp; &lt;/SPAN&gt;&lt;SPAN style="color: #0000ff; background-color: #ffffff;"&gt;by&lt;/SPAN&gt;&lt;SPAN style="color: #000000; background-color: #ffffff;"&gt; group;&lt;BR /&gt;&lt;/SPAN&gt;&lt;SPAN style="color: #000000; background-color: #ffffff;"&gt;&amp;nbsp;&amp;nbsp; &lt;/SPAN&gt;&lt;SPAN style="color: #0000ff; background-color: #ffffff;"&gt;if&lt;/SPAN&gt;&lt;SPAN style="color: #000000; background-color: #ffffff;"&gt; first.group &lt;/SPAN&gt;&lt;SPAN style="color: #0000ff; background-color: #ffffff;"&gt;then&lt;/SPAN&gt; &lt;SPAN style="color: #0000ff; background-color: #ffffff;"&gt;put&lt;/SPAN&gt;&lt;SPAN style="color: #000000; background-color: #ffffff;"&gt; +&lt;/SPAN&gt;&lt;SPAN style="color: #008080; background-color: #ffffff;"&gt;&lt;STRONG&gt;3&lt;/STRONG&gt;&lt;/SPAN&gt; &lt;SPAN style="color: #800080; background-color: #ffffff;"&gt;'vgroup'&lt;/SPAN&gt;&lt;SPAN style="color: #000000; background-color: #ffffff;"&gt; group &lt;/SPAN&gt;&lt;SPAN style="color: #800080; background-color: #ffffff;"&gt;'(keep='&lt;/SPAN&gt; &lt;SPAN style="color: #800080; background-color: #ffffff;"&gt;"&amp;amp;keys"&lt;/SPAN&gt;&lt;SPAN style="color: #000000; background-color: #ffffff;"&gt; +&lt;/SPAN&gt;&lt;SPAN style="color: #008080; background-color: 
#ffffff;"&gt;&lt;STRONG&gt;1&lt;/STRONG&gt;&lt;/SPAN&gt;&lt;SPAN style="color: #000000; background-color: #ffffff;"&gt; name &lt;/SPAN&gt;&lt;SPAN style="color: #800080; background-color: #ffffff;"&gt;'--'&lt;/SPAN&gt;&lt;SPAN style="color: #000000; background-color: #ffffff;"&gt; @@;&lt;BR /&gt;&lt;/SPAN&gt;&lt;SPAN style="color: #000000; background-color: #ffffff;"&gt;&amp;nbsp;&amp;nbsp; &lt;/SPAN&gt;&lt;SPAN style="color: #0000ff; background-color: #ffffff;"&gt;if&lt;/SPAN&gt;&lt;SPAN style="color: #000000; background-color: #ffffff;"&gt; last.group &lt;/SPAN&gt;&lt;SPAN style="color: #0000ff; background-color: #ffffff;"&gt;then&lt;/SPAN&gt; &lt;SPAN style="color: #0000ff; background-color: #ffffff;"&gt;put&lt;/SPAN&gt;&lt;SPAN style="color: #000000; background-color: #ffffff;"&gt; +&lt;/SPAN&gt;&lt;SPAN style="color: #008080; background-color: #ffffff;"&gt;&lt;STRONG&gt;1&lt;/STRONG&gt;&lt;/SPAN&gt;&lt;SPAN style="color: #000000; background-color: #ffffff;"&gt; name &lt;/SPAN&gt;&lt;SPAN style="color: #800080; background-color: #ffffff;"&gt;')'&lt;/SPAN&gt;; &lt;SPAN style="color: #000000; background-color: #ffffff;"&gt;&lt;BR /&gt;&amp;nbsp;&amp;nbsp; &lt;/SPAN&gt;&lt;SPAN style="color: #000080; background-color: #ffffff;"&gt;&lt;STRONG&gt;run&lt;/STRONG&gt;&lt;/SPAN&gt;;&lt;/DIV&gt;&lt;DIV style="font-family: Courier New; font-size: 11pt;"&gt;&lt;SPAN style="color: #000000; background-color: #ffffff;"&gt;/* create the new data sets*/&lt;BR /&gt;&lt;/SPAN&gt;&lt;SPAN style="color: #000080; background-color: #ffffff;"&gt;&lt;STRONG&gt;data&lt;/STRONG&gt;&lt;/SPAN&gt;&lt;SPAN style="color: #000000; background-color: #ffffff;"&gt; &lt;BR /&gt;&amp;nbsp;&amp;nbsp; &lt;/SPAN&gt;&lt;SPAN style="color: #0000ff; background-color: #ffffff;"&gt;%inc&lt;/SPAN&gt;&lt;SPAN style="color: #000000; background-color: #ffffff;"&gt; codegen / source2;&lt;BR /&gt;&lt;/SPAN&gt;&lt;SPAN style="color: #000000; background-color: #ffffff;"&gt;&amp;nbsp;&amp;nbsp; 
;&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV style="font-family: Courier New; font-size: 11pt;"&gt;&lt;SPAN style="color: #000000; background-color: #ffffff;"&gt;&amp;nbsp;&amp;nbsp; /*This merge is important to ensure the keys are on the left and not in the name ranges*/&lt;/SPAN&gt;&lt;SPAN style="background-color: #ffffff; color: #000000; font-size: 11pt; line-height: 1.5em;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV style="font-family: Courier New; font-size: 11pt;"&gt;&lt;SPAN style="color: #000000; background-color: #ffffff;"&gt;&amp;nbsp;&amp;nbsp; &lt;/SPAN&gt;&lt;SPAN style="color: #0000ff; background-color: #ffffff;"&gt;merge&lt;/SPAN&gt;&lt;SPAN style="color: #000000; background-color: #ffffff;"&gt; &amp;amp;data(&lt;/SPAN&gt;&lt;SPAN style="color: #0000ff; background-color: #ffffff;"&gt;keep&lt;/SPAN&gt;&lt;SPAN style="color: #000000; background-color: #ffffff;"&gt;=&amp;amp;keys) &amp;amp;data(&lt;/SPAN&gt;&lt;SPAN style="color: #0000ff; background-color: #ffffff;"&gt;drop&lt;/SPAN&gt;&lt;SPAN style="color: #000000; background-color: #ffffff;"&gt;=&amp;amp;keys);&lt;BR /&gt;&lt;/SPAN&gt;&lt;SPAN style="color: #000000; background-color: #ffffff;"&gt;&amp;nbsp;&amp;nbsp; &lt;/SPAN&gt;&lt;SPAN style="color: #000080; background-color: #ffffff;"&gt;&lt;STRONG&gt;run&lt;/STRONG&gt;&lt;/SPAN&gt;&lt;SPAN style="color: #000000; background-color: #ffffff;"&gt;;&lt;/SPAN&gt;&lt;/DIV&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Message was edited by: data _null_&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Mon, 19 May 2014 10:55:51 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/RE-Splitting-large-data-by-variables/m-p/164282#M300316</guid>
      <dc:creator>data_null__</dc:creator>
      <dc:date>2014-05-19T10:55:51Z</dc:date>
    </item>
    <item>
      <title>Re: RE: Splitting large data by variables</title>
      <link>https://communities.sas.com/t5/SAS-Programming/RE-Splitting-large-data-by-variables/m-p/164283#M300317</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Thank you all for your generosity.&lt;/P&gt;&lt;P&gt;To respond to your questions: this is government household surveillance data. The study objective is to "investigate the socio-economic determinants of health inequality and inequity". The first step was to arbitrarily break this large dataset into smaller ones, then assess the important variables for use. The selected variables could then be consolidated into a manageable few via the principle component analysis. Finally, I am to merge the consolidated dataset and perform the core analysis of the study.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;I also wished to split this dataset to enable alternative statistical manipulation in STATA 13, in which I have more competence, but which is limited in the amount of data it can handle.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Warm regards&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Mon, 19 May 2014 11:14:39 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/RE-Splitting-large-data-by-variables/m-p/164283#M300317</guid>
      <dc:creator>thafu</dc:creator>
      <dc:date>2014-05-19T11:14:39Z</dc:date>
    </item>
    <item>
      <title>Re: RE: Splitting large data by variables</title>
      <link>https://communities.sas.com/t5/SAS-Programming/RE-Splitting-large-data-by-variables/m-p/164284#M300318</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;&lt;BR /&gt;Hello data_null_,&lt;/P&gt;&lt;P&gt;Thanks for the code. For clarity, I have one request: could you please add brief descriptions to your code so that I can follow it?&lt;/P&gt;&lt;P&gt;Regards&lt;/P&gt;&lt;P&gt;Fred&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Mon, 19 May 2014 11:33:34 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/RE-Splitting-large-data-by-variables/m-p/164284#M300318</guid>
      <dc:creator>thafu</dc:creator>
      <dc:date>2014-05-19T11:33:34Z</dc:date>
    </item>
    <item>
      <title>Re: RE: Splitting large data by variables</title>
      <link>https://communities.sas.com/t5/SAS-Programming/RE-Splitting-large-data-by-variables/m-p/164285#M300319</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;thafu, you say it is government household surveillance data. Your first job will be understanding the data.&lt;BR /&gt;I assume the records are organized by household. The large number of variables could be caused by repeated measurements over time. &lt;BR /&gt;Those could be evaluated with a time-series analysis, possibly yielding one predictor. &lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Once your data are cleaned and organized that way, you can take the next step: hypothesis testing, or the predictive analytics common in data mining.&lt;/P&gt;&lt;P&gt;How you handle your data may differ between those two. &lt;BR /&gt;The data mining approach needs one or several target values on which you are going to train and validate. That requires a separation of your data.&lt;/P&gt;&lt;P&gt;I am missing that in your question.&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Mon, 19 May 2014 11:50:09 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/RE-Splitting-large-data-by-variables/m-p/164285#M300319</guid>
      <dc:creator>jakarman</dc:creator>
      <dc:date>2014-05-19T11:50:09Z</dc:date>
    </item>
    <item>
      <title>Re: RE: Splitting large data by variables</title>
      <link>https://communities.sas.com/t5/SAS-Programming/RE-Splitting-large-data-by-variables/m-p/164286#M300320</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;PRE __jive_macro_name="quote" class="jive_text_macro jive_macro_quote"&gt;
&lt;P&gt;The first step was to arbitrarily break this large dataset into smaller ones, then assess the important variables for use. The selected variables could then be consolidated into a manageable few via the principle component analysis. Finally, I am to merge the consolidated dataset and perform the core analysis of the study.&lt;/P&gt;
&lt;/PRE&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;In my opinion, if the goal is to find "important variables" by some statistical method like principal (not principle) components analysis, then you don't want to split the data at all, you want to run the analysis on the ENTIRE data set. I realize that this may cause problems if your computer doesn't have enough memory, but there are algorithms that would allow principal components to extract a few components (instead of all 3997 components) that would be much less likely to cause issues where you run out of memory.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Splitting the dataset into "arbitrary" thirds is simply the wrong way to go here with any statistical procedure. Furthermore, the unspecified "core analysis" of this study could greatly suffer depending on how you select these "important variables", and it WILL greatly suffer if you select these "important variables" from arbitrary thirds of the data.&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Mon, 19 May 2014 13:07:40 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/RE-Splitting-large-data-by-variables/m-p/164286#M300320</guid>
      <dc:creator>PaigeMiller</dc:creator>
      <dc:date>2014-05-19T13:07:40Z</dc:date>
    </item>
    <item>
      <title>Re: RE: Splitting large data by variables</title>
      <link>https://communities.sas.com/t5/SAS-Programming/RE-Splitting-large-data-by-variables/m-p/164287#M300321</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;As a subject matter specialist, you know better which variables should be used for data reduction, and maybe this is the reason for splitting the files, or you want to retain only numeric variables in one of the split files to apply PCA. If you are considering principal components, then you will have to rely on the principal components instead of the original variables for further analyses. If you want to retain the original variables in the analysis, please explore PROC VARCLUS.&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Mon, 19 May 2014 15:24:23 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/RE-Splitting-large-data-by-variables/m-p/164287#M300321</guid>
      <dc:creator>stat_sas</dc:creator>
      <dc:date>2014-05-19T15:24:23Z</dc:date>
    </item>
    <item>
      <title>Re: RE: Splitting large data by variables</title>
      <link>https://communities.sas.com/t5/SAS-Programming/RE-Splitting-large-data-by-variables/m-p/164288#M300322</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;PRE __jive_macro_name="quote" class="jive_text_macro jive_macro_quote"&gt;
&lt;P&gt;stat@sas wrote:&lt;/P&gt;
&lt;P&gt;&lt;/P&gt;
&lt;P&gt;As a subject matter specialist, you know better which variables should be used for data reduction, and maybe this is the reason for splitting the files&lt;/P&gt;
&lt;/PRE&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Thafu admitted that splitting the data into three groups was "arbitrary"; I can't see how this corresponds to using any subject matter expertise.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Furthermore, with 3997 variables, I don't see how anyone can use subject matter expertise to pick out the important ones that will matter in a subsequent "core analysis", but that's just me. The whole thing screams "empirical" to me.&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Mon, 19 May 2014 16:01:48 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/RE-Splitting-large-data-by-variables/m-p/164288#M300322</guid>
      <dc:creator>PaigeMiller</dc:creator>
      <dc:date>2014-05-19T16:01:48Z</dc:date>
    </item>
    <item>
      <title>Re: RE: Splitting large data by variables</title>
      <link>https://communities.sas.com/t5/SAS-Programming/RE-Splitting-large-data-by-variables/m-p/164289#M300323</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;@PaigeMiller - thanks for the clarification. Arbitrary grouping will make the analysis more complicated. How a survey is designed is up to the subject matter specialist. It is normal practice to put introductory questions at the beginning, questions relating to the subject in the middle, and demographic questions at the end of the questionnaire. Questions in the middle section of a survey usually yield numeric variables that may be useful for analysis, while questions at the start and end provide classification variables.&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Mon, 19 May 2014 16:30:10 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/RE-Splitting-large-data-by-variables/m-p/164289#M300323</guid>
      <dc:creator>stat_sas</dc:creator>
      <dc:date>2014-05-19T16:30:10Z</dc:date>
    </item>
    <item>
      <title>Re: RE: Splitting large data by variables</title>
      <link>https://communities.sas.com/t5/SAS-Programming/RE-Splitting-large-data-by-variables/m-p/164290#M300324</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;@ &lt;A _jive_internal="true" class="jiveTT-hover-user jive-username-link" data-avatarid="-1" data-externalid="" data-presence="null" data-userid="813022" data-username="stat%40sas" href="https://communities.sas.com/people/stat@sas" id="jive-81302215898890778905711"&gt;stat@sas&lt;/A&gt;: &lt;A _jive_internal="true" class="jiveTT-hover-user jive-username-link" data-avatarid="2032" data-externalid="" data-presence="null" data-userid="9481" data-username="PaigeMiller" href="https://communities.sas.com/people/PaigeMiller" id="jive-948115898890767590711"&gt;PaigeMiller&lt;/A&gt; is worried about the first steps of the analysis, and I agree with him. The focus &lt;SPAN class="j-status-levels"&gt; &lt;/SPAN&gt;&lt;SPAN class="j-post-author"&gt;&lt;STRONG&gt;&lt;A _jive_internal="true" class="jiveTT-hover-user jive-username-link" data-avatarid="-1" data-externalid="" data-presence="null" data-userid="815681" data-username="thafu" href="https://communities.sas.com/people/thafu" id="jive-81568115898890662359711"&gt;thafu&lt;/A&gt;&lt;/STRONG&gt; &lt;/SPAN&gt; shows is on the coding work, but the reason for that is inexperience with SAS.&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Mon, 19 May 2014 17:05:56 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/RE-Splitting-large-data-by-variables/m-p/164290#M300324</guid>
      <dc:creator>jakarman</dc:creator>
      <dc:date>2014-05-19T17:05:56Z</dc:date>
    </item>
    <item>
      <title>Re: RE: Splitting large data by variables</title>
      <link>https://communities.sas.com/t5/SAS-Programming/RE-Splitting-large-data-by-variables/m-p/164291#M300325</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;OK, I understand the concerns raised.&lt;/P&gt;&lt;P&gt;I am at the initial stage of this project, currently trying to understand the entire dataset before deciding which variables to retain for the subsequent analysis.&lt;/P&gt;&lt;P&gt;Actually, the sheer size of the data, and the fact that it cannot be read in its current form by the STATA version in my possession, are what prompted my request for an arbitrary subdivision.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;I should probably have asked whether there are better ways to handle such enormous data, maybe by automatically grouping related variables, without having to go through the variables visually and then manually coding the selection from the ~4K variables.&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Mon, 19 May 2014 17:42:25 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/RE-Splitting-large-data-by-variables/m-p/164291#M300325</guid>
      <dc:creator>thafu</dc:creator>
      <dc:date>2014-05-19T17:42:25Z</dc:date>
    </item>
    <item>
      <title>Re: RE: Splitting large data by variables</title>
      <link>https://communities.sas.com/t5/SAS-Programming/RE-Splitting-large-data-by-variables/m-p/164292#M300326</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;You keep mentioning a "subsequent analysis" or "core analysis" to be done after you "retain variables" or pick "important variables".&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;I don't really think you can talk about retaining variables or picking important variables in any meaningful way unless we know what the "subsequent analysis" or "core analysis" is, and you haven't told us.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;You can pick x3, x17 and x2309 as being the "important variables" for "Subsequent Analysis-1", but if you end up doing "Subsequent Analysis-2", those variables could be relatively meaningless.&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;So I guess there are two issues: one is the hugeness of the data, and the second is the proper analysis. Maybe one dictates the other, or maybe not; I don't know.&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;Why do you keep mentioning STATA here? Shouldn't we be discussing SAS?&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Mon, 19 May 2014 18:01:12 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/RE-Splitting-large-data-by-variables/m-p/164292#M300326</guid>
      <dc:creator>PaigeMiller</dc:creator>
      <dc:date>2014-05-19T18:01:12Z</dc:date>
    </item>
    <item>
      <title>Re: RE: Splitting large data by variables</title>
      <link>https://communities.sas.com/t5/SAS-Programming/RE-Splitting-large-data-by-variables/m-p/164293#M300327</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;I am not familiar with STATA. A quick search informs me that STATA has a CART module. I suggest you start there. First, remove any transformed variables from your dataset, since CART is insensitive to monotonic transformations (logs, powers, dummy variables, etc). Then, if what's left of your data still doesn't fit in STATA, subsample. That should allow you to see what's meaningful and what's not.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;This being a SAS forum, I should also mention that JMP has a decent CART module as well.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;PG&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Mon, 19 May 2014 19:25:35 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/RE-Splitting-large-data-by-variables/m-p/164293#M300327</guid>
      <dc:creator>PGStats</dc:creator>
      <dc:date>2014-05-19T19:25:35Z</dc:date>
    </item>
    <item>
      <title>Re: RE: Splitting large data by variables</title>
      <link>https://communities.sas.com/t5/SAS-Programming/RE-Splitting-large-data-by-variables/m-p/164294#M300328</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;&lt;A __default_attr="2746" __jive_macro_name="user" class="jive_macro jive_macro_user" href="https://communities.sas.com/"&gt;&lt;/A&gt; makes a good suggestion as well, one that probably works in many analysis situations.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;However, the original question remains so vague and undefined that I can't reconcile all the good suggestions in this thread with it. Specifically, CART requires a dependent variable, while the original problem asked about principal components, which cannot make use of a dependent variable. In fact, if you are going to do a future analysis with a dependent variable, principal components probably isn't a good first step; and if you are going to do an analysis that doesn't have a dependent variable, then CART probably doesn't fit.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;So I'm coming to the conclusion that picking an analysis method at this point makes no sense without much more information from the original poster.&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Tue, 20 May 2014 12:32:58 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/RE-Splitting-large-data-by-variables/m-p/164294#M300328</guid>
      <dc:creator>PaigeMiller</dc:creator>
      <dc:date>2014-05-20T12:32:58Z</dc:date>
    </item>
    <item>
      <title>Re: RE: Splitting large data by variables</title>
      <link>https://communities.sas.com/t5/SAS-Programming/RE-Splitting-large-data-by-variables/m-p/164295#M300329</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;&lt;SPAN class="pseudotab3"&gt;&lt;SPAN lang="EN-GB" style="color: #474848; font-family: 'Calibri','sans-serif'; font-size: 9pt; mso-ansi-language: EN-GB;"&gt;Dear SAS Friends &lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN class="pseudotab3"&gt;&lt;SPAN lang="EN-GB" style="color: #474848; font-family: 'Calibri','sans-serif'; font-size: 9pt; mso-ansi-language: EN-GB;"&gt;Sure, as some of you have mentioned, my statistical protocol is not yet precise. The intention of this project is to produce an accurate report on the health situation of a sub-Saharan African country, focusing on the “socioeconomic determinants of health inequality and inequity”. The work is to be based on the protocol previously applied in two studies: one in Europe (link-&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;STRONG style="text-decoration: underline;"&gt;&lt;SPAN lang="EN-US" style="color: #0070c0; font-family: 'Calibri','sans-serif'; font-size: 9pt; mso-ansi-language: EN-US;"&gt;doi&lt;/SPAN&gt;&lt;/STRONG&gt;&lt;SPAN style="text-decoration: underline;"&gt;&lt;SPAN lang="EN-US" style="color: #0070c0; font-family: 'Calibri','sans-serif'; font-size: 9pt; mso-ansi-language: EN-US;"&gt;: 10.1056/NEJMsa0707519&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;SPAN lang="EN-US" style="color: #474848; font-family: 'Calibri','sans-serif'; font-size: 9pt; mso-ansi-language: EN-US;"&gt;)&lt;/SPAN&gt;&lt;SPAN class="pseudotab3"&gt;&lt;SPAN lang="EN-GB" style="color: #474848; font-family: 'Calibri','sans-serif'; font-size: 9pt; mso-ansi-language: EN-GB;"&gt; and another one in North Africa (link-&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;SPAN style="text-decoration: underline;"&gt;&lt;SPAN lang="EN-GB" style="color: #0070c0; font-family: 'Calibri','sans-serif'; font-size: 9pt; mso-ansi-language: EN-GB;"&gt;doi:10.1186/1475-9276-10-23&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;SPAN class="pseudotab3"&gt;&lt;SPAN lang="EN-GB" style="color: #474848; font-family: 'Calibri','sans-serif'; font-size: 9pt; mso-ansi-language: EN-GB;"&gt;). &lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN class="pseudotab3"&gt;&lt;SPAN lang="EN-GB" style="color: #474848; font-family: 'Calibri','sans-serif'; font-size: 9pt; mso-ansi-language: EN-GB;"&gt;&lt;/SPAN&gt;&lt;/SPAN&gt; &lt;/P&gt;&lt;P&gt;&lt;SPAN class="pseudotab3"&gt;&lt;SPAN lang="EN-GB" style="color: #474848; font-family: 'Calibri','sans-serif'; font-size: 9pt; mso-ansi-language: EN-GB;"&gt;A preliminary look at the data revealed that evaluating all ~4K variables will definitely be a hard job. As noted in my earlier communication, I was most interested in splitting the data arbitrarily into small sets, followed by grouping (where I suggested using principal component analysis).&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN class="pseudotab3"&gt;&lt;SPAN lang="EN-GB" style="color: #474848; font-family: 'Calibri','sans-serif'; font-size: 9pt; mso-ansi-language: EN-GB;"&gt;&lt;/SPAN&gt;&lt;/SPAN&gt; &lt;/P&gt;&lt;P&gt;&lt;SPAN class="pseudotab3"&gt;&lt;SPAN lang="EN-GB" style="color: #474848; font-family: 'Calibri','sans-serif'; font-size: 9pt; mso-ansi-language: EN-GB;"&gt;My goal in sharing was to get hints on simple and efficient ways of summarizing or grouping these variables with limited risk of error. I am following this discussion with keen interest and hope to refine my methodology for the work. Therefore, your suggestions towards this end would be highly appreciated.&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN class="pseudotab3"&gt;&lt;SPAN lang="EN-GB" style="color: #474848; font-family: 'Calibri','sans-serif'; font-size: 9pt; mso-ansi-language: EN-GB;"&gt;Thafu&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Wed, 21 May 2014 17:00:42 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/RE-Splitting-large-data-by-variables/m-p/164295#M300329</guid>
      <dc:creator>thafu</dc:creator>
      <dc:date>2014-05-21T17:00:42Z</dc:date>
    </item>
    <item>
      <title>Re: RE: Splitting large data by variables</title>
      <link>https://communities.sas.com/t5/SAS-Programming/RE-Splitting-large-data-by-variables/m-p/164296#M300330</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;The links to those two studies don't appear to work, so we can't know what is in them.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Also, I don't know why your reply appears in such a small font, but could you please avoid using such a small font in the future? Thank you.&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Wed, 21 May 2014 17:46:16 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/RE-Splitting-large-data-by-variables/m-p/164296#M300330</guid>
      <dc:creator>PaigeMiller</dc:creator>
      <dc:date>2014-05-21T17:46:16Z</dc:date>
    </item>
  </channel>
</rss>

