<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: How to create multiple datasets for each zip code and get text clustering output in SAS Data Science</title>
    <link>https://communities.sas.com/t5/SAS-Data-Science/How-to-create-multiple-datasets-for-each-zip-code-and-get-text/m-p/535221#M9985</link>
    <description>&lt;P&gt;Although I am not familiar&amp;nbsp;with the software you are using, the "SAS Way" to handle this is to use BY-group processing to analyze each ZIP code. Check your documentation to see if it has an example that demonstrates the "BY statement", "BY processing", or group processing."&lt;/P&gt;</description>
    <pubDate>Wed, 13 Feb 2019 15:27:22 GMT</pubDate>
    <dc:creator>Rick_SAS</dc:creator>
    <dc:date>2019-02-13T15:27:22Z</dc:date>
    <item>
      <title>How to create multiple datasets for each zip code and get text clustering output</title>
      <link>https://communities.sas.com/t5/SAS-Data-Science/How-to-create-multiple-datasets-for-each-zip-code-and-get-text/m-p/535197#M9984</link>
      <description>&lt;TABLE style="border-collapse: collapse; width: 48pt;" border="0" width="64" cellspacing="0" cellpadding="0"&gt;
&lt;TBODY&gt;
&lt;TR style="height: 15.0pt;"&gt;
&lt;TD width="64" height="20" style="height: 15.0pt; width: 48pt;"&gt;Hello Everyone..&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR style="height: 15.0pt;"&gt;
&lt;TD height="20" style="height: 15.0pt;"&gt;&amp;nbsp;&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR style="height: 15.0pt;"&gt;
&lt;TD height="20" style="height: 15.0pt;"&gt;&amp;nbsp;&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR style="height: 15.0pt;"&gt;
&lt;TD height="20" style="height: 15.0pt;"&gt;&amp;nbsp;&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR style="height: 15.0pt;"&gt;
&lt;TD height="20" style="height: 15.0pt;"&gt;I am using SAS Eminer 12.1 and using text parsing, filtering and finally getting results of text clustering. I am happy with unstructued data of Address, nicely classified in appropriate clusters.&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR style="height: 15.0pt;"&gt;
&lt;TD height="20" style="height: 15.0pt;"&gt;&amp;nbsp;&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR style="height: 15.0pt;"&gt;
&lt;TD height="20" style="height: 15.0pt;"&gt;Now I want this intelligent clustering within each zip code. Typically in 1 city we have more than 50 zip codes (and in addition multiple cities within country!)&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR style="height: 15.0pt;"&gt;
&lt;TD height="20" style="height: 15.0pt;"&gt;&amp;nbsp;&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR style="height: 15.0pt;"&gt;
&lt;TD height="20" style="height: 15.0pt;"&gt;&amp;nbsp;&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR style="height: 15.0pt;"&gt;
&lt;TD height="20" style="height: 15.0pt;"&gt;&amp;nbsp;&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR style="height: 15.0pt;"&gt;
&lt;TD height="20" style="height: 15.0pt;"&gt;The sample data lines are (For Example)&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR style="height: 15.0pt;"&gt;
&lt;TD height="20" style="height: 15.0pt;"&gt;&amp;nbsp;&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR style="height: 15.0pt;"&gt;
&lt;TD height="20" style="height: 15.0pt;"&gt;&amp;nbsp;&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR style="height: 15.0pt;"&gt;
&lt;TD height="20" style="height: 15.0pt;"&gt;&amp;nbsp;&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR style="height: 15.0pt;"&gt;
&lt;TD height="20" style="height: 15.0pt;"&gt;Address&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; Customer_id&amp;nbsp; &amp;nbsp; &amp;nbsp; Zipcode&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR style="height: 15.0pt;"&gt;
&lt;TD height="20" style="height: 15.0pt;"&gt;&amp;nbsp;&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR style="height: 15.0pt;"&gt;
&lt;TD height="20" style="height: 15.0pt;"&gt;infotech park, andheri&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; 1&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; 400701&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR style="height: 15.0pt;"&gt;
&lt;TD height="20" style="height: 15.0pt;"&gt;&amp;nbsp;&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR style="height: 15.0pt;"&gt;
&lt;TD height="20" style="height: 15.0pt;"&gt;nearby nandkamal garden, juhu&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; 2&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;400701&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR style="height: 15.0pt;"&gt;
&lt;TD height="20" style="height: 15.0pt;"&gt;&amp;nbsp;&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR style="height: 15.0pt;"&gt;
&lt;TD height="20" style="height: 15.0pt;"&gt;people colony, sion&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;500&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; 400701&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR style="height: 15.0pt;"&gt;
&lt;TD height="20" style="height: 15.0pt;"&gt;&amp;nbsp;&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR style="height: 15.0pt;"&gt;
&lt;TD height="20" style="height: 15.0pt;"&gt;industrial tower, turbhe&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;501&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; 400702&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR style="height: 15.0pt;"&gt;
&lt;TD height="20" style="height: 15.0pt;"&gt;&amp;nbsp;&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR style="height: 15.0pt;"&gt;
&lt;TD height="20" style="height: 15.0pt;"&gt;government quarters, belapur&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; 502&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; 400702&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR style="height: 15.0pt;"&gt;
&lt;TD height="20" style="height: 15.0pt;"&gt;&amp;nbsp;&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR style="height: 15.0pt;"&gt;
&lt;TD height="20" style="height: 15.0pt;"&gt;saint international school, vashi&amp;nbsp; &amp;nbsp; &amp;nbsp; 1000&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;400702&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR style="height: 15.0pt;"&gt;
&lt;TD height="20" style="height: 15.0pt;"&gt;&amp;nbsp;&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR style="height: 15.0pt;"&gt;
&lt;TD height="20" style="height: 15.0pt;"&gt;and so on&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR style="height: 15.0pt;"&gt;
&lt;TD height="20" style="height: 15.0pt;"&gt;&amp;nbsp;&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR style="height: 15.0pt;"&gt;
&lt;TD height="20" style="height: 15.0pt;"&gt;50 zip codes (having more than 500 customers within each zipcode)&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR style="height: 15.0pt;"&gt;
&lt;TD height="20" style="height: 15.0pt;"&gt;&amp;nbsp;&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR style="height: 15.0pt;"&gt;
&lt;TD height="20" style="height: 15.0pt;"&gt;&amp;nbsp;&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR style="height: 15.0pt;"&gt;
&lt;TD height="20" style="height: 15.0pt;"&gt;&amp;nbsp;&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR style="height: 15.0pt;"&gt;
&lt;TD height="20" style="height: 15.0pt;"&gt;So is it possible to create a loop where SAS will create multiple dataset based on say 50 zipcodes, 50 data sets will get created or when&amp;nbsp; text clustering happens, it will create clustering grouped by zip code. So for each zip code we will have typically 20-25 clusters which we get and in final output we will have 20*50 ie. 1000 clusters, grouped by zip code&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR style="height: 15.0pt;"&gt;
&lt;TD height="20" style="height: 15.0pt;"&gt;&amp;nbsp;&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR style="height: 15.0pt;"&gt;
&lt;TD height="20" style="height: 15.0pt;"&gt;&amp;nbsp;&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR style="height: 15.0pt;"&gt;
&lt;TD height="20" style="height: 15.0pt;"&gt;&amp;nbsp;&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR style="height: 15.0pt;"&gt;
&lt;TD height="20" style="height: 15.0pt;"&gt;&amp;nbsp;&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR style="height: 15.0pt;"&gt;
&lt;TD height="20" style="height: 15.0pt;"&gt;&amp;nbsp;&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR style="height: 15.0pt;"&gt;
&lt;TD height="20" style="height: 15.0pt;"&gt;This will be really helpful. Hope I am able to explain&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR style="height: 15.0pt;"&gt;
&lt;TD height="20" style="height: 15.0pt;"&gt;&amp;nbsp;&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR style="height: 15.0pt;"&gt;
&lt;TD height="20" style="height: 15.0pt;"&gt;&amp;nbsp;&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR style="height: 15.0pt;"&gt;
&lt;TD height="20" style="height: 15.0pt;"&gt;&amp;nbsp;&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR style="height: 15.0pt;"&gt;
&lt;TD height="20" style="height: 15.0pt;"&gt;Regards&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR style="height: 15.0pt;"&gt;
&lt;TD height="20" style="height: 15.0pt;"&gt;&amp;nbsp;&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR style="height: 15.0pt;"&gt;
&lt;TD height="20" style="height: 15.0pt;"&gt;Vishal Kapasi&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR style="height: 15.0pt;"&gt;
&lt;TD height="20" style="height: 15.0pt;"&gt;&amp;nbsp;&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR style="height: 15.0pt;"&gt;
&lt;TD height="20" style="height: 15.0pt;"&gt;&amp;nbsp;&lt;/TD&gt;
&lt;/TR&gt;
&lt;/TBODY&gt;
&lt;/TABLE&gt;</description>
      <pubDate>Wed, 13 Feb 2019 14:35:06 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Science/How-to-create-multiple-datasets-for-each-zip-code-and-get-text/m-p/535197#M9984</guid>
      <dc:creator>VISHALKAPASI</dc:creator>
      <dc:date>2019-02-13T14:35:06Z</dc:date>
    </item>
    <item>
      <title>Re: How to create multiple datasets for each zip code and get text clustering output</title>
      <link>https://communities.sas.com/t5/SAS-Data-Science/How-to-create-multiple-datasets-for-each-zip-code-and-get-text/m-p/535221#M9985</link>
      <description>&lt;P&gt;Although I am not familiar&amp;nbsp;with the software you are using, the "SAS Way" to handle this is to use BY-group processing to analyze each ZIP code. Check your documentation to see if it has an example that demonstrates the "BY statement", "BY processing", or group processing."&lt;/P&gt;</description>
      <pubDate>Wed, 13 Feb 2019 15:27:22 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Science/How-to-create-multiple-datasets-for-each-zip-code-and-get-text/m-p/535221#M9985</guid>
      <dc:creator>Rick_SAS</dc:creator>
      <dc:date>2019-02-13T15:27:22Z</dc:date>
    </item>
  </channel>
</rss>

