<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Duplicate in SAS dataset. how about normatilization? in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/Duplicate-in-SAS-dataset-how-about-normatilization/m-p/654695#M196600</link>
    <description>I'll just add on to this: DI Studio is designed to support pipelines and normalization, and it handles more of the data management side. Base SAS, SAS Studio, and SAS Foundation are not the same kind of tool, but they can do it regardless.&lt;BR /&gt;&lt;BR /&gt;It's more of a build-your-own approach, though, compared to using an out-of-the-box tool.</description>
    <pubDate>Mon, 08 Jun 2020 22:50:46 GMT</pubDate>
    <dc:creator>Reeza</dc:creator>
    <dc:date>2020-06-08T22:50:46Z</dc:date>
    <item>
      <title>Duplicate in SAS dataset. how about normatilization?</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Duplicate-in-SAS-dataset-how-about-normatilization/m-p/654659#M196584</link>
      <description>&lt;P&gt;Hi to all the members of the community, and I hope you all have a great week. I have a question: why is there so much duplication in SAS datasets (Excel files and so on)? Why don't we apply normalization rules before we begin gathering the data?&lt;/P&gt;
&lt;P&gt;Thank you very much.&lt;/P&gt;</description>
      <pubDate>Mon, 08 Jun 2020 19:36:47 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Duplicate-in-SAS-dataset-how-about-normatilization/m-p/654659#M196584</guid>
      <dc:creator>seamoh</dc:creator>
      <dc:date>2020-06-08T19:36:47Z</dc:date>
    </item>
    <item>
      <title>Re: Duplicate in SAS dataset. how about normatilization?</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Duplicate-in-SAS-dataset-how-about-normatilization/m-p/654660#M196585</link>
      <description>&lt;P&gt;Not sure what you mean by duplicates, but regarding the lack of 'normalization': it's typically because SAS was originally the Statistical Analysis System, which takes a different view of data, and normalized data usually isn't good for modelling.&lt;/P&gt;
&lt;P&gt;Edited.&lt;/P&gt;</description>
      <pubDate>Mon, 08 Jun 2020 19:40:58 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Duplicate-in-SAS-dataset-how-about-normatilization/m-p/654660#M196585</guid>
      <dc:creator>Reeza</dc:creator>
      <dc:date>2020-06-08T19:40:58Z</dc:date>
    </item>
    <item>
      <title>Re: Duplicate in SAS dataset. how about normatilization?</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Duplicate-in-SAS-dataset-how-about-normatilization/m-p/654662#M196587</link>
      <description>&lt;P&gt;Base SAS is not a database in the strictest sense, although it comes close to replicating one. Hence the same rules, such as integrity checks on columns (a column being NOT NULL, for example), cannot be applied in the same way. This is in addition to Reeza's answer that SAS is primarily for modeling data and has a different take on what counts as a repeated record.&lt;/P&gt;</description>
      <pubDate>Mon, 08 Jun 2020 19:52:08 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Duplicate-in-SAS-dataset-how-about-normatilization/m-p/654662#M196587</guid>
      <dc:creator>smantha</dc:creator>
      <dc:date>2020-06-08T19:52:08Z</dc:date>
    </item>
    <item>
      <title>Re: Duplicate in SAS dataset. how about normatilization?</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Duplicate-in-SAS-dataset-how-about-normatilization/m-p/654663#M196588</link>
      <description>&lt;P&gt;Normalisation is a concept commonly applied to relational database design to enforce data relationships, and optimise disk storage and database transactional performance. It doesn't have the same relevance for analytical data.&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Mon, 08 Jun 2020 19:53:11 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Duplicate-in-SAS-dataset-how-about-normatilization/m-p/654663#M196588</guid>
      <dc:creator>SASKiwi</dc:creator>
      <dc:date>2020-06-08T19:53:11Z</dc:date>
    </item>
    <item>
      <title>Re: Duplicate in SAS dataset. how about normatilization?</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Duplicate-in-SAS-dataset-how-about-normatilization/m-p/654695#M196600</link>
      <description>I'll just add on to this: DI Studio is designed to support pipelines and normalization, and it handles more of the data management side. Base SAS, SAS Studio, and SAS Foundation are not the same kind of tool, but they can do it regardless.&lt;BR /&gt;&lt;BR /&gt;It's more of a build-your-own approach, though, compared to using an out-of-the-box tool.</description>
      <pubDate>Mon, 08 Jun 2020 22:50:46 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Duplicate-in-SAS-dataset-how-about-normatilization/m-p/654695#M196600</guid>
      <dc:creator>Reeza</dc:creator>
      <dc:date>2020-06-08T22:50:46Z</dc:date>
    </item>
    <item>
      <title>Re: Duplicate in SAS dataset. how about normatilization?</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Duplicate-in-SAS-dataset-how-about-normatilization/m-p/655680#M196725</link>
      <description>&lt;P&gt;I have been working with SAS for one month; you and the other members definitely have more experience than I do. I was practicing with SAS Essential report 1, and in one exercise there were unnecessary duplicates in the report. Of course, there is plenty of syntax that can remove duplicates from a report,&lt;/P&gt;
&lt;P&gt;but it left me with these questions:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;1. Is it better to design good, normalized datasets from scratch before gathering the data?&lt;/P&gt;
&lt;P&gt;2. Do data mining and feature engineering play a role in data analysis in SAS, or in Python or R?&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I hope I have explained what is on my mind, so that you, as experts, can clarify further.&lt;/P&gt;
&lt;P&gt;Thank you very much,&lt;/P&gt;
&lt;P&gt;I am looking forward to your answers.&lt;/P&gt;
&lt;P&gt;Deeply appreciated&lt;/P&gt;</description>
      <pubDate>Wed, 10 Jun 2020 01:08:01 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Duplicate-in-SAS-dataset-how-about-normatilization/m-p/655680#M196725</guid>
      <dc:creator>seamoh</dc:creator>
      <dc:date>2020-06-10T01:08:01Z</dc:date>
    </item>
    <item>
      <title>Re: Duplicate in SAS dataset. how about normatilization?</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Duplicate-in-SAS-dataset-how-about-normatilization/m-p/655795#M196742</link>
      <description>&lt;P&gt;You will notice that the vast majority of SAS analytical procedures expect a single dataset as input. A strategy I have often used is to keep the data in (quasi-)normalized tables and to define SQL views that expand the data on the fly for analysis. For nontrivial datasets, this is the only robust way to ensure data integrity.&lt;/P&gt;</description>
      <pubDate>Wed, 10 Jun 2020 03:58:12 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Duplicate-in-SAS-dataset-how-about-normatilization/m-p/655795#M196742</guid>
      <dc:creator>PGStats</dc:creator>
      <dc:date>2020-06-10T03:58:12Z</dc:date>
    </item>
    <item>
      <title>Re: Duplicate in SAS dataset. how about normatilization?</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Duplicate-in-SAS-dataset-how-about-normatilization/m-p/656212#M196766</link>
      <description>&lt;P&gt;Hello, and thank you very much for your reply. It sounds interesting. Could you please point me to a tutorial that describes your method, or write one yourself? It would help a lot.&lt;/P&gt;
&lt;P&gt;Thank you again.&lt;/P&gt;</description>
      <pubDate>Wed, 10 Jun 2020 08:32:47 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Duplicate-in-SAS-dataset-how-about-normatilization/m-p/656212#M196766</guid>
      <dc:creator>seamoh</dc:creator>
      <dc:date>2020-06-10T08:32:47Z</dc:date>
    </item>
    <item>
      <title>Re: Duplicate in SAS dataset. how about normatilization?</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Duplicate-in-SAS-dataset-how-about-normatilization/m-p/656487#M196835</link>
      <description>&lt;P&gt;Find documentation about PROC SQL views here:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;A href="https://documentation.sas.com/?docsetId=sqlproc&amp;amp;docsetTarget=n0nolnbokay91in1gouzgw3xzl5e.htm&amp;amp;docsetVersion=9.4&amp;amp;locale=en" target="_self"&gt;https://documentation.sas.com/?docsetId=sqlproc&amp;amp;docsetTarget=n0nolnbokay91in1gouzgw3xzl5e.htm&amp;amp;docsetVersion=9.4&amp;amp;locale=en&lt;/A&gt; , and here:&lt;/P&gt;
&lt;P&gt;&lt;A href="https://documentation.sas.com/?docsetId=sqlproc&amp;amp;docsetTarget=p1c2yap6eiwukin18hid2r3h2n4h.htm&amp;amp;docsetVersion=9.4&amp;amp;locale=en#p0rvte35ms3jvan1so5061kwhxhr" target="_self"&gt;https://documentation.sas.com/?docsetId=sqlproc&amp;amp;docsetTarget=p1c2yap6eiwukin18hid2r3h2n4h.htm&amp;amp;docsetVersion=9.4&amp;amp;locale=en#p0rvte35ms3jvan1so5061kwhxhr&lt;/A&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Wed, 10 Jun 2020 17:18:09 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Duplicate-in-SAS-dataset-how-about-normatilization/m-p/656487#M196835</guid>
      <dc:creator>PGStats</dc:creator>
      <dc:date>2020-06-10T17:18:09Z</dc:date>
    </item>
  </channel>
</rss>

