<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: General Key Cleanup After Merging Records in SAS Data Management</title>
    <link>https://communities.sas.com/t5/SAS-Data-Management/General-Key-Cleanup-After-Merging-Records/m-p/245296#M6462</link>
    <description>&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/20678"&gt;@jwhite&lt;/a&gt;, you are usually putting forward interesting questions,  I have to give you that.&lt;BR /&gt;Since this is kinda architectural issue and I don't know the full nature of your environment I don't want to tell what you should do. I'll try to describe what would do and let us see if that can be applied to your situation. &lt;BR /&gt;&lt;BR /&gt;In the detail layer of the DW (non dimensional is my preference) I would accept both keys. And create separate surrogate keys.&lt;BR /&gt;&lt;BR /&gt;The information about matching/merging of keys will be stored in a separate relational table which point from the new key to the original key. If you are using fats vault terminology this is called a same-as-link table.&lt;BR /&gt;&lt;BR /&gt;The actual merge is done when loading the star schemas were you need a logic which key to use. If you are using surrogate keys I would prefer to use the original key. Which business key to use is a business requirement issue. In the case of a type 2 dimension you could show the change of key over time  (given that the key are being replaced).&lt;BR /&gt;&lt;BR /&gt;Hopes this makes an sense &lt;span class="lia-unicode-emoji" title=":slightly_smiling_face:"&gt;🙂&lt;/span&gt;</description>
    <pubDate>Thu, 21 Jan 2016 20:55:35 GMT</pubDate>
    <dc:creator>LinusH</dc:creator>
    <dc:date>2016-01-21T20:55:35Z</dc:date>
    <item>
      <title>General Key Cleanup After Merging Records</title>
      <link>https://communities.sas.com/t5/SAS-Data-Management/General-Key-Cleanup-After-Merging-Records/m-p/245285#M6460</link>
      <description>&lt;P&gt;I've got what's likely a basic question.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I've inherited a warehouse that has multiple records on a fact table that represent the same individual. For example, they were originally loaded with an Id number of 12345 and assigned a key that we shall call 333. A year or two later the same individual was loaded again but a different Id number was given (12354) so when loaded they were thought to be a new individual thus assigned a new foreign key (444).&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Now a few months ago we were sent a new dataset telling us that ID 12345 and 12354 should be the same, so the two Ids were merged and now the fact table has the following two rows for this individual:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;U&gt;Key&lt;/U&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;U&gt;ID&lt;/U&gt;&lt;/P&gt;
&lt;P&gt;333&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 12354&lt;/P&gt;
&lt;P&gt;444&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 12354&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;That part was easy enough. Trick is the keys are used on various tables. Now, some tables will link to ID 12354 via key 333 and other tables from 444. Obviously, this isn't good, especially since several Ids were incorrectly loaded originally..&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Is there a better answer than deciding on one key to use, combing through every fact table we have, checking to see what key is used, then assigning it the correct key of the so that all tables use the same key for an ID, then close out either 333 or 444?&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Does that make any sense? Seems to be the way to go, but am curious if anyone has had to deal with anything similar.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Thanks, all!&lt;/P&gt;</description>
      <pubDate>Thu, 21 Jan 2016 20:28:23 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Management/General-Key-Cleanup-After-Merging-Records/m-p/245285#M6460</guid>
      <dc:creator>jwhite</dc:creator>
      <dc:date>2016-01-21T20:28:23Z</dc:date>
    </item>
    <item>
      <title>Re: General Key Cleanup After Merging Records</title>
      <link>https://communities.sas.com/t5/SAS-Data-Management/General-Key-Cleanup-After-Merging-Records/m-p/245295#M6461</link>
      <description>&lt;P&gt;When faced with similar issues my first step has been to make sure I don't have:&lt;/P&gt;
&lt;P&gt;The problematic key associated with yet another Id&lt;/P&gt;
&lt;P&gt;Either of&amp;nbsp;Id associated with yet another Key value.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Since about 1/3 of the time one of the two above conditions fails much fun ensues, but boils down to the following:&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;Backup&lt;/STRONG&gt; everything so you can restart if needed.&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;Document&lt;/STRONG&gt; what changes need to be made to make all of the data consistent&lt;/P&gt;
&lt;P&gt;Get approval for the changes&lt;/P&gt;
&lt;P&gt;Change it&lt;/P&gt;
&lt;P&gt;Test and verify the changes worked everywhere&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;If there is a source file(s) that if re-read would reintroduce the issue, modify the corresponding records there as well.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;But I've never had to deal with data with very many of these types of records, only a few dozen or so.&lt;/P&gt;</description>
      <pubDate>Thu, 21 Jan 2016 20:54:30 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Management/General-Key-Cleanup-After-Merging-Records/m-p/245295#M6461</guid>
      <dc:creator>ballardw</dc:creator>
      <dc:date>2016-01-21T20:54:30Z</dc:date>
    </item>
    <item>
      <title>Re: General Key Cleanup After Merging Records</title>
      <link>https://communities.sas.com/t5/SAS-Data-Management/General-Key-Cleanup-After-Merging-Records/m-p/245296#M6462</link>
      <description>&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/20678"&gt;@jwhite&lt;/a&gt;, you are usually putting forward interesting questions,  I have to give you that.&lt;BR /&gt;Since this is kinda architectural issue and I don't know the full nature of your environment I don't want to tell what you should do. I'll try to describe what would do and let us see if that can be applied to your situation. &lt;BR /&gt;&lt;BR /&gt;In the detail layer of the DW (non dimensional is my preference) I would accept both keys. And create separate surrogate keys.&lt;BR /&gt;&lt;BR /&gt;The information about matching/merging of keys will be stored in a separate relational table which point from the new key to the original key. If you are using fats vault terminology this is called a same-as-link table.&lt;BR /&gt;&lt;BR /&gt;The actual merge is done when loading the star schemas were you need a logic which key to use. If you are using surrogate keys I would prefer to use the original key. Which business key to use is a business requirement issue. In the case of a type 2 dimension you could show the change of key over time  (given that the key are being replaced).&lt;BR /&gt;&lt;BR /&gt;Hopes this makes an sense &lt;span class="lia-unicode-emoji" title=":slightly_smiling_face:"&gt;🙂&lt;/span&gt;</description>
      <pubDate>Thu, 21 Jan 2016 20:55:35 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Management/General-Key-Cleanup-After-Merging-Records/m-p/245296#M6462</guid>
      <dc:creator>LinusH</dc:creator>
      <dc:date>2016-01-21T20:55:35Z</dc:date>
    </item>
    <item>
      <title>Re: General Key Cleanup After Merging Records</title>
      <link>https://communities.sas.com/t5/SAS-Data-Management/General-Key-Cleanup-After-Merging-Records/m-p/245304#M6463</link>
      <description>&lt;P&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/13674"&gt;@LinusH﻿&lt;/a&gt;, ha! Well, I wish they weren't as interesting as they are, but I'm happy to give the experts here something to think about. &lt;span class="lia-unicode-emoji" title=":slightly_smiling_face:"&gt;🙂&lt;/span&gt;&lt;/P&gt;
&lt;P&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/13884"&gt;@ballardw﻿&lt;/a&gt;, I'm _always_ backing up...&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I think I'm liking the idea of the link table. As it is, the way this data is used two tables are pretty much always joined anyway, so it would be easier for our reporting team to just use the one table, instead of having to consistently join two tables first.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;In doing so, this link table would have both keys in it as two separate columns? Would our report staff need to check each key column when they join or lookup data on this table?&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Thanks!&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Thu, 21 Jan 2016 21:29:00 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Management/General-Key-Cleanup-After-Merging-Records/m-p/245304#M6463</guid>
      <dc:creator>jwhite</dc:creator>
      <dc:date>2016-01-21T21:29:00Z</dc:date>
    </item>
    <item>
      <title>Re: General Key Cleanup After Merging Records</title>
      <link>https://communities.sas.com/t5/SAS-Data-Management/General-Key-Cleanup-After-Merging-Records/m-p/245306#M6464</link>
      <description>Yes as separate columns.&lt;BR /&gt;But this construct might feel a bit complex so the idea is to keep this as a reference table behind the scene  and use it only for the data mart load. Report users shouldn't be forced to consider this kind of logic. But of course, they could if you chose to let them.</description>
      <pubDate>Thu, 21 Jan 2016 21:37:17 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Management/General-Key-Cleanup-After-Merging-Records/m-p/245306#M6464</guid>
      <dc:creator>LinusH</dc:creator>
      <dc:date>2016-01-21T21:37:17Z</dc:date>
    </item>
    <item>
      <title>Re: General Key Cleanup After Merging Records</title>
      <link>https://communities.sas.com/t5/SAS-Data-Management/General-Key-Cleanup-After-Merging-Records/m-p/245650#M6472</link>
      <description>&lt;P&gt;I agree it's an interesting problem, as what you've described is the very common duplicate customer issue. In my experience most IT systems that deal with customer data are prone to having duplicate customers. Often they come about because of inadequate checking of existing customers when adding a new one. Where I am working now though they are also created deliberately because of system restrictions!&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I also agree that a link table or what I would call a duplicate customer table is the way to go. As a general rule it is not good practice to change historical fact table data with corrections found later. The reason for this is if you repeat reports for that same period you will get different answers. A data warehouse should give you repeatable results for the same point in time no matter when you run them.There are exceptions to this of course perhaps where the errors are so large they need to be restated.&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Sat, 23 Jan 2016 23:15:58 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Management/General-Key-Cleanup-After-Merging-Records/m-p/245650#M6472</guid>
      <dc:creator>SASKiwi</dc:creator>
      <dc:date>2016-01-23T23:15:58Z</dc:date>
    </item>
  </channel>
</rss>

