<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic re-loading in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/re-loading/m-p/58566#M12688</link>
    <description>I will be getting data from 50 vendors every month, and I won't get it all at once. I load the data into a master table.&lt;BR /&gt;
It is difficult to keep track of the data that has been loaded.&lt;BR /&gt;
Say, for instance, I loaded Caremark data the day before yesterday and I'm loading data from other vendors today. Is there a way to ensure that even if I accidentally re-run the Caremark script, it will not load (append) to the master table when that data is already present?&lt;BR /&gt;
&lt;BR /&gt;
Key variables are "provider" and "file_date".</description>
    <pubDate>Wed, 17 Feb 2010 14:48:59 GMT</pubDate>
    <dc:creator>SASPhile</dc:creator>
    <dc:date>2010-02-17T14:48:59Z</dc:date>
    <item>
      <title>re-loading</title>
      <link>https://communities.sas.com/t5/SAS-Programming/re-loading/m-p/58566#M12688</link>
      <description>I will be getting data from 50 vendors every month, and I won't get it all at once. I load the data into a master table.&lt;BR /&gt;
It is difficult to keep track of the data that has been loaded.&lt;BR /&gt;
Say, for instance, I loaded Caremark data the day before yesterday and I'm loading data from other vendors today. Is there a way to ensure that even if I accidentally re-run the Caremark script, it will not load (append) to the master table when that data is already present?&lt;BR /&gt;
&lt;BR /&gt;
Key variables are "provider" and "file_date".</description>
      <pubDate>Wed, 17 Feb 2010 14:48:59 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/re-loading/m-p/58566#M12688</guid>
      <dc:creator>SASPhile</dc:creator>
      <dc:date>2010-02-17T14:48:59Z</dc:date>
    </item>
    <item>
      <title>Re: re-loading</title>
      <link>https://communities.sas.com/t5/SAS-Programming/re-loading/m-p/58567#M12689</link>
      <description>Yes, of course.&lt;BR /&gt;
&lt;BR /&gt;
- Match by key (this implies reading both tables, unless you use an index on the master table).&lt;BR /&gt;
&lt;BR /&gt;
- Append the data, then eliminate duplicated rows by key.&lt;BR /&gt;
&lt;BR /&gt;
- Keep track of the loaded keys in a smaller table, and check that table before appending data.&lt;BR /&gt;
&lt;BR /&gt;
The last option is the least resource-demanding, but you need to make sure that every process that updates the master table also inserts the key into the tracking table.&lt;BR /&gt;
&lt;BR /&gt;
Cheers from Portugal.&lt;BR /&gt;
&lt;BR /&gt;
Daniel Santos @ &lt;A href="http://www.cgd.pt" target="_blank"&gt;www.cgd.pt&lt;/A&gt;</description>
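The third option above could be sketched roughly as follows. This is only an illustrative sketch: the data set names (loaded_keys, newdata, master) and the macro are hypothetical, not from the post; only the key variables provider and file_date come from the original question, and loaded_keys is assumed to already exist with a character provider column and a numeric (date) file_date column.

```sas
/* Sketch of option 3: keep the loaded (provider, file_date) pairs   */
/* in a small tracking table and skip the append when the key is     */
/* already recorded. All data set names here are illustrative.       */
%macro safe_append(provider=, file_date=);
   %local nhit;
   proc sql noprint;
      select count(*) into :nhit
      from loaded_keys
      where provider = "&provider" and file_date = "&file_date"d;
   quit;
   %if &nhit = 0 %then %do;
      proc append base=master data=newdata;
      run;
      proc sql;
         insert into loaded_keys values ("&provider", "&file_date"d);
      quit;
   %end;
   %else %put NOTE: &provider &file_date already loaded - skipping append.;
%mend safe_append;

%safe_append(provider=Caremark, file_date=15FEB2010)
```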
      <pubDate>Wed, 17 Feb 2010 16:04:55 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/re-loading/m-p/58567#M12689</guid>
      <dc:creator>DanielSantos</dc:creator>
      <dc:date>2010-02-17T16:04:55Z</dc:date>
    </item>
    <item>
      <title>Re: re-loading</title>
      <link>https://communities.sas.com/t5/SAS-Programming/re-loading/m-p/58568#M12690</link>
      <description>Just to add another option to Daniel's list:&lt;BR /&gt;
- use a control table where you keep track of which source files have already been loaded.</description>
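This control-table idea, sketched at the source-file level rather than the key level (the library, table, and file names here are invented for illustration, and ctrl.loaded_files is assumed to already exist):

```sas
/* Record each raw file name once it has been loaded, and test the   */
/* control table before loading the file again.                      */
proc sql noprint;
   select count(*) into :already_loaded
   from ctrl.loaded_files
   where filename = "caremark_feb2010.txt";
quit;

%macro load_if_new;
   %if &already_loaded = 0 %then %do;
      proc append base=master data=work.caremark_feb2010;
      run;
      proc sql;
         insert into ctrl.loaded_files
         values ("caremark_feb2010.txt", today());
      quit;
   %end;
%mend load_if_new;
%load_if_new
```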
      <pubDate>Wed, 17 Feb 2010 19:41:26 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/re-loading/m-p/58568#M12690</guid>
      <dc:creator>Patrick</dc:creator>
      <dc:date>2010-02-17T19:41:26Z</dc:date>
    </item>
    <item>
      <title>Re: re-loading</title>
      <link>https://communities.sas.com/t5/SAS-Programming/re-loading/m-p/58569#M12691</link>
      <description>You want to check the documentation on the UPDATE statement, which was designed for this situation.&lt;BR /&gt;
&lt;BR /&gt;
In the case you describe, if you run your script more than once, you will update the master data set to the same values it already has.  Net result: no change.&lt;BR /&gt;
&lt;BR /&gt;
The basic syntax is:&lt;BR /&gt;
&lt;BR /&gt;
data master;  /* or a new data set name */&lt;BR /&gt;
   update master transaction;&lt;BR /&gt;
   by keyvars;&lt;BR /&gt;
run;&lt;BR /&gt;
&lt;BR /&gt;
The master data set should be unique in the keyvars.  The transaction data set can have multiple transactions.&lt;BR /&gt;
&lt;BR /&gt;
See the documentation for the gory details.&lt;BR /&gt;
&lt;BR /&gt;
Jonathan</description>
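A minimal self-contained demo of the idempotent behaviour described above: running the same transaction data set through UPDATE any number of times leaves the master unchanged. The data values here are invented purely for illustration.

```sas
data master;
   input provider $ file_date :date9. amount;
   format file_date date9.;
   datalines;
Caremark 15FEB2010 100
;

data trans;
   input provider $ file_date :date9. amount;
   format file_date date9.;
   datalines;
Caremark 15FEB2010 100
;

/* Re-running this step leaves master unchanged: the transaction */
/* simply re-applies the values the matching row already has.    */
data master;
   update master trans;
   by provider file_date;
run;
```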
      <pubDate>Wed, 17 Feb 2010 22:36:47 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/re-loading/m-p/58569#M12691</guid>
      <dc:creator>deleted_user</dc:creator>
      <dc:date>2010-02-17T22:36:47Z</dc:date>
    </item>
    <item>
      <title>Re: re-loading</title>
      <link>https://communities.sas.com/t5/SAS-Programming/re-loading/m-p/58570#M12692</link>
      <description>APPEND does provide the fastest solution, but it seems to offer no protection against the scenario you describe (repeating the append), except for an almost hidden feature.&lt;BR /&gt;
Put a unique index on your "master" data set, and host it in a Base SAS9 SPDE engine library. Then you can use the UNIQUESAVE= option to append with respect for the index.&lt;BR /&gt;
See the example in the documentation at &lt;A href="http://support.sas.com/documentation/cdl/en/engspde/61887/HTML/default/a002612268.htm" target="_blank"&gt;http://support.sas.com/documentation/cdl/en/engspde/61887/HTML/default/a002612268.htm&lt;/A&gt;&lt;BR /&gt;
&lt;BR /&gt;
PeterC

It is not a PROC APPEND option so much as a data set option, to be applied to the BASE= data set.</description>
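The index approach can be sketched as follows, with one assumption worth flagging: a unique index on (provider, file_date) only works if that pair identifies a single row. If a vendor file contains many rows per (provider, file_date), index the master on whatever combination of variables is truly row-unique. The library and data set names are illustrative.

```sas
/* Create a unique composite index on the master, then append.      */
/* PROC APPEND rejects rows whose key already exists in the index.  */
proc datasets library=mylib nolist;
   modify master;
   index create loadkey = (provider file_date) / unique;
quit;

proc append base=mylib.master data=work.newdata;
run;
/* The log reports how many observations were rejected as duplicates. */
```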
      <pubDate>Thu, 18 Feb 2010 11:43:38 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/re-loading/m-p/58570#M12692</guid>
      <dc:creator>Peter_C</dc:creator>
      <dc:date>2010-02-18T11:43:38Z</dc:date>
    </item>
    <item>
      <title>Re: re-loading</title>
      <link>https://communities.sas.com/t5/SAS-Programming/re-loading/m-p/58571#M12693</link>
      <description>Remember to keep one or more backup copies of your master data in case a restore / recovery process is required -- and it is best to test that process at some point as well.  A backup might be the entire SAS data library (using PROC COPY), or you might use PROC DATASETS with the AGE statement to keep history "cycles" of data / control files.&lt;BR /&gt;
&lt;BR /&gt;
Scott Barry&lt;BR /&gt;
SBBWorks, Inc.</description>
      <pubDate>Thu, 18 Feb 2010 13:50:08 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/re-loading/m-p/58571#M12693</guid>
      <dc:creator>sbb</dc:creator>
      <dc:date>2010-02-18T13:50:08Z</dc:date>
    </item>
    <item>
      <title>Re: re-loading</title>
      <link>https://communities.sas.com/t5/SAS-Programming/re-loading/m-p/58572#M12694</link>
      <description>The SPDE engine per se is quite a powerful feature, but I wouldn't recommend it for the average user, as it is an advanced and very system-dependent feature.&lt;BR /&gt;
&lt;BR /&gt;
Besides the options stated above, I will add another that you may or may not consider.&lt;BR /&gt;
&lt;BR /&gt;
Say you manage to store every received table with a unique name; you could then consolidate all the data in a single view.&lt;BR /&gt;
&lt;BR /&gt;
For example, let's say you have one table per vendor. If, and only if, each table name can in some way be associated with the corresponding vendor (for example, by vendor name or vendor ID), then you just have to keep all the tables in a single libname and create a view over them.&lt;BR /&gt;
&lt;BR /&gt;
To access the data, you just have to open the view, which consolidates all the tables into a single one. And updating the data for a given vendor is actually very easy: you just have to overwrite that vendor's table with the one you received.&lt;BR /&gt;
&lt;BR /&gt;
Cheers from Portugal.&lt;BR /&gt;
&lt;BR /&gt;
Daniel Santos @ &lt;A href="http://www.cgd.pt" target="_blank"&gt;www.cgd.pt&lt;/A&gt;</description>
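One way to sketch the consolidation view; the VEND library and the vendor table names are invented for illustration:

```sas
/* One permanent table per vendor in the VEND library, plus a DATA   */
/* step view that stacks them. Refreshing a vendor's data is just    */
/* overwriting that vendor's table; the view always sees the latest. */
data vend.all_vendors / view=vend.all_vendors;
   set vend.caremark vend.medco vend.express;
run;

/* Reading the view consolidates every vendor table on the fly: */
proc print data=vend.all_vendors;
run;
```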
      <pubDate>Thu, 18 Feb 2010 14:40:19 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/re-loading/m-p/58572#M12694</guid>
      <dc:creator>DanielSantos</dc:creator>
      <dc:date>2010-02-18T14:40:19Z</dc:date>
    </item>
    <item>
      <title>Re: re-loading</title>
      <link>https://communities.sas.com/t5/SAS-Programming/re-loading/m-p/58573#M12695</link>
      <description>I over-engineered with the SPDE solution.&lt;BR /&gt;
Having a unique index is enough to reject rows already in the index.&lt;BR /&gt;
The merit of that SPDE data set option (probably unnecessary here) is that rejected rows can be collected for reprocessing .. modify/update/sql/merge.&lt;BR /&gt;
Since the required behaviour is just rejection, collecting the rejected "appends" is probably not needed.&lt;BR /&gt;
This SAS log snippet demos the non-SPDE run:[pre]407  **********************************************************************;&lt;BR /&gt;
409  libname t (work) ;&lt;BR /&gt;
NOTE: Libref T was successfully assigned as follows:&lt;BR /&gt;
      Levels:           1&lt;BR /&gt;
      Engine(1):        V9&lt;BR /&gt;
      Physical Name(1): C:\Users\PETERC~1\AppData\Local\Temp\SAS Temporary Files\_TD8120&lt;BR /&gt;
410  data t.name1 ;&lt;BR /&gt;
411    input name $ value ;&lt;BR /&gt;
412  list;cards ;&lt;BR /&gt;
&lt;BR /&gt;
RULE:      ----+----1----+----2----+----3----+----4----+----5----+----6----+----7----+----8----+-&lt;BR /&gt;
413        Jack 123&lt;BR /&gt;
414        Jill 3210&lt;BR /&gt;
415        Hill 0&lt;BR /&gt;
NOTE: The data set T.NAME1 has 3 observations and 2 variables.&lt;BR /&gt;
NOTE: DATA statement used (Total process time):&lt;BR /&gt;
      real time           0.09 seconds&lt;BR /&gt;
      cpu time            0.00 seconds&lt;BR /&gt;
&lt;BR /&gt;
416  ;&lt;BR /&gt;
417  data t.name2( index=( name/unique  ) ) ;&lt;BR /&gt;
418    input name $ value ;&lt;BR /&gt;
419  list;cards ;&lt;BR /&gt;
&lt;BR /&gt;
RULE:      ----+----1----+----2----+----3----+----4----+----5----+----6----+----7----+----8----+-&lt;BR /&gt;
420        Jill 4321&lt;BR /&gt;
421        Hill 0&lt;BR /&gt;
NOTE: The data set T.NAME2 has 2 observations and 2 variables.&lt;BR /&gt;
NOTE: DATA statement used (Total process time):&lt;BR /&gt;
      real time           0.13 seconds&lt;BR /&gt;
      cpu time            0.00 seconds&lt;BR /&gt;
&lt;BR /&gt;
422  ;&lt;BR /&gt;
423&lt;BR /&gt;
424 &lt;BR /&gt;
425  proc append  data= name1 base= t.name2 ;*( uniquesave=yes) ;&lt;BR /&gt;
426  run ;&lt;BR /&gt;
&lt;BR /&gt;
NOTE: Appending WORK.NAME1 to T.NAME2.&lt;BR /&gt;
WARNING: Duplicate values not allowed on index name for file NAME2, 2 observations rejected.&lt;BR /&gt;
NOTE: There were 3 observations read from the data set WORK.NAME1.&lt;BR /&gt;
&lt;B&gt;NOTE: 1 observations added.&lt;/B&gt;&lt;BR /&gt;
NOTE: The data set T.NAME2 has 3 observations and 2 variables.&lt;BR /&gt;
NOTE: PROCEDURE APPEND used (Total process time):&lt;BR /&gt;
      real time           0.10 seconds&lt;BR /&gt;
      cpu time            0.03 seconds[/pre]&lt;BR /&gt;
Note the "NOTE: 1 observations added."</description>
      <pubDate>Thu, 18 Feb 2010 15:27:39 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/re-loading/m-p/58573#M12695</guid>
      <dc:creator>Peter_C</dc:creator>
      <dc:date>2010-02-18T15:27:39Z</dc:date>
    </item>
  </channel>
</rss>

