SASPhile
Quartz | Level 8
I will be getting data from 50 vendors every month, and I won't get them all at once. I load the data into a master table.
It is difficult to keep track of the data that has been loaded.
Say, for instance, I loaded Caremark data the day before yesterday and today I'm loading data from other vendors. Is there a way so that, even if I accidentally run the Caremark script again, the data is not loaded (appended) to the master table when it is already present?

Key variables are "provider" and "file_date".
7 REPLIES
DanielSantos
Barite | Level 11
Yes, of course.

- Match by key (implies reading both tables, unless you use an index on the master table).

- Append the data, then eliminate duplicated rows by key (see the sketch after this list).

- Keep track of the loaded keys in a smaller table, and check the table before appending data.
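
A minimal sketch of the second option (all data set names here are illustrative, and the vendor file is assumed to have the same variables as the master):

/* Append the vendor file, then remove the exact duplicate records */
/* that an accidental second run of the same script would create.  */
proc append base=work.master data=work.caremark_new;
run;

/* Sorting BY _ALL_ with NODUPKEY keeps one copy of each identical row. */
proc sort data=work.master nodupkey;
   by _all_;
run;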

The last option is the least resource-demanding, but you need to make sure that every process that updates the master table also inserts the key into the tracking table.

Cheers from Portugal.

Daniel Santos @ www.cgd.pt
Patrick
Opal | Level 21
Just to add another option to Daniel's list:
- use a control table where you keep track of which source file has already been loaded
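
For instance, a minimal sketch of that idea keyed on the variables from the original post (every name here is made up, and it assumes a control table WORK.LOAD_LOG already exists with a character PROVIDER column and a numeric FILE_DATE column holding SAS dates):

/* Control table LOAD_LOG holds one row per provider/file_date already loaded. */
%let prov  = CAREMARK;
%let fdate = 01FEB2010;

proc sql noprint;
   select count(*) into :already_loaded
   from work.load_log
   where provider = "&prov" and file_date = "&fdate"d;
quit;

%macro load_if_new;
   %if &already_loaded > 0 %then %do;
      %put NOTE: &prov / &fdate already loaded - skipping append.;
   %end;
   %else %do;
      proc append base=work.master data=work.caremark_new;
      run;
      proc sql;
         insert into work.load_log
            set provider = "&prov", file_date = "&fdate"d;
      quit;
   %end;
%mend load_if_new;
%load_if_new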
deleted_user
Not applicable
You want to check the documentation on the update statement, which was designed for this situation.

In the case you consider, if you run your script more than once, you will update the master data set to the same values it has already. Net result: no change.

The basic syntax is:

data master;   /* or a new data set name */
   update master transaction;
   by keyvars;
run;

The master data set should be unique in the keyvars. The transaction data set can have multiple transactions.
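
A minimal sketch of that pattern, with made-up data set names and the key variables from the original post:

/* Transactions must be sorted by the key; MASTER is assumed to be */
/* already sorted by (and unique on) the same key.                 */
proc sort data=work.caremark_txn;
   by provider file_date;
run;

/* Re-running this step with the same transactions reapplies the   */
/* same values, so the net result is no change; new keys are added. */
data work.master;
   update work.master work.caremark_txn;
   by provider file_date;
run;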

See the documentation for the gory details.

Jonathan
Peter_C
Rhodochrosite | Level 12
APPEND does provide the fastest solution, but it seems to offer no protection against the scenario you describe (repeating the append), except for an almost hidden feature.
Put a unique index on your "master" data set and host it in a library that uses the Base SAS 9 SPDE engine. Then you can use the UNIQUESAVE= option to append with respect for the index.
See the example in the documentation at http://support.sas.com/documentation/cdl/en/engspde/61887/HTML/default/a002612268.htm

PeterC: it is not a PROC APPEND option so much as a data set option, to be applied to the BASE= data set. Message was edited by: Peter.C
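
Roughly, that setup could look like the sketch below (the path, libref, and data set names are made up; check the linked documentation for the exact UNIQUESAVE= values and behaviour):

/* SPDE library (path is illustrative). */
libname spdemst spde 'C:\data\spde_master';

/* Build the master with a unique composite index on the key.      */
/* Assumes provider + file_date uniquely identify a row in MASTER. */
data spdemst.master (index=(loadkey=(provider file_date) / unique));
   set work.master;
run;

/* Append with the data set option on BASE=; duplicate-key rows are */
/* rejected, and UNIQUESAVE= controls what happens to the rejects.  */
proc append base=spdemst.master (uniquesave=yes)
            data=work.caremark_new;
run;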
sbb
Lapis Lazuli | Level 10
Remember to keep one or more backup copies of your master data in case a restore / recovery process is required -- best to test the process as well at some point. A backup might be the entire SAS data library (using PROC COPY) or maybe using PROC DATASETS and AGE to keep history "cycles" of data / control files.
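
For instance (librefs, paths, and member names are illustrative):

/* Back up the whole production library with PROC COPY. */
libname prod 'C:\data\prod';
libname bkup 'C:\data\backup';

proc copy in=prod out=bkup;
run;

/* Or keep history cycles with PROC DATASETS and the AGE statement:   */
/* MASTER is renamed to MASTER_BK1, the old MASTER_BK1 to MASTER_BK2, */
/* and the previous MASTER_BK2 is deleted.                            */
proc datasets library=prod nolist;
   age master master_bk1 master_bk2;
quit;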

Scott Barry
SBBWorks, Inc.
DanielSantos
Barite | Level 11
The SPDE engine per se is quite a powerful feature, but I wouldn't recommend it for the average user, as it is an advanced feature and very system-dependent.

Besides the options stated above, I will add another that you may or may not want to consider.

Say you manage to store every received table under a unique name; you could then consolidate all the data in a single view.

For example, let's say you have one table per vendor. If, and only if, each table name can in some way be associated with the corresponding vendor (for example, the vendor name or vendor ID), then you just have to keep all the tables in a single libname and create a view over them.

To access the data, you just have to open the view, which consolidates all the tables into a single one. And updating the data for a specific vendor is actually very easy: you just overwrite that vendor's table with the one you received.
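
A sketch of that layout (the libref, the VND_ naming prefix, and the view name are all made up):

/* All vendor tables live in one library and share a name prefix. */
libname vendors 'C:\data\vendors';

/* A DATA step view that stacks every VND_ table into one virtual table. */
data vendors.master_v / view=vendors.master_v;
   set vendors.vnd_: ;
run;

/* Reading the view consolidates the vendor tables on the fly;  */
/* refreshing one vendor means overwriting that vendor's table. */
proc print data=vendors.master_v (obs=10);
run;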

Cheers from Portugal.

Daniel Santos @ www.cgd.pt
Peter_C
Rhodochrosite | Level 12
I over-engineered with the SPDE solution.
Having a unique index is enough to reject rows already in the index.
The merit of that SPDE data set option (probably unnecessary here) is that rejected rows can be collected for reprocessing with MODIFY/UPDATE/SQL/MERGE.
Since the required behaviour is just rejection, probably collecting the rejected "appends" is not needed.
This SAS log snippet demos the non-SPDE run:

407 **********************************************************************;
409 libname t (work) ;
NOTE: Libref T was successfully assigned as follows:
Levels: 1
Engine(1): V9
Physical Name(1): C:\Users\PETERC~1\AppData\Local\Temp\SAS Temporary Files\_TD8120
410 data t.name1 ;
411 input name $ value ;
412 list;cards ;

RULE: ----+----1----+----2----+----3----+----4----+----5----+----6----+----7----+----8----+-
413 Jack 123
414 Jill 3210
415 Hill 0
NOTE: The data set T.NAME1 has 3 observations and 2 variables.
NOTE: DATA statement used (Total process time):
real time 0.09 seconds
cpu time 0.00 seconds

416 ;
417 data t.name2( index=( name/unique ) ) ;
418 input name $ value ;
419 list;cards ;

RULE: ----+----1----+----2----+----3----+----4----+----5----+----6----+----7----+----8----+-
420 Jill 4321
421 Hill 0
NOTE: The data set T.NAME2 has 2 observations and 2 variables.
NOTE: DATA statement used (Total process time):
real time 0.13 seconds
cpu time 0.00 seconds

422 ;
423
424
425 proc append data= name1 base= t.name2 ;*( uniquesave=yes) ;
426 run ;

NOTE: Appending WORK.NAME1 to T.NAME2.
WARNING: Duplicate values not allowed on index name for file NAME2, 2 observations rejected.
NOTE: There were 3 observations read from the data set WORK.NAME1.
NOTE: 1 observations added.
NOTE: The data set T.NAME2 has 3 observations and 2 variables.
NOTE: PROCEDURE APPEND used (Total process time):
real time 0.10 seconds
cpu time 0.03 seconds

Note the "NOTE: 1 observations added."


Discussion stats
  • 7 replies
  • 1017 views
  • 0 likes
  • 6 in conversation