BookmarkSubscribeRSS Feed
Steelers_In_DC
Barite | Level 11

Good Afternoon,

I have generated a report, the data comes from a vendor that will send a file whenever there is an update with only records that have been updated.  I have a solution that works as far as I can tell, I want to update the existing file with only things that have changed.  Attached is my solution.  I know this is a common situation and I wanted to put this out here to get opinions, and hopefully find a better solution.

Thank You,

Mark

11 REPLIES 11
ballardw
Super User

Your have a classic base data and transaction data set update;

data updata;

update data delta;

by empid;

run;

stat_sas
Ammonite | Level 13

Hi,

Just to understand in a better way. Why you need this code.

proc sort data=compare;by empid descending period;

data final;

set compare;

by empid;

if first.empid then output;

run;

Steelers_In_DC
Barite | Level 11

I added this to make sure I was only pulling the most recent record.  For the real datasets I'm using there will be a datestamp.  I just added period to sort by the most recent record.

stat_sas
Ammonite | Level 13

In delta file there is only one observation (if updated) for each empid in addition to new empids right?

ballardw
Super User

If the transaction file has multiple records for the same empid then ALL of them are applied.

This is actually a good behavior because not all of the variables may necessarily update at the same time.

It may be easier to see what happens with single records but not required. The master data set should have only one value or combination of the key values though.

stat_sas
Ammonite | Level 13

Thanks for the input . Just trying to understand how update works. Please see below syntax where empid=3124 appears twice and only first observation in the sequence is updated with delta observation in transaction data while other remains the same.

data transaction;

input empid $ one two three four five period;

cards;

1234 1 2 3 4 5 1

2134 2 1 3 4 5 1

3124 3 2 1 4 5 1

3124 4 2 1 4 5 1

4123 4 3 2 1 5 1

5123 5 4 3 2 1 1

;

run;

data delta;

input empid $ one two three four five period;

cards;

3124 5 5 5 5 5 2

6666 1 1 1 1 1 1

;

run;

data updata;

update transaction delta;

by empid;

run;

proc print data=updata;

run;

Reeza
Super User

The master data set should have only one value or combination of the key values though.

The transaction data can have multiples, but not the master table.

stat_sas
Ammonite | Level 13

Thanks Reza - I've only one value in master data set (delta) for empid=3124 and two observations for empid=3124 in transaction table but it updates the first observation in transaction data only.

Reeza
Super User

data updata;

update transaction delta;

by empid;

run;

proc print data=updata;

run;

The MASTER Table as SAS knows it is called transaction. It has multiple empid, 3124.

If you want delta to be the master table flip them around:

data updata;

update delta transaction;

by empid;

run;

Astounding
PROC Star

Mark,

The most difficult part of this is understanding the problem.  More specifically, answers to these questions would make the programming straightforward.

1. In your original file before any updates are applied, do you have just one observation per ID?

2. If not, would you like all of the observations changed for that ID or only the most recent one?

3. Is there a variable on your original data that determines which observation is the most recent for a given ID?

4. If the delta data set contains multiple observations for the same ID, how does that affect the desired result?  (Should all the changes be applied to the most recent observation only for that ID?)

It is conceivable that ballardw's original solution is all that you need ... as long as the answer to question 1 is "Yes".

Tom
Super User Tom
Super User

Are the "delta" records replacements for the original data?  Or actual transactions (as the term is used within SAS by the UPDATE statement)?  There is a major difference with respect to missing values.  Step-by-Step Programming with Base SAS(R) Software

Do you have insertions in your "delta" files? That is can the delta add a brand new ID to the master file?

Do you have deletions in your "delta" files?  That is can the delta specify that a particular ID be removed from the mast file?

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 11 replies
  • 2029 views
  • 1 like
  • 6 in conversation