BookmarkSubscribeRSS Feed
elwayfan446
Barite | Level 11

Hi,

 

I am trying to build a process that takes a daily file and compares it to the data already stored in a dataset.  I want to create a dataset out of the file and compare it to the main dataset to see if there were any observations that had data changed in any variable.  For example, I have a dataset 3 observations in it.  The variables are ID, Name, Date1, Date2, Amount. 

 

1 Red 5/1/2017, 5/30/2017, $400.00

2 Blue 6/1/2017, 6/30/2017, $500.00

3 Blue 6/1/2017, 7/31/2017, $350.00

 

I get a file that has 1 observation with 2 data fields that are different from my original dataset.

 

2 Blue 6/1/2017, 7/31/2017, $600.00

 

I need to be able to do the following:

 

1. Update the old (primary) dataset with the new data in the updated file.

2. Add new records if they are in the new file but not the old (primary) dataset.

3. Generate a dataset or report that gives the observation, which variables were updated, and what the updates were.

 

I have found this article and it makes pretty good sense.  However, I want to make sure there isn't something built in for this that would make it a little easier.

 

Using PROC COMPARE to identify new and updated ... - PharmaSUG

 

If there is a better way, please let me know.

8 REPLIES 8
Reeza
Super User

I would suggest looking more at the UPDATE statement either in a data step or SQL proc.

 

 

elwayfan446
Barite | Level 11

Thanks, Reeza.  I am using a combination of updates and merge statements to actually update the primary data set but what I am trying to do is create a dataset or report of the variables in observations that have changed each day.  That is the primary focus of this article but it just seemed a little cumbersome and I was hoping there was an easier way.

Reeza
Super User

If you're doing this manually I totally understand, and would also use update/merge. Lazy method:

1. Back up old data

2. Update data

3. Compare new to old via merge

Is there any possibility of including a modified/last updated field in your master dataset that you could then generate your report off?. 

 

I know when I worked with SSIS we were able to automate this, because those fields were updated. I'm assuming SAS DI would have something similar but I really don't know.  

elwayfan446
Barite | Level 11

Yes, I also did something similar with SSIS when I had it available to me.  It has been hard to learn how to do some things in SAS that I was able to do fairly easily with SSIS and SSMS. 

 

I do have a "last updated" field for each observation so I can key off of that.  I am not sure what is different from your method versus the article if I am still going to compare using a merge.  Could you explain that step in a little more detail?  I already keep a copy of the prior day's dataset so I have the ability to compare the one I would update today to the data as it was yesterday.

Reeza
Super User

My point is sadly that proc compare doesn't add any value to the process IMO. It's a proc that could have a ton of value but needs to be updated significantly. 

elwayfan446
Barite | Level 11

I agree with you completely.

 

So, if you were doing it your way... in step three (update/merge) what would you do to compare those two datasets and either output to a new report or dataset the observations that have changes and which variables were changed?

Peter_C
Rhodochrosite | Level 12
Hi @elwayfan466
Have you looked into using an AUDIT table.
This makes most sense if your masterfile is updated like a database - by sql update or data step MODIFY - rather than by data step SET and MERGE
elwayfan446
Barite | Level 11

@Peter_C, I had not heard of an audit trail before but I will definitely look into that.  Right now I am updating my datasets each day with a combination of sql and MERGE but, if using an audit table is more efficient, then I will definitely look into it.  Thank you for the suggestion!

 

I will report back.

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 8 replies
  • 1250 views
  • 1 like
  • 3 in conversation