Hi, all--
I will outline my goal, & then where I stand.
My goal:
Loop through CSV files, track
changes from one to the next (ie. keep a running table of changes from one file to the next (any additions/removals/changes) and when they occurred)
Here's what I have:
My excel CSV lists:
Name | Color | Weight |
Rob | Red | 150 |
Dean | Blue | 160 |
Drew | Green | 165 |
Ben | Yellow | 170 |
Name | Color | Weight |
Rob | Red | 145 |
Dean | Blue | 160 |
Drew | Green | 165 |
Sam | Purple | 145 |
Name | Color | Weight |
Rob | Red | 145 |
Drew | Green | 165 |
Sam | Purple | 150 |
George | Indigo | 135 |
These are my 3 lists, in order
This is what I want to create:
Name | Color_old | Weight_old | Color_new | Weight_new | Change Occurred |
Rob | Red | 150 | Red | 145 | List 2 |
Ben | Yellow | 170 | List 2 | ||
Sam | Purple | 145 | List 2 | ||
George | Indigo | 135 | List 3 | ||
Dean | Blue | 160 | List 3 | ||
Sam | Purple | 145 | Purple | 150 | List 3 |
This shows me any changes & when they occurred
This is my current code:
data want;
merge
ds1 (rename=(color=color_old weight=weight_old))
ds2 (rename=(color=color_new weight=weight_new))
;
by name;
if color_old ne color_old or weight_old ne weight_new;
run;
It will get me the changes from list 1 to list 2. Now, what is the best way to build a process that will allow me to continually import new CSVs to check for changes? Is there a way for me to determine when the change was made? The code above does not give me that. Is there a way to add these changes to a running table?
Thanks in advance, your help is much appreciated!!
So do you want to track CSV files or something that has been entered into Excel? Not the same thing.
I don't actually see any reference to any CSV file anywhere, so the question of what is intended here is pretty important.
I would expect something along the lines of 1) read the CSV files (is this a fixed list of file names or does it change?) into SAS data sets; 2) compare the result with an existing data set to make that report format.
You should really include at least one example of some thing changes more than one time and what the result looks like. You might find working through that example brings up other issues about this process. Since you are saying "track changes" and "running table of changes" you may have to consider what the result looks like for 10 or 20 changes.
You may also have to go into logic about why NOT appearing in a followup set is reported as a change
Sorry for any confusion:
1) Will have CSV files in a folder that all follow similar naming conventions & all follow similar format
2) Mass import into SAS as tables
3) Identify changes from successive tables (changes from 1 to 2, 2 to 3, 3 to 4 & so on...)
4) The more recent file will always be the 'source of truth'
5) Create a table that includes all changes
The 3 types of changes:
Additions -- on file 2, not on file 1
Removals -- on file 1, not on file 2
Adjustment -- name is on both, but weight has been adjusted
You should really include at least one example of some thing changes more than one time and what the result looks like -- this will always just be based on 2 files.
Does this help answer the questions?
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.