05-29-2015 09:32 AM
Is there a way to use SAS to compare entries in two Excel files and look for mismatches? The mismatches would appear as new rows of data. These would be entries added after my original Excel extract at the end of each month. It takes about an additional month for the books to officially close for each month, and I want to make sure there were no additions after my original extract. Thanks for your help!
05-29-2015 09:42 AM
Firstly, why use Excel for the data at all? Data should be stored in a database or data warehouse; that is what those applications are built for. Excel is not built for this (or pretty much anything else).
If, however, you absolutely have to have Excel files, then why not use VBA, which is built into Excel?
If you still insist on doing things the hard way, then you would need to import each data file, i.e. write a data step import for each. Then you could run a proc compare, or a data step merge, and output the result to Excel again.
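A minimal sketch of the import step, using proc import rather than a hand-written data step import. It assumes .xlsx files and that SAS/ACCESS Interface to PC Files is licensed; the paths and the sheet name are made up for illustration.

```sas
/* Hypothetical paths and sheet name; adjust to your own files. */
proc import datafile="C:\extracts\old_extract.xlsx"
    out=work.old      /* the original month-end extract */
    dbms=xlsx
    replace;
    sheet="Sheet1";
    getnames=yes;     /* first row holds the column names */
run;

proc import datafile="C:\extracts\new_extract.xlsx"
    out=work.new      /* the later extract, after the books closed */
    dbms=xlsx
    replace;
    sheet="Sheet1";
    getnames=yes;
run;
```

Check the log after each import to confirm that the variable names and types match between the two data sets, because proc import guesses types from the cell contents.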
Seriously though, if you do things correctly and use a database, then the audit trail from the database will tell you what has changed; all of the above is just additional work for no gain. There are also Excel file compare programs available for free online.
05-29-2015 12:45 PM
Thanks. I imported my two files and used proc compare. How do I export the observations that are different? There is a warning message in my log that correctly states there are 156 observations in one data set but not the other. I cannot figure out how to export these 156 observations.
05-29-2015 01:20 PM
The proc compare has many options: https://support.sas.com/documentation/cdl/en/proc/61895/HTML/default/viewer.htm#a002473377.htm
One of those is to create an output dataset with differences. You can then export that dataset.
From my side though I would find it easier to identify id variables, then merge the two datasets by those variables. Then output the results.
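The merge approach could look roughly like the sketch below. It assumes the two imported data sets are called old and new, and that a single variable uniquely identifies a row; entry_id is a hypothetical name, as is the output path.

```sas
/* entry_id is a hypothetical key; use whatever uniquely identifies a row. */
proc sort data=old; by entry_id; run;
proc sort data=new; by entry_id; run;

data new_only;
    /* only the key is needed from old; the in= flags record where each row came from */
    merge old(in=in_old keep=entry_id)
          new(in=in_new);
    by entry_id;
    /* keep rows present in the new extract but absent from the old one */
    if in_new and not in_old;
run;

/* Hypothetical output path; dbms=xlsx needs SAS/ACCESS Interface to PC Files. */
proc export data=new_only
    outfile="C:\extracts\new_only.xlsx"
    dbms=xlsx
    replace;
run;
```

If no single variable identifies a row, list several variables on the by statements instead.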
05-29-2015 04:24 PM
I would like to be able to export only those rows of data that are in the new file but not the old. Maybe this is not possible. I'm new to this and am not sure what I'm doing so thanks for your patience.
proc compare base=old compare=new outdif out=result;
run;
I'm not sure exactly what my table named result is showing me, but it's definitely not the new rows of data.
05-29-2015 05:02 PM
Proc compare basically does a row-by-row comparison. If the two sets are in a different order, then you get lots of differences.
To effectively use Proc compare for what I think you are doing you may need to sort both datasets by the same variables. If there isn't an obvious set of variables that will uniquely identify a record, this may mean sorting by practically every variable in the data.
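As a rough sketch of that, assuming a hypothetical key variable entry_id that uniquely identifies a record: sort both sets by it, then give the same variable to proc compare in an id statement so that observations are matched by key rather than by position. The listcompobs option asks the report to list observations found in the comparison set but not in the base set.

```sas
/* entry_id is a hypothetical key; substitute your own identifying variable(s). */
proc sort data=old; by entry_id; run;
proc sort data=new; by entry_id; run;

proc compare base=old compare=new listcompobs;
    id entry_id;   /* match observations by key instead of row position */
run;
```

This lists the extra observations in the proc compare report; it does not by itself create a data set of them.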
Or possibly this might work:
proc sql;
create table want as
select * from table2
except
select * from table1;
quit;
Here table2 is the data set holding the new file and table1 the old one. This may not work, though, if the variables differ between the two: same-named variables of different types, or different numbers of variables.