Hello
I have two very large data sets (>9 million records) that I am trying to compare. The data sets have over 300 columns each.
I was able to run a PROC COMPARE that outputs only the differences by ID. However, because of the size of the data set and my company's restrictions on exporting, I cannot get the data out to analyze, record by record, which columns are not matching.
Is there an easy way in SAS to show only the variables (by the ID key) that are different? Any other suggestions on how I can view only the differences in the individual columns?
Thanks
I can view the output fine. I just need a way to identify the differences easily. Currently there are over 400,000 rows, and the differences could be in any of the 350+ columns. I'm just trying to identify, by ID, which records have differences in which columns.
If I could export, it would be easier to analyze, but I can't, so I'm trying to figure out a way to output, by ID, just the columns where the dif=XXX.
Yes, I'm outputting it to a data set. The problem is that, because there are over 400,000 records and 350+ columns, I don't know the best way to find the differences. How do I easily identify, by the primary key, which columns have the differences without scrolling through 400,000 rows and 350 columns?
Yes, I have OUTDIF and OUTNOEQUAL in my code. There really are 400,000 differences in the data.
proc compare base=recs_a compare=recs_b
    out=result outnoequal outbase outcomp outdif
    noprint;
    id POLICY_NUMBER TERM_NUMBER POLICY_RISK_IDENTIFIER EFFECTIVE_FROM EFFECTIVE_TO;
run;
These are actual differences. The two data sets need to be identical. The new data comes from a view that was created from an old one, and we are testing that the landed data matches what the business expects.
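One possible way to avoid scrolling, sketched under assumptions and not tested against this data: turn the DIF rows of the RESULT data set from the PROC COMPARE step above into a long table with one row per key and differing column. The names DIFF_LONG, VARNAME, I and the macro variable SKIP are made up for this example, and the sketch assumes RESULT has no existing variables called VARNAME or I. It relies on the documented OUTNOEQUAL behavior of writing the special missing value .E for numeric values judged equal, and on DIF rows marking differing character positions with X.

```sas
/* Sketch only: assumes RESULT was created by the PROC COMPARE step above   */
/* (OUT= with OUTDIF and OUTNOEQUAL). DIFF_LONG, VARNAME, I and the macro   */
/* variable SKIP are hypothetical names introduced for this example.        */

/* Variables that are not compared: the automatic _TYPE_/_OBS_ plus the IDs */
%let skip = '_TYPE_' '_OBS_' 'POLICY_NUMBER' 'TERM_NUMBER'
            'POLICY_RISK_IDENTIFIER' 'EFFECTIVE_FROM' 'EFFECTIVE_TO';

data diff_long;
    set result;
    where _TYPE_ = 'DIF';              /* keep only the difference rows */

    /* Declare the arrays before VARNAME so it is not swept into CHARS */
    array nums  {*} _numeric_;
    array chars {*} _character_;
    length varname $32;

    /* Numeric columns: with OUTNOEQUAL, equal values are stored as .E */
    do i = 1 to dim(nums);
        varname = vname(nums{i});
        if upcase(varname) in (&skip) then continue;
        if nums{i} ne .E and nums{i} ne 0 then output;
    end;

    /* Character columns: a DIF row marks each differing position with X */
    do i = 1 to dim(chars);
        varname = vname(chars{i});
        if upcase(varname) in (&skip) then continue;
        if index(chars{i}, 'X') > 0 then output;
    end;

    keep POLICY_NUMBER TERM_NUMBER POLICY_RISK_IDENTIFIER
         EFFECTIVE_FROM EFFECTIVE_TO varname;
run;

/* Count how many keys differ in each column, most frequent first */
proc freq data=diff_long order=freq;
    tables varname / nocum;
run;
```

DIFF_LONG can then be filtered or sorted in place (for example, by VARNAME) instead of being exported, and the PROC FREQ output shows which columns account for most of the mismatches.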