ari2495
Obsidian | Level 7

Hi, I'm using PROC COMPARE. Code below:

proc compare 
base=db1
compare=db2
printall;
title "title of compare";
run;

I got warnings in the log:

WARNING: Values of the following 24 variables compare unequal: variable names
WARNING: The data sets db1 and db2 contain unequal values.

I manually checked some of the values and they do not appear to be unequal, so I suppose there is a problem. I did find one issue with the variable names: some were lowercase and others uppercase, so I converted them all to uppercase, but that did not help; the log still says some variables are unequal. I also checked the column order, and it is identical in both data sets. What else could I check to solve the problem? Thanks a lot.

7 REPLIES
ballardw
Super User

How did you manually check the values? Note that checking by eye without considering the FORMAT involved may miss very small differences, because the displayed number of decimals is limited or rounded and the values appear the same.
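One way to rule out the display format is to print a few rows with a wide numeric format. This is only a sketch: it assumes the two data sets are named db1 and db2 as in the question, and the row limit of 10 is arbitrary.

```sas
/* Print the first rows of each data set with a wide numeric format,
   overriding whatever display format is stored on the variables. */
proc print data=db1 (obs=10);
  format _numeric_ best32.;   /* applies to every numeric variable */
run;

proc print data=db2 (obs=10);
  format _numeric_ best32.;
run;
```

If two values that looked identical now differ in the trailing digits, the mismatch is real and simply hidden by the usual format.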

 

Did you sort the data sets by the same variables to ensure the order is the same? That is a common issue: unless you provide an ID statement to align observations on the values of those variables, the comparison is record by record.

 

What does the summary section of the output, the part that describes the number of variables common to both data sets, show?

ari2495
Obsidian | Level 7

After running PROC COMPARE, I identified the variables reported as having unequal values. I then joined the two data sets with a left join and used a WHERE clause to show the cases where variable 1 is not equal to variable 2, and I found that these values were the same.

Yes, I sorted the two data sets by all variables. I am thinking I could also try formatting the values, because the summary of the compare output showed that the maximum difference was 4.191E-09.
Reeza
Super User

Are these numeric variables? If so, use the FUZZ= option.

 

proc compare 
base=db1
compare=db2
printall fuzz=0.001;
title "title of compare";
run;

For variable names, can you post an example of the output where the difference occurs?

ari2495
Obsidian | Level 7
Thank you, but the FUZZ= option did not work either. For the variable names, the difference was simply uppercase vs. lowercase letters, and I solved it by converting all variable names to uppercase.
Reeza
Super User
If the differences are as small as you indicate, FUZZ= should work, but look at the CRITERION= and METHOD= options in PROC COMPARE as well.
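A minimal sketch of the CRITERION=/METHOD= approach follows. As far as I understand the documentation, FUZZ= only changes how small values and differences are printed in the report, not whether two values are judged equal, which may be why it made no difference here. The tolerance of 1E-8 is an assumption, chosen only because it is slightly larger than the reported maximum difference of 4.191E-09; pick a value that makes sense for your data.

```sas
/* Treat two numeric values as equal when their absolute difference
   does not exceed the criterion. The 1E-8 tolerance is an assumed
   value, not a recommendation. */
proc compare
    base=db1
    compare=db2
    method=absolute
    criterion=1e-8
    printall;
  title "title of compare";
run;
```

METHOD=ABSOLUTE is used here because the reported difference is an absolute one; METHOD=RELATIVE or METHOD=PERCENT may suit variables that span very different magnitudes.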

ari2495
Obsidian | Level 7
Actually, I've just tried again and unfortunately the FUZZ= option did not work. I'll try the CRITERION= and METHOD= options, thanks.
SASKiwi
PROC Star

You don't have an ID statement in your PROC COMPARE, so SAS is comparing row-for-row (1 with 1, 2 with 2 etc). If your two datasets are not in the same order, then SAS will report unequal values. 
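A sketch of a comparison matched on a key rather than on row position is shown below. The variable key_id is hypothetical: substitute whatever variable (or combination of variables) uniquely identifies an observation in both data sets.

```sas
/* key_id is a hypothetical key variable; replace it with your own. */
proc sort data=db1; by key_id; run;
proc sort data=db2; by key_id; run;

/* With an ID statement, observations are matched on key_id rather than
   on their position in each data set. LISTALL also reports observations
   and variables that appear in only one of the two data sets. */
proc compare base=db1 compare=db2 listall;
  id key_id;
run;
```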
