Hi,
I have two data files that contain the same set of variables, but the variables are named differently in each file. Each person should appear in both files and should (but will not always) have identical values for each variable:
File 1:
ID_1 var1 var2
a 1 d
b 2 e
c 3 f
d 4 g
e 5 i
File 2:
person_ID firstvar secondvar
a 1 d
b 2 e
c 3 f
d 4 h
e 6 i
What I need to do is:
1. Look across the values of each variable for each person and make sure that the value of the var on File1 matches the value of the var on File2 for that person.
2. Produce output which gives a list of IDs:
a) That have at least one set of vars which failed to match
b) For those IDs, what were the variables that did not match, eg:
ID non-matching vars
d var2, secondvar
e var1, firstvar
If an ID has multiple sets of non-matching variables, my preference would be for a new row for each mismatch for a given ID.
What is the most efficient way to approach this? Obviously I know PROC COMPARE is built for the comparison part, but the output isn't exactly ideal for what I'm trying to produce.
Hello,
Why can't you use Proc Compare for this?
Many thanks,
Kriss
Perhaps you could also use PROC SQL with a full join, so that the non-matching values appear on separate rows...
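To make the full join idea concrete, here is a minimal sketch using the sample data from the question. The dataset names file1, file2 and mismatches, the output column name nonmatching_vars, and its length of 40 are my own assumptions, not anything from the original post. It writes one query per variable pair and stacks them with UNION ALL, so an ID with several mismatches gets one row per mismatch.

```sas
/* Sample data from the question. Dataset names are assumed. */
data file1;
  input ID_1 $ var1 var2 $;
  datalines;
a 1 d
b 2 e
c 3 f
d 4 g
e 5 i
;
run;

data file2;
  input person_ID $ firstvar secondvar $;
  datalines;
a 1 d
b 2 e
c 3 f
d 4 h
e 6 i
;
run;

proc sql;
  create table mismatches as
  /* Rows where the first variable pair disagrees */
  select coalesce(a.ID_1, b.person_ID) as ID,
         'var1, firstvar' as nonmatching_vars length=40
  from file1 as a
       full join file2 as b
       on a.ID_1 = b.person_ID
  where a.var1 ne b.firstvar
  union all
  /* Rows where the second variable pair disagrees */
  select coalesce(a.ID_1, b.person_ID) as ID,
         'var2, secondvar' as nonmatching_vars length=40
  from file1 as a
       full join file2 as b
       on a.ID_1 = b.person_ID
  where a.var2 ne b.secondvar;
quit;

/* One row per ID and mismatched pair, in ID order */
proc sort data=mismatches;
  by ID nonmatching_vars;
run;

proc print data=mismatches noobs;
run;
```

With the sample data this should list d with "var2, secondvar" and e with "var1, firstvar". Note that because of the full join, an ID present in only one file is reported as a mismatch on every pair where the other file's value is non-missing, which may or may not be what you want. With many variable pairs, writing one SELECT per pair gets tedious, so you would probably want to generate the queries with a macro loop, or go back to PROC COMPARE with an OUT= dataset and reshape that.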