09-08-2016 04:55 PM
I have to merge two datasets on one variable (test). The values in the two datasets mean the same thing but are not spelled identically, e.g. EXCL01 in one dataset corresponds to EX1 in the other.
Please help. Thanks
ID  test     VALUE
1   EXCL01   A
2   EXCL11   C
3   EXCL07   D
4   INCL01   E
5   INCL03A  J
6   INCL03B  H
7   INCL22   H
09-08-2016 05:03 PM
The term for this is fuzzy matching, but you have to have some rules. Programs basically implement rules.
So how do you want to match? What's a close enough match? Does it matter how long the strings are?
Look at the COMPGED and SPEDIS functions and the sounds-like operator (=*) for matching strings and measuring the distance between them.
If you search for "fuzzy matching" at lexjansen.com you'll find a lot of different approaches.
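As a rough illustration of the distance-based route, something like the sketch below lists candidate pairs scored with COMPGED. The dataset names DATA1 and DATA2 and the variable TEST are taken from this thread. The cutoff in MAX_GED is a made-up starting value, not a recommendation, and the output names CANDIDATES, TEST1, TEST2 and GED are placeholders. It compares every row of one dataset with every row of the other, so it only suits small tables.

```sas
/* Sketch only: MAX_GED is a hypothetical cutoff and needs tuning on real data. */
%let max_ged = 300;

proc sql;
  /* Cross join every TEST value in DATA1 with every TEST value in DATA2 */
  create table candidates as
  select a.test as test1,
         b.test as test2,
         /* generalized edit distance; STRIP removes padding blanks first */
         compged(strip(a.test), strip(b.test)) as ged
  from data1 as a, data2 as b
  where calculated ged <= &max_ged
  order by test1, ged;   /* closest candidates first within each TEST1 */
quit;
```

You would still need to review CANDIDATES by eye and decide which pairs count as real matches. That decision is the "rule" part.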
09-08-2016 05:11 PM
It looks relatively straightforward to add a third variable to DATA1, an abbreviated version of TEST that would match the values found in TEST within DATA2. Then sort and merge by that new variable.
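A minimal sketch of that approach is below. It assumes the abbreviation rule generalises from the single example given (EXCL01=EX1): keep the first two letters, drop "CL", drop leading zeros from the number, and keep any trailing letter, so INCL03A would become IN3A. That rule is a guess and should be checked against the real values in DATA2. The names TEST_KEY, DATA1_KEY, DATA2_KEY and MERGED are invented for the example.

```sas
/* Sketch only: the regular expression encodes an ASSUMED rule (EXCL01 -> EX1). */
data data1_key;
  set data1;
  length test_key $8;
  /* Values that do not fit the pattern are returned unchanged by PRXCHANGE */
  test_key = prxchange('s/^(EX|IN)CL0*(\d+[A-Z]?)\s*$/$1$2/', 1, test);
run;

proc sort data=data1_key;
  by test_key;
run;

/* Rename TEST in DATA2 so both datasets share the same BY variable */
proc sort data=data2(rename=(test=test_key)) out=data2_key;
  by test_key;
run;

data merged;
  merge data1_key(in=in1) data2_key(in=in2);
  by test_key;
  if in1 and in2;   /* keep only keys found in both datasets */
run;
```

If the two datasets share other variable names, the values from DATA2_KEY will overwrite those from DATA1_KEY in the merge, so rename or drop them first. SAS may also warn about differing lengths for TEST_KEY if TEST in DATA2 is not defined as $8.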