Hi Patrick,
Just wanted to thank you again for that code - it works on my test sample! 🙂
It took me a while to understand what's going on in there, but I think I get it now. I made a few modifications. The possible-duplicates table now keeps scores between .00001 and 20 rather than between 0 and 20, since a 0 would basically be an exact title match later on in SPEDIS. We are assuming that records matching exactly on Volume, Issue, PubYear, etc. are one and the same, so I give them a score of .00001 if everything else is the same but the ArticleTitle. I only want to examine the pairs that differ by a spelling variation in ArticleTitle, so I added a piece of code before that part to assign a score of 0 if ArticleTitle = _ArticleTitle. SPEDIS takes care of the remainder.
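For anyone following along outside SAS, here is a rough Python sketch of the filtering logic described above. It only models two pieces: the 0 score for identical titles and the .00001 to 20 window. The names `stand_in_score`, `title_score` and `possible_duplicates` are made up for illustration, and the scorer is a difflib-based placeholder, not SPEDIS, so its numbers will not match SAS output.

```python
from difflib import SequenceMatcher

LOWER, UPPER = 0.00001, 20  # score window quoted in the post


def stand_in_score(a, b):
    # NOT SPEDIS: a hypothetical placeholder where 0 means identical
    # and larger values mean less similar.
    return 100 * (1 - SequenceMatcher(None, a, b).ratio())


def title_score(title, other_title, fuzzy_score=stand_in_score):
    # Identical titles get 0 up front, so they fall below LOWER
    # and never reach the possible-duplicates list.
    if title == other_title:
        return 0
    return fuzzy_score(title, other_title)


def possible_duplicates(pairs, fuzzy_score=stand_in_score):
    # Keep only pairs whose score lands inside the window.
    kept = []
    for title, other_title in pairs:
        score = title_score(title, other_title, fuzzy_score)
        if LOWER <= score <= UPPER:
            kept.append((title, other_title, score))
    return kept
```

Passing a different `fuzzy_score` function swaps the scorer without touching the window logic.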
An example of the nice results from your code is the following duplicate detection. Note the minor variations in article spelling but the very close match on the other variables; the combination of these in the concatenation within your SPEDIS function resulted in a score of 4. Here's a very small partial print of what I am seeing:
In order: ArticleTitle, PubYear, Volume, Issue
HOW "DEUTSCH" A REQUIEM? ABSOLUTE MUSIC, UNIVERSALITY, AND THE RECEPTION OF BRAHMS'S "EIN DEUTSCHES REQUIEM," OP. 45 1998 1 3
HOW DEUTSCH A REQUIEM? ABSOLUTE MUSIC, UNIVERSALITY, AND THE RECEPTION OF BRAHMS'S EIN DEUTSCHES REQUIEM, OP. 45 1998 1 3-19
There are some instances of false positives due to the way the article title comes in:
FACULTY POSITIONS AS A CAREER CHOICE FOR PROFESSIONALS--PART II 1991 4 329
FACULTY POSITIONS AS A CAREER CHOICE FOR PROFESSIONALS--PART II 1991 4 329
FACULTY POSITIONS AS A CAREER CHOICE FOR PROFESSIONALS--PART I 1991 3 202
FACULTY POSITIONS AS A CAREER CHOICE FOR PROFESSIONALS--PART I 1991 3 202
I'm thinking of creating a last-word variable for the article title to get around this situation, and perhaps over-penalizing non-matches on it to alleviate the issue.
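A rough sketch of that last-word idea, in Python just to show the logic. The function names and the penalty of 50 are assumptions for illustration, not anything from Patrick's code. In SAS itself, I believe `scan(ArticleTitle, -1)` would pull the last word, though the delimiters would need checking against titles like the "--PART II" ones.

```python
def last_word(title):
    # Treat the "--" seen in the sample titles as a word break,
    # then take the final whitespace-separated token.
    words = title.replace("--", " ").split()
    return words[-1] if words else ""


def penalized_score(score, title, other_title, penalty=50):
    # Hypothetical rule: add a large penalty when the last words differ,
    # so "PART I" vs "PART II" is pushed out of the duplicate window.
    if last_word(title) != last_word(other_title):
        return score + penalty
    return score
```

With a window that tops out at 20, any penalty above 20 would be enough to drop a pair whose last words disagree; the exact value is a tuning choice.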
Still have a long way to go but thank you for getting me started! :)
Message was edited by: Hans