I'm working with a prenatal care dataset from a developing country. The data were abstracted from paper charts, and visits were recorded one at a time, so there's no "patient file" per se. In this country, it's acceptable for a name to be spelled slightly differently from visit to visit, as long as the spelling is phonetically close. That leaves me with the task of linking these prenatal care records longitudinally without a consistent identifier. The identifiers I do have are first name, middle name, last name, village, age (in years, no birthdate), last menstrual period date (LMP), expected delivery date (EDD), and parity. The problem is that no single identifier is consistently right. How do I sort these patients out and assign each one a subject ID? Thanks so much for your advice.
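
To make the question concrete, here is a minimal sketch of one possible framing: score each pair of visit records on how many identifiers roughly agree, then group records whose score clears a cutoff and number the groups. Everything in it is an assumption for illustration: the field names (`first`, `last`, `village`, `age`, `lmp`), the weights, the 14-day LMP tolerance, the 3.5 threshold, and the sample records are all made up, and plain string similarity from Python's `difflib` stands in for a real phonetic comparison. Middle name, EDD, and parity are left out to keep it short, and because it leans on LMP it would only link visits within a single pregnancy.

```python
from datetime import date
from difflib import SequenceMatcher
from itertools import combinations


def name_similarity(a, b):
    """Return a 0..1 string similarity for two names, ignoring case and outer spaces."""
    return SequenceMatcher(None, a.lower().strip(), b.lower().strip()).ratio()


def match_score(r1, r2, lmp_tolerance_days=14):
    """Add up agreement across identifiers so that no single field is decisive.

    The weights and tolerances are illustrative guesses, not validated values.
    """
    score = name_similarity(r1["first"], r2["first"])
    score += name_similarity(r1["last"], r2["last"])
    if r1["village"].lower() == r2["village"].lower():
        score += 1.0
    if abs((r1["lmp"] - r2["lmp"]).days) <= lmp_tolerance_days:
        score += 1.0
    if abs(r1["age"] - r2["age"]) <= 1:
        score += 0.5
    return score


def assign_subject_ids(records, threshold=3.5):
    """Link every pair scoring at or above the threshold, then number the groups."""
    parent = list(range(len(records)))

    def find(i):
        # Follow parent links to the group's representative (with path halving).
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Compare all pairs; fine for a sketch, too slow for a large dataset.
    for i, j in combinations(range(len(records)), 2):
        if match_score(records[i], records[j]) >= threshold:
            parent[find(j)] = find(i)

    # Number groups 1, 2, 3, ... in order of first appearance.
    ids = {}
    return [ids.setdefault(find(i), len(ids) + 1) for i in range(len(records))]


# Made-up visit records, purely for illustration.
visits = [
    {"first": "Amina", "last": "Okafor", "village": "Kijiji", "age": 24, "lmp": date(2015, 3, 1)},
    {"first": "Aminah", "last": "Okafo", "village": "Kijiji", "age": 24, "lmp": date(2015, 3, 4)},
    {"first": "Grace", "last": "Mwangi", "village": "Kijiji", "age": 31, "lmp": date(2015, 1, 10)},
]
subject_ids = assign_subject_ids(visits)
print(subject_ids)  # [1, 1, 2]
```

In this toy example the first two visits are grouped despite the spelling differences, and the third gets its own ID. On real data the all-pairs comparison would need to be restricted (for example, to records from the same village), and borderline pairs would need manual review.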