Hello,

I have a huge dataset (>3 million records) with approximately 600,000 individuals (many have more than one visit).

First, I would like to be able to generate code to fill in gender and race/ethnicity from one record of an individual to their other records if some records are missing this information. For example, say I have the following records:

ID   Gender   Race/Ethnicity
1    1        3
1    .        3
1    1        .
2    .        1
2    0        1
2    0        .

I would like the missing cells to be replaced with information from the other records that are available for that same person.

Then, I would like to randomly select one visit that contains complete information (age, gender, race/ethnicity) for each person if they have more than one visit; however, if complete information is unavailable, I would still like to select at least one record for each person.

I hope this makes sense. Thank you in advance for your thoughts and help on this issue.

Kindly,
ARR
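The two steps described above (fill missing gender and race/ethnicity within a person, then randomly keep one visit per person, preferring complete ones) can be sketched roughly as follows. This is only an illustration in plain Python, not a definitive implementation: the field names `id`, `age`, `gender`, and `race_eth`, the helper names, and the use of `None` for a missing value are all assumptions, and the age values are invented because the example in the post does not list ages.

```python
import random
from collections import defaultdict

# Hypothetical field names; None stands in for a missing value ("." in the post).
FILL_FIELDS = ("gender", "race_eth")
COMPLETE_FIELDS = ("age", "gender", "race_eth")


def fill_within_person(records):
    """Fill missing gender/race values from the same person's other records."""
    # First non-missing value seen for each (id, field).
    known = defaultdict(dict)
    for rec in records:
        for field in FILL_FIELDS:
            if rec[field] is not None:
                known[rec["id"]].setdefault(field, rec[field])

    filled = []
    for rec in records:
        new_rec = dict(rec)  # copy so the input is left unchanged
        for field in FILL_FIELDS:
            if new_rec[field] is None:
                # Stays None if no record for this person has the value.
                new_rec[field] = known[rec["id"]].get(field)
        filled.append(new_rec)
    return filled


def pick_one_visit(records, seed=None):
    """Pick one record per person at random, preferring complete records."""
    rng = random.Random(seed)
    by_id = defaultdict(list)
    for rec in records:
        by_id[rec["id"]].append(rec)

    chosen = []
    for visits in by_id.values():
        complete = [
            v for v in visits
            if all(v[f] is not None for f in COMPLETE_FIELDS)
        ]
        # Fall back to any visit when the person has no complete record.
        chosen.append(rng.choice(complete or visits))
    return chosen


# Gender and race/ethnicity follow the example in the post; ages are made up.
visits = [
    {"id": 1, "age": 40, "gender": 1, "race_eth": 3},
    {"id": 1, "age": 41, "gender": None, "race_eth": 3},
    {"id": 1, "age": None, "gender": 1, "race_eth": None},
    {"id": 2, "age": None, "gender": None, "race_eth": 1},
    {"id": 2, "age": None, "gender": 0, "race_eth": 1},
    {"id": 2, "age": None, "gender": 0, "race_eth": None},
]

filled = fill_within_person(visits)
selected = pick_one_visit(filled, seed=0)
```

In this sketch, a person whose records disagree on gender or race/ethnicity keeps the recorded values as they are; only the missing cells receive the first non-missing value found for that person. Person 2 has no complete visit in the made-up data, so one of their visits is still selected at random.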