I found a lot of posts about assigning group IDs, but all of them seemed a little different from what I'm looking for. I have a dataset that includes individuals and their family members - some of these family members are included in the same study (and have their own unique study ID and information about their relationship) as well as some information on some other family members not included in the study (and do not have a unique ID but have their relationship to those in the study). I'm wanting to assign a group ID for each family. The have data looks like this - where relativeID indicates the relatives ID if they are included in the study, relativerelationship is their relationship to the ID case, and IDs within a family aren't necessarily sequential. ID RelativeID RelativeRelationship 10001 . Grandfather 10001 10010 Mother 10001 12200 Brother 10010 . Father 10010 10001 Son 10010 12200 Son 11100 33000 Mother 12200 . Grandfather 12200 10001 Brother 12200 10010 Mother 33000 11100 Son The want data would look like this. I'm wanting to assign an ID for each unique family group, including for those who do not have a relativeID: ID RelativeID RelativeRelationship FamilyID 10001 . Grandfather 111 10001 10010 Mother 111 10001 12200 Brother 111 10010 . Father 111 10010 10001 Son 111 10010 12200 Son 111 11100 33000 Mother 222 12200 . Grandfather 111 12200 10001 Brother 111 12200 10010 Mother 111 33000 11100 Son 222 Maybe I'm overthinking things, but it seems like the family members that don't have their own unique RelativeID kind of complicates things. I'm struggling to think of an efficient way to do this. My SQL coding is pretty rusty, but my brain is saying this is probably an issue best solved with SQL, but I could be wrong. Any insight on this would be appreciated.
... View more