I have data that is related via a many to many relationship and I want to find all items that are related to each other and give them a unique id. My example data is listed below: PolicyId PersonId 1 1000 1 1001 2 1001 3 1000 4 1002 2 1003 5 1003 4 1004 My desired output would either be PolicyId PersonId GroupId 1 1000 1 1 1001 1 2 1001 1 3 1000 1 4 1002 2 2 1003 1 5 1003 1 4 1004 2 or PolicyID GroupId 1 1 2 1 3 1 4 2 5 1 PersonId GroupId 1000 1 1001 1 1002 2 1003 1 1004 2 Do you have any suggestions on how to do this efficiently? I have managed to code this using a Macro, which for each policy, finds all related persons, then for each person, recursively calls the macro again for all related policies (that have not been assigned a GroupId already). This works ok for a small number of values, but I have 1.9m observations in my dataset, so it performs very slowly. There a some performance gains by calculating the cases where they is only one-to-many relationship between Policy and Person in a separate step, i.e. Policy 4 above would not be passed to the macro. This means the number I'm checking reduces to about 174K.
... View more