Thanks @Shmuel for your reply. That particular code does not work well in this setting since it just looks for industries that have at least two IDs whereas now I am trying to analyze whether a manager owns an ID in a given industry and the other manager holds a different ID from the same industry. In the current setting, I am trying to determine: Within a given group, do both managers have different IDs in a given industry? If yes, then Control=1 for those identified rows. However, the tricky part is that if a manager has an identical ID as the other manager but has no other IDs within the same industry, then Control=0 for all of the IDs of that industry (i.e., Control=0 for all of these identified rows that are in the same industry). For instance, Manager 1 may have ID 1234 and Manager 2 may have 1234 and 2341 where all of these IDs are in the same industry, then Control=0 for all of these rows. But if Manager 1 had 1234 and Manager 2 had only 2341, then Control=1. Or, if Manager 1 would have had 1234 and 2341, whereas Manager 2 had 4251 and 4159 (where all of these IDs are in the same industry), then Control=1. The simple case is that Control=1 whenever Manager 1 has completly different IDs compared to Manager 2 where both Managers are in the same Group. There is only one exception though making the case a little bit tricky, which is if there are at least two identical IDs (e.g., she owns 2351, 8451, and 9999) owned by a given Manager that the other Manager also owns in addition to other IDs (e.g., 2351, 8451, and 8521), then all rows get a Control=. This is the next case. Within a given group, does one of the managers have at least two IDs which are in the same industry that the other manager also owns (i.e., do both managers have at least two identical IDs that are in the same Industry)? If yes, then Control=1 for those identified rows. Else, Control=0. Here is what the correct output looks like, I added comments about the cases described above next to the Controls: DATA want; INPUT Mgrno Group ID Industry Control; DATALINES; 9999 1 00462610 1842 0 9999 1 01737210 1842 0 9999 1 12690310 2832 0 8888 1 12690310 2832 0 8888 1 12690311 2832 0 8888 1 37158620 2124 0 9999 1 23298321 4238 1 9999 1 53228424 4238 1 8888 1 53228424 4238 1 8888 1 23298321 4238 1 9999 1 12290311 1233 1 8888 1 01737211 1233 1 8888 1 99998881 1233 1 7777 2 12690310 2832 1 3333 2 32690311 2832 1 7777 2 37158620 2124 0 7777 2 38138222 3321 1 3333 2 44134422 3321 1 7777 2 53228424 4238 1 3333 2 77233421 4238 1 3333 2 12290311 1233 0 3333 2 93234320 1333 0 ; RUN; UPDATED: Please see below a screenshot with the comments. When I past the code here, it becomes messy and not readable. Here is the sample which is the same one above except that I removed the control and pair columns: DATA have;
INPUT Mgrno Group ID Industry ;
DATALINES;
9999 1 00462610 1842
9999 1 01737210 1842
9999 1 12690310 2832
8888 1 12690310 2832
8888 1 12690311 2832
8888 1 37158620 2124
9999 1 23298321 4238
9999 1 53228424 4238
8888 1 53228424 4238
8888 1 23298321 4238
9999 1 12290311 1233
8888 1 01737211 1233
8888 1 99998881 1233
7777 2 12690310 2832
3333 2 32690311 2832
7777 2 37158620 2124
7777 2 38138222 3321
3333 2 44134422 3321
7777 2 53228424 4238
3333 2 77233421 4238
3333 2 12290311 1233
3333 2 93234320 1333
;
RUN;
... View more