I used the following code to create a test data set:

data test0;
input ID1 ID2 date score;
datalines;
1 1.1 2004 8
1 1.1 2004 7
1 1.1 2004 1
2 1.2 2005 1
2 1.2 2006 1
2 1.2 2007 1
2 2.2 2005 8
2 2.2 2006 8
2 2.2 2007 8
3 3.1 2005 5
3 3.2 2005 6
3 3.3 2005 5
3 3.1 2006 5
3 3.2 2006 6
3 3.3 2006 5
3 3.1 2007 5
3 3.2 2007 6
3 3.3 2007 5
4 4.1 2005 8
4 4.1 2006 8
4 4.1 2007 8
5 5.1 2005 5
5 5.2 2006 6
5 5.3 2007 5
;

I want to test for the presence of duplicate observations in the data set. The rule is that each value of ID1 (the primary indicator) should appear only once for each date. For example, ID1 = 2 and ID1 = 3 have duplicate observations because the same value of ID1 is repeated for a particular value of date. However, ID1 = 5 does not have a duplicate observation, although its secondary indicator (ID2) changes its value across dates.

I want to create an indicator variable that takes the value 1 if a particular observation is a duplicate observation. Any help on this matter is highly appreciated.
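One possible way to build such an indicator, as a minimal sketch rather than a definitive answer: count the rows in each ID1-date group with PROC SQL and flag every row whose group has more than one member. The output table name test1 and the variable name dup_flag are made up for illustration, and the sketch assumes that "duplicate" means any row that shares its ID1 and date with at least one other row.

```sas
/* Sketch: flag every row whose (ID1, date) pair occurs more than once. */
/* test1 and dup_flag are hypothetical names chosen for this example.   */
proc sql;
    create table test1 as
    select *,
           (count(*) > 1) as dup_flag  /* 1 if the group has 2+ rows, else 0 */
    from test0
    group by ID1, date                 /* one group per ID1-date pair */
    order by ID1, date, ID2;
quit;
```

Because the query selects detail columns alongside a summary function, SAS remerges the group count back onto every row and writes a note about this to the log. With the test data above, all rows for ID1 = 2 and ID1 = 3 get dup_flag = 1, while the rows for ID1 = 4 and ID1 = 5 get 0. Under the rule as stated, the three ID1 = 1 rows for 2004 are flagged as well. If only the second and later occurrences should be marked, an alternative would be to sort by ID1 and date and use first.date in a DATA step with a BY statement.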