## Assign ID based on set criteria

I need help assigning a new ID (proposed ID). I'm trying to get the final data below, based on this rule:
if two Admn_no records have the same sch_ID and name, or the same sch_ID and Birth_Day, or the same name and Birth_Day, assign the same proposed ID throughout.

The assumption is that they are the same person. If none of those criteria is met, give the record a new proposed ID, since it is likely a new person.

SAMPLE data

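| Admn_no | sch_ID | name | Birth_Day |
|---|---|---|---|
| 1 | 6116 | CALVIN | 03/10/1970 |
| 2 | 6176 | CALVIN | 03/10/1970 |
| 3 | 6176 | CALVIN | 10/03/1970 |
| 4 | 0176 | CALVIN | 03/10/1970 |
| 5 | 6176 | | 10/03/1970 |
| 6 | 6176 | MALVIN | 03/10/1970 |
| 7 | 6176 | | 03/10/1970 |
| 8 | | CALVIN | 03/10/1970 |
| 9 | 6116 | | 03/10/1970 |
| 10 | 12345 | JOHN | 01/02/1978 |
| 11 | 6543 | TOM | 03/06/1977 |
| 12 | 2348 | CALVIN | |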

Final

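| Admn_no | sch_ID | name | Birth_Day | Proposed ID |
|---|---|---|---|---|
| 1 | 6116 | CALVIN | 03/10/1970 | 1 |
| 2 | 6176 | CALVIN | 03/10/1970 | 1 |
| 3 | 6176 | CALVIN | 10/03/1970 | 1 |
| 4 | 0176 | CALVIN | 03/10/1970 | 1 |
| 5 | 6176 | | 10/03/1970 | 1 |
| 6 | 6176 | MALVIN | 03/10/1970 | 1 |
| 7 | 6176 | | 03/10/1970 | 1 |
| 8 | | CALVIN | 03/10/1970 | 1 |
| 9 | 6116 | | 03/10/1970 | 1 |
| 10 | 12345 | JOHN | 01/02/1978 | 2 |
| 11 | 6543 | TOM | 03/06/1977 | 3 |
| 12 | 2348 | CALVIN | | 4 |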
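The rule above is transitive: rows 3 and 1 share neither pair directly, but row 3 links to row 2 and row 2 links to row 1. One way to sketch that chaining is a union-find over the rows, shown here in Python rather than SAS purely as an illustration. The function name `assign_proposed_ids` and the tuple layout are hypothetical, blank values are assumed to be missing and never count as a match, and `sch_ID` is kept as text so that `0176` keeps its leading zero.

```python
from itertools import combinations

# Sample rows from the post: (Admn_no, sch_ID, name, Birth_Day).
# An empty string stands for a missing value (an assumption of this sketch).
rows = [
    (1, "6116", "CALVIN", "03/10/1970"),
    (2, "6176", "CALVIN", "03/10/1970"),
    (3, "6176", "CALVIN", "10/03/1970"),
    (4, "0176", "CALVIN", "03/10/1970"),
    (5, "6176", "", "10/03/1970"),
    (6, "6176", "MALVIN", "03/10/1970"),
    (7, "6176", "", "03/10/1970"),
    (8, "", "CALVIN", "03/10/1970"),
    (9, "6116", "", "03/10/1970"),
    (10, "12345", "JOHN", "01/02/1978"),
    (11, "6543", "TOM", "03/06/1977"),
    (12, "2348", "CALVIN", ""),
]


def assign_proposed_ids(rows):
    """Return one proposed ID per row, numbered by first appearance."""
    parent = list(range(len(rows)))

    def find(i):
        # Follow parent links to the group's root, shortening the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    def union(i, j):
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[max(ri, rj)] = min(ri, rj)

    # For every pair of fields, remember the first row that had that pair of values.
    first_seen = {}
    for idx, (_, sch, name, bday) in enumerate(rows):
        fields = {"sch_ID": sch, "name": name, "Birth_Day": bday}
        for a, b in combinations(fields, 2):
            if fields[a] and fields[b]:  # skip pairs with a missing value
                key = (a, b, fields[a], fields[b])
                if key in first_seen:
                    union(idx, first_seen[key])
                else:
                    first_seen[key] = idx

    # Number the groups in the order their first row appears.
    ids = {}
    result = []
    for idx in range(len(rows)):
        root = find(idx)
        if root not in ids:
            ids[root] = len(ids) + 1
        result.append(ids[root])
    return result


print(assign_proposed_ids(rows))  # [1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 3, 4]
```

On the sample data this reproduces the Final column. Row 12 gets its own ID because, with Birth_Day missing, it shares only the name with the other CALVIN rows.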

## Re: Assign ID based on set criteria

Since the odds of two students in a class having the same birthdate are extremely high (see, e.g., Math Guy: The Birthday Problem, NPR), using that criterion at the school level doesn't appear to be a good way to go.

If you are only trying to find transposition errors, then I would use one of the similarity functions like SPEDIS (SAS(R) 9.2 Language Reference: Dictionary, Fourth Edition) on all three variables and set your criterion based on the values you get, using all three variables together in a single combination rather than pair by pair.
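To give a rough feel for what "all three variables in a single combination" could look like, here is a small Python sketch that sums an edit distance over the three fields. It uses plain Levenshtein distance as a stand-in: SPEDIS scores edits differently, so the numbers will not match what SAS returns, and the helper names `levenshtein` and `record_distance` are hypothetical. The cutoff for "same person" is left open, as suggested above.

```python
def levenshtein(a, b):
    """Number of single-character inserts, deletes, or substitutions to turn a into b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cur.append(min(
                prev[j] + 1,               # delete ca
                cur[j - 1] + 1,            # insert cb
                prev[j - 1] + (ca != cb),  # substitute, free if the characters match
            ))
        prev = cur
    return prev[-1]


def record_distance(r1, r2):
    """Sum the per-field distances of two (sch_ID, name, Birth_Day) records."""
    return sum(levenshtein(x, y) for x, y in zip(r1, r2))


# Rows 2 and 6 of the sample differ only in one letter of the name.
row2 = ("6176", "CALVIN", "03/10/1970")
row6 = ("6176", "MALVIN", "03/10/1970")
print(record_distance(row2, row6))  # 1
```

A small combined distance flags likely typos of the same record; where to draw the line would have to be judged from the values seen in the real data.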
