I need help assigning a new ID (Proposed_id). I'm trying to get the final data below based on the following rule:
If Adm_no records have the same Sch_ID and NAME, or the same Sch_ID and Birth_Day, or the same NAME and Birth_Day, assign the same proposed ID throughout.
The assumption is that they are the same person. If none of those criteria is met, assign a new proposed ID, since the record likely belongs to a different person.
Sample data:
Adm_no  Sch_ID  NAME    Birth_Day
1       6116    CALVIN  03/10/1970
2       6176    CALVIN  03/10/1970
3       6176    CALVIN  10/03/1970
4       0176    CALVIN  03/10/1970
5       6176            10/03/1970
6       6176    MALVIN  03/10/1970
7       6176            03/10/1970
8               CALVIN  03/10/1970
9       6116            03/10/1970
10      12345   JOHN    01/02/1978
11      6543    TOM     03/06/1977
12      2348    CALVIN
Final:
Adm_no  Sch_ID  NAME    Birth_Day   Proposed_id
1       6116    CALVIN  03/10/1970  1
2       6176    CALVIN  03/10/1970  1
3       6176    CALVIN  10/03/1970  1
4       0176    CALVIN  03/10/1970  1
5       6176            10/03/1970  1
6       6176    MALVIN  03/10/1970  1
7       6176            03/10/1970  1
8               CALVIN  03/10/1970  1
9       6116            03/10/1970  1
10      12345   JOHN    01/02/1978  2
11      6543    TOM     03/06/1977  3
12      2348    CALVIN              4
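One way to read the rule is as transitive grouping: any two records that agree on at least two of the three fields are linked, and every record reachable through such links gets the same Proposed_id. The sketch below is in Python rather than SAS and is only an illustration of that reading. The helper names (`shares_two_fields`, `assign_proposed_ids`) are made up for this example, and it assumes that a missing value never counts as a match and that Sch_ID is compared as text (so 0176 keeps its leading zero).

```python
from itertools import combinations

# Sample rows from the question: (Adm_no, Sch_ID, NAME, Birth_Day).
# An empty string marks a missing value (assumption: missing never matches).
ROWS = [
    (1, "6116", "CALVIN", "03/10/1970"),
    (2, "6176", "CALVIN", "03/10/1970"),
    (3, "6176", "CALVIN", "10/03/1970"),
    (4, "0176", "CALVIN", "03/10/1970"),
    (5, "6176", "", "10/03/1970"),
    (6, "6176", "MALVIN", "03/10/1970"),
    (7, "6176", "", "03/10/1970"),
    (8, "", "CALVIN", "03/10/1970"),
    (9, "6116", "", "03/10/1970"),
    (10, "12345", "JOHN", "01/02/1978"),
    (11, "6543", "TOM", "03/06/1977"),
    (12, "2348", "CALVIN", ""),
]


def shares_two_fields(a, b):
    """True when two records agree on at least two non-missing fields."""
    matches = sum(1 for x, y in zip(a[1:], b[1:]) if x and x == y)
    return matches >= 2


def assign_proposed_ids(rows):
    """Link matching records with union-find, then number the groups
    in order of first appearance. Returns {Adm_no: Proposed_id}."""
    parent = list(range(len(rows)))

    def find(i):
        # Follow parent links to the group root, shortening the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Pairwise comparison: simple, but quadratic in the number of rows.
    for i, j in combinations(range(len(rows)), 2):
        if shares_two_fields(rows[i], rows[j]):
            ri, rj = find(i), find(j)
            if ri != rj:
                parent[max(ri, rj)] = min(ri, rj)

    group_ids = {}
    result = {}
    for i, row in enumerate(rows):
        root = find(i)
        group_ids.setdefault(root, len(group_ids) + 1)
        result[row[0]] = group_ids[root]
    return result


PROPOSED = assign_proposed_ids(ROWS)
for adm_no, proposed_id in PROPOSED.items():
    print(adm_no, proposed_id)
```

On the sample data this reproduces the Proposed_id column above. Note that the grouping is transitive: records 3 and 6 agree directly only on Sch_ID, yet end up together because each matches record 2 on two fields.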
Since the odds of two students in a class sharing a birthday are surprisingly high (see, e.g., Math Guy: The Birthday Problem, NPR), using that criterion at the school level doesn't appear to be a good way to go.
If you are only trying to find transposition errors, I would use one of the similarity functions, such as SPEDIS (SAS(R) 9.2 Language Reference: Dictionary, Fourth Edition), on all three variables together in a single combination, and set your matching criterion based on the values you get.
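As a rough illustration of scoring all three variables as one combined value, here is a Python sketch. It uses `difflib.SequenceMatcher` from the standard library as a stand-in; this is not SPEDIS and the numbers are not comparable (SPEDIS returns an asymmetric spelling distance where lower means closer, while `ratio()` returns a similarity between 0 and 1 where higher means closer). The `combined_key` and `similarity` helpers and the `|` separator are assumptions made for this example.

```python
from difflib import SequenceMatcher


def combined_key(sch_id, name, birth_day):
    """Join the three variables into a single string to compare as a unit."""
    return "|".join([sch_id, name, birth_day])


def similarity(key_a, key_b):
    """Similarity in [0, 1]; 1.0 means the two keys are identical."""
    return SequenceMatcher(None, key_a, key_b).ratio()


# Records 2, 3 and 10 from the sample data.
rec2 = combined_key("6176", "CALVIN", "03/10/1970")
rec3 = combined_key("6176", "CALVIN", "10/03/1970")  # day and month transposed
rec10 = combined_key("12345", "JOHN", "01/02/1978")

print(round(similarity(rec2, rec3), 2))   # transposed date: scores high
print(round(similarity(rec2, rec10), 2))  # different person: scores much lower
```

The cutoff that separates "probably the same person" from "probably not" would have to be chosen by inspecting the scores on the real data, as suggested above.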