BookmarkSubscribeRSS Feed
Ayo_sas
Calcite | Level 5

I'm new to SAS and I need help with the scenario below. The goal is to produce the "Desired result" shown below. The code I have below produces the "Result" shown below.I'll prefer the use of data step as supposed to proc sql but I'm open to any solutions.  Thanks

data testB;

set testA;

by tax_id firstname dateofbirth gender zipcode lastname;

retain unique_ID 0;

if (first.tax_id or first.firstname or first.dateofbirth or first.gender or first.lastname) then ID = ID + 1;

keep tax_id firstname dateofbirth gender zipcode lastname ID;

run;

Result

tax_id firstname dateofbirth gender zipcode lastname ID

1111  John      01/01/1973  Male  99999    Doe      1

1111  John      01/01/1973  Male  99999    Does    2     

          Johns    01/01/1973  Male  99999    Doe      3

1111  John                        Male  99999      Doe      4

1111  John      01/01/1973  Male  99999    Doe       1

1111  John      01/01/1973  Male  99999     Doe      1

2222  Bill      03/01/2012    Male  22222     Smith    5

Desired Result

tax_id firstname dateofbirth gender zipcode lastname ID

1111  John      01/01/1973  Male  99999    Doe      1

1111  John      01/01/1973  Male  99999    Does    1

          John     01/01/1973  Male  99999    Doe      1

1111  John                        Male  99999    Doe       1

1111  John      01/01/1973  Male  99999    Doe      1

1111  John      01/01/1973  Male  99999    Doe      1

2222  Bill       03/01/2012   Male  22222   Smith    5

2 REPLIES 2
art297
Opal | Level 21

: Did you show your desired result correctly?  I would have thought that you would want:

Desired Result

tax_id firstname dateofbirth gender zipcode lastname ID

1111  John      01/01/1973  Male  99999    Doe      1

1111  John      01/01/1973  Male  99999    Doe      1

1111  John      01/01/1973  Male  99999    Doe      1

1111  John      01/01/1973  Male  99999    Doe      1

1111  John      01/01/1973  Male  99999    Doe      1

1111  John      01/01/1973  Male  99999    Doe      1

2222  Bill        03/01/2012  Male  22222    Smith    5

Ayo_sas
Calcite | Level 5

Thanks for the reply. The ideal desired result is shown below (it's what Art has suggested) but I would like to preserve the originality of the existing/original record. The ID is the newly created variable that is supposed to uniquely identify a person

Desired Result

tax_id firstname dateofbirth gender zipcode lastname ID

1111  John      01/01/1973  Male  99999    Doe      1

1111  John      01/01/1973  Male  99999    Doe      1

1111  John      01/01/1973  Male  99999    Doe      1

1111  John      01/01/1973  Male  99999    Doe      1

1111  John      01/01/1973  Male  99999    Doe      1

1111  John      01/01/1973  Male  99999    Doe      1

2222  Bill        03/01/2012  Male  22222    Smith    5

I would like to create an code that will, before assigning a new ID, check the TAX_ID, firstname,dateofbirth,gender, zipcode,lastname based on the following conditions

if TAX_ID (of incoming record)=Tax_ID (of existing record)

and (firstname (of incoming record)=firstname (of existing record)  OR partial match firstname(of incoming record)=firstname (of existing record))

and  dateofbirth (of incoming record)= dateofbirth (of existing record)

and gender (of incoming record)= gender (of existing record)

and lastname (of incoming record)=lastname (of existing record)  OR partial match lastname(of incoming record)=lastname (of existing record)) THEN assign the existing ID; ELSE assign a NEW ID

Note to Art: I'm open to a solution that assumes that the desired output is :

Desired Result

tax_id firstname dateofbirth gender zipcode lastname ID

1111  John      01/01/1973  Male  99999    Doe      1

1111  John      01/01/1973  Male  99999    Doe      1

1111  John      01/01/1973  Male  99999    Doe      1

1111  John      01/01/1973  Male  99999    Doe      1

1111  John      01/01/1973  Male  99999    Doe      1

1111  John      01/01/1973  Male  99999    Doe      1

2222  Bill        03/01/2012  Male  22222    Smith    5

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 2 replies
  • 718 views
  • 0 likes
  • 2 in conversation