BookmarkSubscribeRSS Feed
Ayo_sas
Calcite | Level 5

I'm new to SAS and I need help with the scenario below. The goal is to produce the "Desired result" shown below. The code I have below produces the "Result" shown below.I'll prefer the use of data step as supposed to proc sql but I'm open to any solutions.  Thanks

data testB;

set testA;

by tax_id firstname dateofbirth gender zipcode lastname;

retain unique_ID 0;

if (first.tax_id or first.firstname or first.dateofbirth or first.gender or first.lastname) then ID = ID + 1;

keep tax_id firstname dateofbirth gender zipcode lastname ID;

run;

Result

tax_id firstname dateofbirth gender zipcode lastname ID

1111  John      01/01/1973  Male  99999    Doe      1

1111  John      01/01/1973  Male  99999    Does    2     

          Johns    01/01/1973  Male  99999    Doe      3

1111  John                        Male  99999      Doe      4

1111  John      01/01/1973  Male  99999    Doe       1

1111  John      01/01/1973  Male  99999     Doe      1

2222  Bill      03/01/2012    Male  22222     Smith    5

Desired Result

tax_id firstname dateofbirth gender zipcode lastname ID

1111  John      01/01/1973  Male  99999    Doe      1

1111  John      01/01/1973  Male  99999    Does    1

          John     01/01/1973  Male  99999    Doe      1

1111  John                        Male  99999    Doe       1

1111  John      01/01/1973  Male  99999    Doe      1

1111  John      01/01/1973  Male  99999    Doe      1

2222  Bill       03/01/2012   Male  22222   Smith    5

2 REPLIES 2
art297
Opal | Level 21

: Did you show your desired result correctly?  I would have thought that you would want:

Desired Result

tax_id firstname dateofbirth gender zipcode lastname ID

1111  John      01/01/1973  Male  99999    Doe      1

1111  John      01/01/1973  Male  99999    Doe      1

1111  John      01/01/1973  Male  99999    Doe      1

1111  John      01/01/1973  Male  99999    Doe      1

1111  John      01/01/1973  Male  99999    Doe      1

1111  John      01/01/1973  Male  99999    Doe      1

2222  Bill        03/01/2012  Male  22222    Smith    5

Ayo_sas
Calcite | Level 5

Thanks for the reply. The ideal desired result is shown below (it's what Art has suggested) but I would like to preserve the originality of the existing/original record. The ID is the newly created variable that is supposed to uniquely identify a person

Desired Result

tax_id firstname dateofbirth gender zipcode lastname ID

1111  John      01/01/1973  Male  99999    Doe      1

1111  John      01/01/1973  Male  99999    Doe      1

1111  John      01/01/1973  Male  99999    Doe      1

1111  John      01/01/1973  Male  99999    Doe      1

1111  John      01/01/1973  Male  99999    Doe      1

1111  John      01/01/1973  Male  99999    Doe      1

2222  Bill        03/01/2012  Male  22222    Smith    5

I would like to create an code that will, before assigning a new ID, check the TAX_ID, firstname,dateofbirth,gender, zipcode,lastname based on the following conditions

if TAX_ID (of incoming record)=Tax_ID (of existing record)

and (firstname (of incoming record)=firstname (of existing record)  OR partial match firstname(of incoming record)=firstname (of existing record))

and  dateofbirth (of incoming record)= dateofbirth (of existing record)

and gender (of incoming record)= gender (of existing record)

and lastname (of incoming record)=lastname (of existing record)  OR partial match lastname(of incoming record)=lastname (of existing record)) THEN assign the existing ID; ELSE assign a NEW ID

Note to Art: I'm open to a solution that assumes that the desired output is :

Desired Result

tax_id firstname dateofbirth gender zipcode lastname ID

1111  John      01/01/1973  Male  99999    Doe      1

1111  John      01/01/1973  Male  99999    Doe      1

1111  John      01/01/1973  Male  99999    Doe      1

1111  John      01/01/1973  Male  99999    Doe      1

1111  John      01/01/1973  Male  99999    Doe      1

1111  John      01/01/1973  Male  99999    Doe      1

2222  Bill        03/01/2012  Male  22222    Smith    5

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 2 replies
  • 1078 views
  • 0 likes
  • 2 in conversation