Asli_A
Fluorite | Level 6

Hello all, 

I am doing data cleaning on customer data. My goal is to select the 'survivor' record for each customer based on data source, reliability and recency. For example, I need to assign one survivor Identitynum to each Newid, given that Data Source '0' is more reliable than '1', Suspicious '0' is more reliable than '1', and the largest ID is the most reliable because it is the most recent. I need to write code that transforms Table1 into Table2.

ID Newid Identitynum Data Source Suspicious
1 1 13849 0 0
1392457 1 13844 1 0
1250118 9 14572 1 1
9 9 17532 1 0
37927 9 17532 1 0
1706203 10 29037 1 1
1733 10 37452 1 1
34609 10 44445 1 1

Table1

 

ID Newid Identitynum Data Source Suspicious Survivor
1 1 13849 0 0 1
37927 9 17532 1 0 1
1706203 10 29037 1 1 1

Table2

 

Thank you!

 

PS: The main problem is assigning the survivor based on 'Suspicious' and recency when max(ID) is not the latest non-suspicious (Suspicious=0) ID in the same Newid group.

1 ACCEPTED SOLUTION

ballardw
Super User

If I understand correctly, the question is basically: get the records into the desired order and select the first one in each group.

This might give you a start:

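/* Within each newid, put the most reliable record first:
   datasource 0 before 1, suspicious 0 before 1, then largest id. */
proc sort data=have;
  by newid datasource suspicious descending id;
run;

/* Keep only the first (most reliable) record of each newid group. */
data want;
   set have;
   by newid;
   if first.newid;
run;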
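
For anyone who wants to try this on the posted data, here is a minimal end-to-end sketch. It makes a few assumptions that are not in the original post: the "Data Source" column is read into a variable named DataSource, the sort writes to a separate dataset called sorted so that have is left untouched, and a Survivor flag is added to mirror the extra column in Table2.

/* Sample rows from Table1. DataSource is an assumed variable name. */
data have;
   input ID Newid Identitynum DataSource Suspicious;
   datalines;
1 1 13849 0 0
1392457 1 13844 1 0
1250118 9 14572 1 1
9 9 17532 1 0
37927 9 17532 1 0
1706203 10 29037 1 1
1733 10 37452 1 1
34609 10 44445 1 1
;
run;

/* Most reliable record first within each Newid; out= keeps have unsorted. */
proc sort data=have out=sorted;
   by Newid DataSource Suspicious descending ID;
run;

/* Keep the first record per Newid and flag it as the survivor. */
data want;
   set sorted;
   by Newid;
   if first.Newid;
   Survivor = 1;
run;

proc print data=want noobs;
run;

With the eight sample rows, the surviving IDs should be 1, 37927 and 1706203, which matches Table2. If the priority of the criteria differs (for example, Suspicious should outrank Data Source), only the order of the variables in the BY statement of the sort needs to change.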

Asli_A
Fluorite | Level 6
Yes! This was what I needed. Thank you 🙂
