I have two tables with variables listed below. In the first table origacct number is meant to be original account number and should remain the same. acctnbr can change for several reasons but origacct should not change. During my initial research I have found this is not true. I am using the second table to search history, later I will be using a third but I don't currently have access.
I'll start with what I'm thinking and hopefully someone has a better solution. I will be doing a self join, cons_test = cons_test, joining acctnbr to origacct. Next I will join the three variables in the second table. I need to keep acctnbr, but join to each other variable, I will then sort by opendate to find the oldest account number associated with the current acctnbr. Currently the database doesn't have a good way to view customer history. Is an array the best way to accomplish this, I'm open to any suggestions.
data cons_test;
set consumer_agg(obs=10000);
keep acctnbr origacct opendate;
run;
data fdr_start;
set fdr_agg(obs=10000);
keep cracct xref1 xref2 xrefacct opendate;
run;
Create a master table of accounts to what account number they should be. This is most likely a recursive search, but it's been solved on here before several times. Ideally you don't use a Cartesian product, I think the best solutions were hash tables.
One might hope there are additional pieces of information to identify a person or oganization to help decide if the values really are for the same person or typos. Also I might look at the earliest dates associated with any of the information combinations as well for determining a the "true" origacct value.
Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.