Repair the data : Mapping the latest data into old data.

Lohia · Posted 06-24-2016 05:34 AM

Hi Guys,

I have attached a test data file. I need to map the latest information onto the old data. I have millions of records, with different levels of duplication among them.

Is there code that can help me with this?

Thanks!

Parveen

RW9 · Posted 06-24-2016 05:39 AM

Yes, SAS code can help. However, you have provided no information on what software you are using, what systems, what the process of "updating" will be, or what to do with duplicates and missing values, and there is no test data (in the form of a data step) or required output. The only thing I can suggest, from looking at that preview, is that you may be able to update the main dataset with the changes dataset using the data step UPDATE statement, which you can find here:
http://support.sas.com/documentation/cdl/en/lrdict/64316/HTML/default/viewer.htm#a000202975.htm
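
A minimal sketch of what that UPDATE step might look like. The dataset names master, changes and master_updated are hypothetical, and NPI as the BY variable is an assumption based on the later posts in this thread:

```sas
/* Hypothetical names: master = old data, changes = latest data. */
/* UPDATE requires both datasets to be sorted by the BY variable. */
proc sort data=master;  by NPI; run;
proc sort data=changes; by NPI; run;

data master_updated;
  /* Non-missing values in changes replace the values in master; */
  /* by default, missing values in changes leave master untouched. */
  update master changes;
  by NPI;
run;
```

UPDATE expects at most one master observation per BY value. If the master has several rows for the same NPI, only the first row in each BY group is updated and SAS writes a warning to the log.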

Lohia · Posted 06-24-2016 06:08 AM

Hi RW9,

I use SAS 9.4 and also have SAS EG. The solution you suggested is a very basic one, and I am sorry I did not explain the problem properly.

I have insurance data which contains different levels of duplicates. For example:

A person has a unique NPI (National Provider Identifier), but that person has multiple records in our database.

When the data was recorded in the table, the person was entered with different naming variations, like:

NPI First_Name Last_Name Middle_Name State Address

123456789 Andrew Jonatha R CA Mark Hospital INC

123456789 Jonathan Andrew R CA Mark Hospital, INC

123456789 Andrew CA Mark Hospital CA

123456789 Andrew R Jonathan CA Mark INC

123456789 Andrew Jona R CA Mark Hospital

123456789 Andrew J R CA Mark Hospital INC

As you can see, there are 6 rows here for one unique NPI, and I need to repair these records using the right information.

That is why we are using the original data from the national NPI records. Using it, we need to map the right record, and then we will remove the duplicates using NPI + FN + LN + ST + Address.

If you still need more clarification, please let me know 🙂 .
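
The de-duplication step on NPI + FN + LN + ST + Address could be sketched as follows. The dataset names repaired, keyed and deduped and the *_key variables are hypothetical, and the normalization (upper-casing, removing commas and periods) is an assumption about what should count as "the same" value:

```sas
/* Build normalized comparison keys so that, for example,        */
/* "Mark Hospital INC" and "Mark Hospital, INC" compare equal.   */
data keyed;
  set repaired;                       /* hypothetical input dataset */
  length fn_key ln_key $50 addr_key $200;
  fn_key   = upcase(strip(First_Name));
  ln_key   = upcase(strip(Last_Name));
  addr_key = upcase(compbl(compress(Address, ',.')));
run;

/* NODUPKEY keeps one record per combination of the BY variables. */
proc sort data=keyed out=deduped nodupkey;
  by NPI fn_key ln_key State addr_key;
run;
```

Without the normalization step, NODUPKEY would treat records that differ only in case or punctuation as distinct.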

Thanks!

Parveen

RW9 · Posted 06-24-2016 08:29 AM

So what you're saying is that you have the data below, and from somewhere else you have the correct data:

123456789 Jonathan Andrew R CA Mark Hospital, INC

So why not just drop those variables from the data below, and merge the correct data item back on?
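
A rough sketch of that drop-and-merge idea, assuming the correct records sit in a dataset called npi_ref with one row per NPI (both that name and the output name repaired are hypothetical):

```sas
proc sort data=old_data; by NPI; run;
proc sort data=npi_ref;  by NPI; run;   /* assumed: one row per NPI */

data repaired;
  /* Drop the unreliable variables from the old data and take    */
  /* them from the reference data instead. Because the old data  */
  /* no longer carries these variables, the reference values are */
  /* retained for every old record within the same NPI.          */
  merge old_data(in=in_old
                 drop=First_Name Last_Name Middle_Name State Address)
        npi_ref (keep=NPI First_Name Last_Name Middle_Name State Address);
  by NPI;
  if in_old;                            /* keep only NPIs present in the old data */
run;
```

Any NPI that is missing from npi_ref ends up with missing values in the dropped variables, so this only works as-is when the reference data covers every NPI.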

Ksharp · Posted 06-25-2016 05:02 AM

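A simple MERGE statement + PROC SORT can do that:

data want;
 /* Both datasets must already be sorted by NPI. */
 /* For same-named variables, lastest_data overwrites the first record of each NPI. */
 merge old_data lastest_data;
 by NPI;
run;

/* Keep only the first record per NPI. */
proc sort data=want nodupkey;
 by NPI;
run;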

Lohia · Posted 06-25-2016 01:46 PM

Hi RW9 ,

Yes, you are right, but the latest data does not contain all the NPIs.

The latest data has only 2,000 NPIs, which are correct; I need to repair 12k unique NPIs in the old data.

Hi Ksharp,

Thanks for your input.

A simple merge and sort would not help with this problem.

I am using the latest data as a reference only for the 2k good records; after that I still have the remaining 12k unique NPIs that need to be repaired.
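
One way to handle the partial coverage, as a sketch only: overwrite the records whose NPI appears in the reference data and route everything else to a separate dataset for further work. The names npi_ref, repaired, needs_review and the ref_ prefix are hypothetical, and the reference data is assumed to have one row per NPI, sorted by NPI like old_data:

```sas
data repaired needs_review;
  merge old_data(in=in_old)
        npi_ref (in=in_ref
                 keep=NPI First_Name Last_Name Middle_Name State Address
                 rename=(First_Name=ref_First   Last_Name=ref_Last
                         Middle_Name=ref_Middle State=ref_State
                         Address=ref_Address));
  by NPI;
  if not in_old then delete;        /* ignore NPIs that exist only in the reference */
  if in_ref then do;
    /* NPI found in the reference: overwrite with the trusted values. */
    First_Name  = ref_First;
    Last_Name   = ref_Last;
    Middle_Name = ref_Middle;
    State       = ref_State;
    Address     = ref_Address;
    output repaired;
  end;
  else output needs_review;         /* no reference record: repair by other rules */
  drop ref_:;
run;
```

The variable lengths come from old_data because it is listed first, so reference values longer than the old variables would be truncated.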

The challenge is that a single person may work both as an individual and as part of a hospital group, and each has its own style of updating the data.

That causes a huge problem.

One thing I would like to ask: is there any algorithm in SAS EG, or any specific built-in process in EG, that can handle this problem?

Things I am trying:

Sounds-like matching: in the data we can have people with the same name, in the same city and ZIP, who may eventually work in the same hospital as well.

So for this I need to try something else, like generating a flag, so that I can figure out which one is the real one.
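
A possible sketch of such a flag, using the SOUNDEX function on the names and marking every record whose sounds-like name and State are shared by more than one NPI. The names keyed, flagged, fn_key, ln_key and same_name_flag are hypothetical, and State stands in for city and ZIP, which are not in the sample layout above:

```sas
data keyed;
  set old_data;
  length fn_key ln_key $40;
  fn_key = soundex(First_Name);     /* phonetic code for the first name */
  ln_key = soundex(Last_Name);      /* phonetic code for the last name  */
run;

proc sql;
  /* PROC SQL remerges the group-level count onto every row        */
  /* (a NOTE about remerging appears in the log).                  */
  /* same_name_flag = 1 when several NPIs share a sounds-like name */
  /* within the same State, so those rows need a closer look.      */
  create table flagged as
  select k.*,
         (count(distinct NPI) > 1) as same_name_flag
  from keyed as k
  group by fn_key, ln_key, State;
quit;
```

SOUNDEX is designed for English names and is fairly coarse, so the flag is better treated as a list of candidates for manual review than as a final decision.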

Thanks!

Ksharp · Posted 06-26-2016 01:01 AM

So what are your rules? How do we know which information is right and which is wrong?
