
Repair the data : Mapping the latest data into old data.

Occasional Contributor
Posts: 13


Hi Guys,

 

I have attached a test data file. I need to map the latest information onto old data. I have millions of records, and they contain different levels of duplicates.

Is there code that can help me with this?

 

Thanks!

Parveen

Super User
Posts: 7,430

Re: Repair the data : Mapping the latest data into old data.

Yes, SAS code can help.  However, you have provided no information on what software you are using, what systems, what the process of "updating" will be, or what to do with duplicates and missing values, and you have given no test data (in the form of a data step) or required output.  The only thing I can suggest, from looking at that preview, is that you may be able to update the main dataset with the changes dataset using the UPDATE statement in a data step, which is documented here:
http://support.sas.com/documentation/cdl/en/lrdict/64316/HTML/default/viewer.htm#a000202975.htm

Occasional Contributor
Posts: 13

Re: Repair the data : Mapping the latest data into old data.

Hi RW9,

 

I use SAS 9.4 and also have SAS EG. The solution you suggested is a very basic one; sorry, I did not explain the problem clearly.

I have insurance data that contains different levels of duplicates. For example, a person has a unique NPI (National Provider Identifier), but this person has multiple records in our database.

When the data was recorded in the table, the person was entered with different name variations, like this:

NPI        First_Name  Last_Name  Middle_Name  State  Address
123456789  Andrew      Jonatha    R            CA     Mark Hospital INC
123456789  Jonathan    Andrew     R            CA     Mark Hospital, INC
123456789  Andrew                              CA     Mark Hospital CA
123456789  Andrew R    Jonathan                CA     Mark INC
123456789  Andrew      Jona       R            CA     Mark Hospital
123456789  Andrew  J              R            CA     Mark Hospital INC

 

As you can see, there are 6 rows here for what should be one unique record, and I need to repair this record using the right information.

That is why we are using the original data present in the national NPI data. Using this, we need to map each NPI to the right record, and then we will remove the duplicates using NPI+FN+LN+ST+Address.

If you still need more clarification, please let me know :-) .

 

Thanks!

Parveen

 

Super User
Posts: 7,430

Re: Repair the data : Mapping the latest data into old data.

So what you're saying is that you have the data below, and from somewhere else you have the correct data:

123456789   Jonathan         Andrew          R                    CA     Mark Hospital, INC

 

So why not just drop those variables from the data below, and merge the correct data item back on?
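
The drop-and-merge idea above could look roughly like the following. This is only a sketch under assumptions: the dataset names old_data and npi_reference are hypothetical (the thread never names them), and npi_reference is assumed to hold exactly one correct row per NPI with the same variable names as the sample table.

```sas
/* Hypothetical inputs:                                        */
/*   old_data      - the duplicated rows shown in the thread   */
/*   npi_reference - one correct row per NPI (assumed)         */

/* Reduce the old data to one key row per NPI, dropping the    */
/* unreliable name and address variables.                      */
proc sort data=old_data(keep=NPI) out=old_keys nodupkey;
  by NPI;
run;

proc sort data=npi_reference out=ref_sorted;
  by NPI;
run;

/* Attach the correct values back onto each NPI. NPIs with no  */
/* reference row go to a separate dataset for later handling.  */
data repaired no_reference(keep=NPI);
  merge old_keys(in=in_old) ref_sorted(in=in_ref);
  by NPI;
  if not in_old then delete;
  if in_ref then output repaired;
  else output no_reference;
run;
```

Splitting the output this way keeps NPIs that are missing from the reference visible instead of silently leaving their fields blank.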

Super User
Posts: 9,691

Re: Repair the data : Mapping the latest data into old data.

Simple MERGE statement + PROC SORT can do that;

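/* Both inputs must already be sorted by NPI for the BY-merge */
data want;
  merge old_data lastest_data;
  by NPI;
run;

/* Keep the first row per NPI */
proc sort data=want nodupkey;
  by NPI;
run;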

Occasional Contributor
Posts: 13

Re: Repair the data : Mapping the latest data into old data.

Hi RW9 ,

 

Yes, you are right, but the latest data does not contain all the NPIs.

The latest data has only 2,000 NPIs, which are correct; I need to repair the 12k unique NPIs in the old data.

 

Hi Ksharp,

Thanks for your input.

A simple merge and sort would not help with this problem.

I am using the latest data as a reference only for the 2k good records; after that, I still have 12k unique NPIs that need to be repaired.

The challenge is that a single person may work both as an individual and as part of a hospital group, and each has their own style of updating the data.

That causes a huge problem.

One thing I would like to ask: is there any algorithm in SAS EG, or any specific process built into EG, that I can use to handle this problem?

 

What I am trying is:

Sounds-like matching: in the data we can have people with the same name, in the same city and ZIP, who also happen to work in the same hospital.

So for this I need to try something else, such as generating a flag for these cases, so that I can figure out which one is the real one.

 

Thanks!
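
One possible way to generate such a flag is sketched below with the SOUNDEX function in a data step plus PROC SQL. It is not a built-in EG process. The dataset name old_data, the output names, and the idea of flagging a sounds-like name that appears under more than one NPI in the same state are all assumptions made for illustration; the variable names come from the sample table above.

```sas
/* Hypothetical input: old_data with NPI, First_Name,          */
/* Last_Name and State, as in the sample table.                */
data keyed;
  set old_data;
  length sx_first sx_last $20 name_key $41;
  sx_first = soundex(First_Name);
  sx_last  = soundex(Last_Name);
  /* Order the two codes so that swapped first/last names      */
  /* (Andrew Jonathan vs Jonathan Andrew) share one key.       */
  if sx_first <= sx_last then name_key = catx('|', sx_first, sx_last);
  else name_key = catx('|', sx_last, sx_first);
run;

/* Flag every row whose sounds-like name occurs under more     */
/* than one NPI in the same state; these need manual review.   */
proc sql;
  create table flagged as
  select *,
         count(distinct NPI) as npi_count,
         (calculated npi_count > 1) as review_flag
  from keyed
  group by name_key, State;
quit;
```

SOUNDEX is tuned for English names and ignores vowels, so it will both over-match and under-match. Functions such as COMPGED or SPEDIS could be tried as alternative similarity measures, and City or ZIP could be added to the grouping if those variables are available.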

 

Super User
Posts: 9,691

Re: Repair the data : Mapping the latest data into old data.

So what are your rules? How do we know which information is right and which is wrong?
