Lohia
Calcite | Level 5

Hi guys,

I have attached a test data file. I need to map the latest information onto the old data. I have millions of records, and they contain different levels of duplicates.

Is there code that can help me with this?

Thanks!

Parveen

6 REPLIES
RW9
Diamond | Level 26

Yes, SAS code can help. However, you have provided no information on what software you are using, on what system, what the process of "updating" should be, or what to do with duplicates and missing values, and there is no test data (in the form of a datastep) or required output. The only thing I can suggest, from looking at that preview, is that you may be able to update the main dataset with the changes dataset using the UPDATE statement in a data step, which is documented here:
http://support.sas.com/documentation/cdl/en/lrdict/64316/HTML/default/viewer.htm#a000202975.htm
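
As a rough illustration of the UPDATE statement, here is a minimal sketch. The dataset names (master, changes) and the sample values are made up for the example and are not taken from the attached file.

```sas
/* Hypothetical master data: one row per NPI */
data master;
  input NPI :$10. First_Name :$20. Last_Name :$20. State :$2.;
  datalines;
123456789 Andrew Jonatha CA
987654321 Mary Smith NY
;
run;

/* Hypothetical transaction data holding the corrected values */
data changes;
  input NPI :$10. First_Name :$20. Last_Name :$20. State :$2.;
  datalines;
123456789 Jonathan Andrew CA
;
run;

/* UPDATE needs both datasets sorted by the BY variable */
proc sort data=master;  by NPI; run;
proc sort data=changes; by NPI; run;

/* Non-missing values in changes replace the values in master */
data updated;
  update master changes;
  by NPI;
run;
```

Note that UPDATE expects at most one row per BY value in the master dataset, and by default a missing value in the transaction dataset does not overwrite the master value. If the master has several rows per NPI, it would need to be reduced to one row per NPI first, or a different approach used.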

Lohia
Calcite | Level 5

Hi RW9,

I use SAS 9.4 and also have SAS EG. The solution you suggested is a very basic one; sorry that I did not explain the problem in enough detail.

I have insurance data that contains different levels of duplicates. For example, a person has a unique NPI (National Provider Identifier), but that person has multiple records in our database.

When the data was recorded in the table, the person was entered with different variations of the name, like this:

NPI        First_Name  Last_Name  Middle_Name  State  Address
123456789  Andrew      Jonatha    R            CA     Mark Hospital INC
123456789  Jonathan    Andrew     R            CA     Mark Hospital, INC
123456789  Andrew                              CA     Mark Hospital CA
123456789  Andrew R    Jonathan                CA     Mark INC
123456789  Andrew      Jona       R            CA     Mark Hospital
123456789  Andrew J               R            CA     Mark Hospital INC

 

As you can see, there are 6 rows here but only one unique person, and I need to repair this record using the right information.

That's why we are using the original national NPI data. With it we need to map each NPI to the right record, and then we will remove the duplicates using NPI+FN+LN+ST+Address.
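
The de-duplication step on NPI+FN+LN+ST+Address could look roughly like the sketch below. It assumes a dataset called repaired (a hypothetical name) with the columns shown in the sample rows. The Address_key variable is also an assumption, added so that spellings such as "Mark Hospital INC" and "Mark Hospital, INC" compare as equal.

```sas
/* Build a normalized address key: strip punctuation, upper-case,
   and collapse repeated blanks */
data standardized;
  set repaired;
  Address_key = compbl(upcase(compress(Address, , 'p')));
run;

/* Keep the first row for each key combination;
   the rows that get dropped are written to dropped_rows for review */
proc sort data=standardized out=deduped nodupkey dupout=dropped_rows;
  by NPI First_Name Last_Name State Address_key;
run;
```

NODUPKEY only removes exact matches on the BY variables, so the name columns would need similar normalization if they still differ in case or punctuation after the repair.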

 

If you still need more clarification, please let me know 🙂.

 

Thanks!

Parveen

 

RW9
Diamond | Level 26

So what you're saying is that you have the data below, and from somewhere else you have the correct data:

123456789   Jonathan         Andrew          R                    CA     Mark Hospital, INC

So why not just drop those variables from the data below, and merge the correct data item back on?
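
A minimal sketch of that drop-and-merge idea follows. It assumes the messy rows are in old_data and the correct rows are in latest_data with one row per NPI (both names are assumptions), and that the variables are named as in the sample above. The has_reference flag is a hypothetical helper.

```sas
/* MERGE with BY needs both datasets sorted by the key */
proc sort data=old_data;    by NPI; run;
proc sort data=latest_data; by NPI; run;

data want;
  /* Drop the unreliable columns from the old data and
     take them from the correct data instead */
  merge old_data    (in=in_old drop=First_Name Last_Name Middle_Name State Address)
        latest_data (in=in_ref);
  by NPI;
  if in_old;               /* keep only NPIs present in the old data */
  has_reference = in_ref;  /* 1 = correct values were found, 0 = not found */
run;
```

Rows with has_reference = 0 end up with missing name and address values, so this only works as-is when every NPI has a correct record available.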

Ksharp
Super User
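A simple MERGE statement + PROC SORT can do that:

/* MERGE with BY requires both datasets to be sorted by the key */
proc sort data=old_data; by NPI; run;
proc sort data=latest_data; by NPI; run;

data want;
 merge old_data latest_data; /* values from latest_data overwrite old ones */
 by NPI;
run;

/* Keep one row per NPI */
proc sort data=want nodupkey; by NPI; run;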

Lohia
Calcite | Level 5

Hi RW9,

Yes, you are right, but the latest data does not contain all of the NPIs.

The latest data has only 2,000 NPIs, which are correct; I need to repair the 12k unique NPIs in the old data.

Hi Ksharp,

Thanks for your input.

A simple merge and sort would not help with this problem.

I am using the latest data as a reference only for the 2k good records; after that, I still have the remaining 12k unique NPIs that need to be repaired.
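
One way to apply the reference only where it exists, and to keep the old values for the remaining NPIs, is a left join with a fallback. This is only a sketch: old_data and latest_data are assumed dataset names, latest_data is assumed to hold one row per NPI, and has_reference is a hypothetical flag.

```sas
proc sql;
  create table repaired as
  select o.NPI,
         /* Take the reference value when there is one, else keep the old value */
         coalescec(n.First_Name,  o.First_Name)  as First_Name,
         coalescec(n.Last_Name,   o.Last_Name)   as Last_Name,
         coalescec(n.Middle_Name, o.Middle_Name) as Middle_Name,
         coalescec(n.State,       o.State)       as State,
         coalescec(n.Address,     o.Address)     as Address,
         /* 1 = NPI found in the reference data, 0 = still needs repair */
         (n.NPI is not null) as has_reference
  from old_data as o
       left join latest_data as n
       on o.NPI = n.NPI;
quit;
```

Rows with has_reference = 0 are the ones that still need another repair rule. One caveat: a value that is blank in the reference (for example, no middle name) falls back to the old value under this logic.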

 

The challenge is that a single person may work as an individual and also in a hospital as part of a group, and each has its own style of updating the data.

That causes a huge problem.

One thing I would like to ask: is there any algorithm in SAS EG, or any specific built-in process in EG, that I can use to handle this problem?

What I am trying is:

Sounds-like matching: in the data we can have people with the same name, the same city and ZIP, who eventually also work in the same hospital.

So for this I need to try something else, such as generating a flag, so that I can figure out the real one.
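
A flag of this kind could be sketched with the SOUNDEX function, which encodes names by how they sound. The example below lists pairs of different NPIs in the same state whose first and last names sound alike. The dataset name old_data and the output name name_clash are assumptions, and State stands in for city/ZIP because those columns are not shown in the sample.

```sas
proc sql;
  create table name_clash as
  select distinct
         a.NPI        as NPI_a,
         b.NPI        as NPI_b,
         a.First_Name as First_a,
         b.First_Name as First_b,
         a.Last_Name  as Last_a,
         b.Last_Name  as Last_b,
         a.State
  from old_data as a, old_data as b
  where a.NPI < b.NPI                                   /* different providers, each pair once */
    and a.State = b.State
    and soundex(a.Last_Name)  = soundex(b.Last_Name)    /* last names sound alike */
    and soundex(a.First_Name) = soundex(b.First_Name);  /* first names sound alike */
quit;
```

Any NPI that appears in name_clash could then be flagged for manual review. A self-join like this can be slow on millions of rows, and blank names all receive the same code, so it may be worth filtering those out and restricting the join further (for example by ZIP) before running it.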

 

Thanks!

 

Ksharp
Super User
So what are your rules? How do we know which information is right and which is wrong?
