Hi Guys,
I have attached a test data file. I need to map the latest information onto old data. I have millions of records, with different levels of duplication in them.
Is there code that can help me with this?
Thanks!
Parveen
Yes, SAS code can help. However, you have provided no information on what software you are using, what systems, what the process of "updating" will be, or what to do with duplicates and missing values, and there is no test data (in the form of a data step) or required output. The only thing I can suggest, looking at that preview, is that you may be able to update the main dataset with the changes dataset using the UPDATE statement in a DATA step, which is documented here:
http://support.sas.com/documentation/cdl/en/lrdict/64316/HTML/default/viewer.htm#a000202975.htm
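A minimal sketch of the UPDATE statement mentioned above, assuming hypothetical data set names (`master`, `changes`) and made-up rows; the second NPI is invented for illustration. UPDATE expects one row per BY value in the master data set, so data with duplicate NPIs would need to be de-duplicated first. By default, missing values in the transaction data set do not overwrite existing values.

```sas
/* Hypothetical master table: one row per NPI, possibly outdated */
data master;
  infile datalines dsd truncover;
  input NPI :$10. First_Name :$20. Last_Name :$20. State :$2.;
  datalines;
123456789,Andrew,Jonatha,CA
987654321,Maria,Lopez,TX
;
run;

/* Hypothetical transaction table: corrected values for some NPIs.
   State is left missing here, so the master value is kept. */
data changes;
  infile datalines dsd truncover;
  input NPI :$10. First_Name :$20. Last_Name :$20. State :$2.;
  datalines;
123456789,Jonathan,Andrew,
;
run;

/* UPDATE requires both inputs sorted by the BY variable */
proc sort data=master;  by NPI; run;
proc sort data=changes; by NPI; run;

/* Apply the transactions to the master, matching on NPI */
data updated;
  update master changes;
  by NPI;
run;
```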
Hi RW9,
I use SAS 9.4 and also have SAS EG. The solution you suggested is a very basic one, and sorry that I did not explain the problem clearly.
I have insurance data which contains different levels of duplicates. For example,
a person has a unique NPI (National Provider Identifier), but this person has multiple records in our database.
When the data was recorded in the table, the person was entered with different variations of the name, like
NPI First_Name Last_Name Middle_Name State Address
123456789 Andrew Jonatha R CA Mark Hospital INC
123456789 Jonathan Andrew R CA Mark Hospital, INC
123456789 Andrew CA Mark Hospital CA
123456789 Andrew R Jonathan CA Mark INC
123456789 Andrew Jona R CA Mark Hospital
123456789 Andrew J R CA Mark Hospital INC
As you can see, there are 6 rows, only one of which should be the unique record, and I need to repair it using the right information.
That's why we are using the original data available from the national NPI source. Using this, we need to map the right record, and then we will remove the duplicates using NPI+FN+LN+ST+Address.
If you still need more clarification, please let me know 🙂.
Thanks!
Parveen
So what you're saying is you have the data below, and from somewhere else you have the correct data:
123456789 Jonathan Andrew R CA Mark Hospital, INC
So why not just drop those variables from the data below, and merge the correct data item back on?
A simple MERGE statement plus PROC SORT can do that:
data want;
  merge old_data lastest_data;
  by NPI;
run;
proc sort nodupkey;
  by NPI;
run;
Hi RW9,
Yes, you are right, but the latest data does not contain all the NPIs.
The latest data has only 2000 NPIs, which are correct; I need to repair the 12k unique NPIs in the old data.
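One way to deal with a reference table that covers only some NPIs can be sketched with IN= flags on a MERGE: rows whose NPI appears in the reference are overwritten with the trusted names, and the rest are routed to a separate table for later repair. The data set names (`old_data`, `npi_reference`, `repaired`, `needs_review`), the `ref_` prefix, and the third sample row are assumptions made for illustration, and the sketch assumes the reference has one row per NPI.

```sas
/* Hypothetical stand-in for the old data, with duplicate NPIs */
data old_data;
  infile datalines dsd truncover;
  input NPI :$10. First_Name :$20. Last_Name :$20.;
  datalines;
123456789,Andrew,Jonatha
123456789,Andrew,J
555555555,Pat,Smyth
;
run;

/* Hypothetical stand-in for the smaller, trusted reference */
data npi_reference;
  infile datalines dsd truncover;
  input NPI :$10. First_Name :$20. Last_Name :$20.;
  datalines;
123456789,Jonathan,Andrew
;
run;

proc sort data=old_data;      by NPI; run;
proc sort data=npi_reference; by NPI; run;

data repaired needs_review;
  /* Rename the reference columns so they do not collide with the old ones */
  merge old_data      (in=in_old)
        npi_reference (in=in_ref
                       rename=(First_Name=ref_First_Name
                               Last_Name=ref_Last_Name));
  by NPI;
  if not in_old then delete;       /* skip reference rows with no old record */
  if in_ref then do;
    First_Name = ref_First_Name;   /* overwrite with the trusted values */
    Last_Name  = ref_Last_Name;
    output repaired;
  end;
  else output needs_review;        /* NPI not covered by the reference */
  drop ref_:;
run;
```

After this step, `repaired` can be de-duplicated with PROC SORT NODUPKEY on the chosen key, while `needs_review` still needs explicit rules for deciding which record is right.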
Hi Ksharp,
Thanks for your input.
A simple merge and sort would not help with this problem.
I am using the latest data as a reference only for the 2k good records; after this I still have the remaining 12k unique NPIs which need to be repaired.
The challenge is that a single person may work both as an individual and in a hospital as part of a group, and each has their own style of updating the data.
That causes a huge problem.
One thing I would like to ask: is there any algorithm in SAS EG, or any specific built-in process in EG, that I can use to handle this problem?
Things I am trying:
Sounds-like matching: in the data we can have people with the same name, the same city and ZIP, who also eventually work in the same hospital.
So for this I need to try something else, such as generating a flag, so that I can figure out the real one.
Thanks!
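A rough sketch of how such a flag might be generated, using the SOUNDEX and COMPGED functions to compare name variants within the same NPI. The table and column names (`names`, `name_pairs`, `same_sound`, `edit_cost`) are hypothetical, and any cutoff applied to `edit_cost` would have to be chosen by inspecting real data; this only produces candidate pairs for review, not a final decision.

```sas
/* Hypothetical name variants recorded under one NPI */
data names;
  infile datalines dsd truncover;
  input NPI :$10. Last_Name :$20.;
  datalines;
123456789,Jonathan
123456789,Jonatha
123456789,Jona
123456789,J
;
run;

proc sql;
  /* Pair up distinct name variants that share an NPI */
  create table name_pairs as
  select a.NPI,
         a.Last_Name as name_a,
         b.Last_Name as name_b,
         /* 1 when both names have the same phonetic code, else 0 */
         (soundex(strip(a.Last_Name)) = soundex(strip(b.Last_Name))) as same_sound,
         /* generalized edit distance: lower cost means more similar */
         compged(strip(a.Last_Name), strip(b.Last_Name)) as edit_cost
  from names as a, names as b
  where a.NPI = b.NPI
    and a.Last_Name < b.Last_Name;   /* keep each pair once */
quit;
```

The same comparison could be run between the old data and the trusted reference instead of within the old data, which would flag records whose names are close to a known good value.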
So what are your rules? How do we know what is the right information and what is the wrong information?