02-24-2012 02:05 AM
I have dataset A with some repeated observations in the ACCOUNTNO variable. Dataset B also has duplicate observations. Both datasets have the common variable ACCOUNTNO. How can we merge both datasets, removing the duplicates, using MERGE?
02-26-2012 02:24 PM
merge a b;
by ACCOUNTNO;
I would assume in an interview situation you should also ask what's meant by duplicates - duplicate keys or duplicate rows (=all variables having the same values).
Above code snippet is for duplicate rows.
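To make the distinction between duplicate keys and duplicate rows concrete, here is a minimal sketch of both, followed by the merge. The output dataset names (a_rows, a_keys, b_keys, want) are made up for illustration, and which kind of de-duplication is wanted is an assumption you would have to confirm.

```sas
/* Duplicate rows: drop observations where every variable matches. */
/* Sorting BY _ALL_ with NODUPKEY treats the whole row as the key. */
proc sort data=a out=a_rows nodupkey;
  by _all_;
run;

/* Duplicate keys: keep only the first observation per ACCOUNTNO. */
proc sort data=a out=a_keys nodupkey;
  by accountno;
run;

proc sort data=b out=b_keys nodupkey;
  by accountno;
run;

/* With one observation per key on each side, this is a one-to-one match merge. */
data want;
  merge a_keys b_keys;
  by accountno;
run;
```

Note that removing only duplicate rows can still leave several observations per ACCOUNTNO, and a_rows would need to be sorted by ACCOUNTNO again before it could be used in a MERGE with a BY statement.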
02-27-2012 08:16 AM
If you are asked that question in an interview (or real life) you need to respond with some questions so you can figure out what they want.
Do they want to match every observation in A with every observation in B that has the same account number?
What about account numbers that only occur in A or B?
Do they want to pair them up in order within the account numbers?
What happens if there are not the same number of observations in both A and B for a particular account number?
Once you have the answers to those questions you can begin to build a strategy for "merging" them.
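As one example of how an answer changes the code, the question about account numbers that occur in only one dataset can be handled with the IN= dataset option. This is only a sketch: it assumes A and B are already sorted by ACCOUNTNO, and the output dataset names and the in_a/in_b flags are made up.

```sas
data both only_a only_b;
  /* in_a / in_b are temporary flags: 1 if the dataset contributed to this BY group */
  merge a(in=in_a) b(in=in_b);
  by accountno;
  if in_a and in_b then output both;   /* account number found in A and B */
  else if in_a then output only_a;     /* account number found only in A */
  else output only_b;                  /* account number found only in B */
run;
```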
But in general SQL will be a much more useful language for coding that than a data step using the MERGE statement.
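For comparison, a rough PROC SQL version might look like the following. It assumes "duplicates" means duplicate rows and that unmatched account numbers from either side should be kept (a full join). The variables a_value and b_value are hypothetical stand-ins for whatever other columns A and B actually hold.

```sas
proc sql;
  create table want_sql as
  select coalesce(a.accountno, b.accountno) as accountno,
         a.a_value,   /* hypothetical variable from A */
         b.b_value    /* hypothetical variable from B */
  from (select distinct * from a) as a
       full join
       (select distinct * from b) as b
    on a.accountno = b.accountno;
quit;
```

Unlike a data step MERGE, the join pairs every matching row in A with every matching row in B, so no prior sorting is needed, and changing full join to inner join or left join covers the other answers to the questions above.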