Help using Base SAS procedures

Merge, not delete, duplicates.

Reply
Contributor
Posts: 24

Merge, not delete, duplicates.

Hello,

I have a data set that has a large about of information between two systems. The observations in the data include a Name, an ID1, ID2, Cat1, Cat2, Cat3, Date in System1, Date in System2.

I expect all entries to have dates in both systems. Right now, each "Name" is listed twice:

NameID1ID2Cat1Cat2Cat3DateSys1DateSys2
Adam10012002ABC11Jan2015.
Adam10012002ABC.12Jan2015

What I would like to see is:

NameID1ID2Cat1Cat2Cat3DateSys1DateSys2
Adam10012002ABC11Jan201512Jan2015

Where the duplicate entry is merged based on the observations being equal in Name, ID1, ID2, Cat1, Cat2, Cat3.

Help? Thank you!

Super User
Posts: 10,538

Re: Merge, not delete, duplicates.

How did you merge the data?

Contributor
Posts: 24

Re: Merge, not delete, duplicates.

I merged the data sets by Name, ID1, ID2, Cat1, Cat2, Cat3 but it didn't exactly work. So how can I do it after the datasets merged?

Super User
Posts: 10,538

Re: Merge, not delete, duplicates.

Code please. Merge using data step or proc sql? Are the combinations of your key variables repeated in either or both data sets?

Were there any warning messages?

And probably best to go back to the step that combined them then try to fix other stuff later.

Contributor
Posts: 24

Re: Merge, not delete, duplicates.

I merged by a data merge by statement.

I'll have to double check, but there were no errors. I confirmed that I sorted both data sets by the same variables that I merged them by. I think the issue lies around one of the cat variables - I had to remove a character string from one of the data sets so that the names matched across data sets. When I merged by that cat variable, I get the issue that I'm having now (see original post).

I'm not familiar with update. What exactly does it do?

Thank you for your help!

Super User
Posts: 17,912

Re: Merge, not delete, duplicates.

The 'trick' is having the same data set as the master and transaction dataset.

From the documentation:


In a DATA step, the UPDATE statement reads observations from the transaction data set and updates corresponding observations (observations with the same value of all BY variables) from the master data set. All nonmissing values for variables in the transaction data set replace the corresponding values that are read from the master data set. SAS writes the modified observations to the data set that you name in the DATA statement without modifying either the master or the transaction data set.

Step-by-Step Programming with Base SAS(R) Software

Super User
Posts: 17,912

Re: Merge, not delete, duplicates.

You can use the update trick instead:

data want;

update have(obs=0) have;

by name id1 id2 cat1 cat2 cat3;

run;

You may want to figure out why your merge didn't work though, because it is one way to do it.

Contributor
Posts: 24

Re: Merge, not delete, duplicates.

I figured it out! I had to trim some values in the one data set - here it was leaving extra spaces. I used TRIM and now when I merge, it merges great!

Thanks for your help.

Ask a Question
Discussion stats
  • 7 replies
  • 281 views
  • 0 likes
  • 3 in conversation