01-27-2015 05:07 PM
I have a data set that has a large about of information between two systems. The observations in the data include a Name, an ID1, ID2, Cat1, Cat2, Cat3, Date in System1, Date in System2.
I expect all entries to have dates in both systems. Right now, each "Name" is listed twice:
What I would like to see is:
Where the duplicate entry is merged based on the observations being equal in Name, ID1, ID2, Cat1, Cat2, Cat3.
Help? Thank you!
01-27-2015 07:30 PM
Code please. Merge using data step or proc sql? Are the combinations of your key variables repeated in either or both data sets?
Were there any warning messages?
And probably best to go back to the step that combined them then try to fix other stuff later.
01-27-2015 07:46 PM
I merged by a data merge by statement.
I'll have to double check, but there were no errors. I confirmed that I sorted both data sets by the same variables that I merged them by. I think the issue lies around one of the cat variables - I had to remove a character string from one of the data sets so that the names matched across data sets. When I merged by that cat variable, I get the issue that I'm having now (see original post).
I'm not familiar with update. What exactly does it do?
Thank you for your help!
01-27-2015 08:01 PM
The 'trick' is having the same data set as the master and transaction dataset.
From the documentation:
In a DATA step, the UPDATE statement reads observations from the transaction data set and updates corresponding observations (observations with the same value of all BY variables) from the master data set. All nonmissing values for variables in the transaction data set replace the corresponding values that are read from the master data set. SAS writes the modified observations to the data set that you name in the DATA statement without modifying either the master or the transaction data set.
01-27-2015 07:33 PM
You can use the update trick instead:
update have(obs=0) have;
by name id1 id2 cat1 cat2 cat3;
You may want to figure out why your merge didn't work though, because it is one way to do it.
01-27-2015 10:43 PM
I figured it out! I had to trim some values in the one data set - here it was leaving extra spaces. I used TRIM and now when I merge, it merges great!
Thanks for your help.