09-12-2013 08:46 AM
Note:Merge statement has more than one data set with repeats of BY values.
I have tried to putting all the common variables from both data sets in "BY statement "and merged. Even though I am getting the above note.
09-12-2013 10:17 AM
This is a problem that you have to think through before you can write a program. The situation you are looking at is (almost certainly) a many-to-many merge:
merge a b;
What should happen if the same ID appears twice in data set A, and three times in data set B? How many observations should the combined data set contain for that ID? How do you want to match the set of 2 to the set of 3?
Perhaps the answer is to adjust one of the data sets so that it never contains multiple observations for the same ID. But these are the questions to consider before a program can be written.
09-12-2013 10:18 AM
When you merge two datasets, at least one of the datasets should be unique by the BY variables.
MERGE is good for one-to-one merging of datasets, and one-to-many.
Many-to-Many merges are almost always a very bad idea. A many-to-many merge does NOT produce a cartesian product. That note is telling you that you are doing a many-to-many merge.
If you want a cartesian product from a many-to-many merge, doing a join in SQL is probably the easiest approach (there are DATA STEP approaches as well, but not merge).
09-12-2013 11:06 AM
I have two data sets one is VS and other is adprdtrt.sas7bdat ... two have 7 common variables i have kept all the seven in BY statement.
there are more observations same .. variables common are Subjid usubjid studyid extn arm seq seqc ..
09-12-2013 11:11 AM
The warning is telling you there's something wrong with your merge, you need to clean the data first and understand the merge.
If you can find the observations that are causing the warning, merge them and see what the results look like and try and figure out what you want.
You can isolate the obs by running the program first with half of one dataset, if no error, then move on the second half of the dataset. Then repeat, splitting down until you find the groups that are causing the issue.
09-12-2013 11:52 AM
You are merging
by SubjID USubjID Extn Arm Seq SeqC
In order to avoid that warning, one of your datasets should be unique by those variables. Do you expect that to be true?
There are lots of ways to check for duplicates. One way with a data step is:
data _null_ ;
set VS ;
by SubjID USubjID Extn Arm Seq SeqC ;
if not (first.SeqC and last.SeqC) then put "Duplicate found ! " (SubjID USubjID Extn Arm Seq SeqC) (=) ;
09-12-2013 12:12 PM
This note will come when you do Many to many merge. no sorting is required. use Proc SQL, it is easy and more convenient. Proc sql; create table final as select a.*, b.x, b.xx, b.xxx, b.xxx From a Full join b on a.id = b.id; quit;