BookmarkSubscribeRSS Feed
PrinceAde
Obsidian | Level 7

please I need help merging multiple datsets(5 datasets); 

data adsl;
merge dm1(in=a) ds1x(in=b) ex3(in=c) vs1x(in=d) sv2(in=e);
by usubjid;
if in (a, b, c, d, e);
run;

When I merge these datasets it generate this message in the log; MERGE statement has more than one data set with repeats of BY values.

when I merged two at a time, I didn't get the message except with sv2;

data adslx;
merge dm1(in=a) ds1x(in=b);
by usubjid;
if a and b;
run;

data adsly;
merge vs1x(in=a) ex3(in=b);
by usubjid;
if a and b;
run;

The message( MERGE statement has more than one data set with repeats of BY values) occurred in the log again when decide to merge adslx and adsly

data merge_adsl;
merge adslx(in=a) adsly(in=b);
by usubjid;
if a and b;
run;

 

Please, how can I resolve this issue?

8 REPLIES 8
A_Kh
Lapis Lazuli | Level 10

Hi it would be great to provide a portion of data, even dummy, for testing.
There was a similar post https://communities.sas.com/t5/SAS-Studio/MERGE-statement-has-more-than-one-data-set-with-repeats-of....

I would also try the syntax :

if A and B and C and D and E;

in lieu of : if in (a, b, c, d, e);

PrinceAde
Obsidian | Level 7

The link was very helpful, I created unique usubjid before merging; My final target is one record per subject.

 

Thank you;

Astounding
PROC Star

The best solution would likely be to fix the data sets before you merge them.  More specifically, why should the same USUBJID have two observations in the same data set?  In fact, it has two (or more) observations in more than one incoming data set.  That's what the message tells you. 

 

If you can't make USUBJID unique, then you have to choose among more complex scenarios  For example, what result do you want if the same USUBJID appears three times in one data set, and twice in another?  That's a choice that depends on your knowledge of what is in the data (or what should be in the data).  Start with making USUBJID unique if that's at all possible.

PrinceAde
Obsidian | Level 7

My final target is one record per subject. I made the usubjid  unique before merging. It works.

I appreciate. 

Astounding
PROC Star

I hear you.  But SAS disagrees.  The message it is giving means that at least two of your incoming data sets contain multiple observations for a USUBJID.

 

Here's a program you can run to find the source of the problem in the DM1 data set:

proc freq data=dm1 noprint;
   tables usubjid / out=dm1_multiples (where=(count > 1) drop=percent);
run;
proc print data=dm1_multiples;
run;

Unfortunately, you will need to test all the data sets.  Since you are expecting no such problem, it is possible the problem lies with missing values:  multiple observations where USUBJID has a missing value. 

PrinceAde
Obsidian | Level 7

Hello sir. 

I'm sorry for replying so late.

DM1 contains 742 observations. When I applied the recommendation above, dm1_multiples gave me 0 obs.

The remaining datasets, sv2_multiples, vs1x_multiples, ex3_multiples, and ds1x_multiples all give 742 obs, which are all unique. This appears to have removed duplicate obs, just like nodupkeys.

I expect to get one obs per record in my final dataset, which is 742 observations.

What do you advise, sir?

Tom
Super User Tom
Super User

Which observation do you want when there are multiples?

The first one? 

data first;
  merge DM1 sv2  vs1x ex3 ds1x ;
  by usubjid;
  if first.usubjid ;
run;

or the last one?

data first;
  merge DM1 sv2  vs1x ex3 ds1x ;
  by usubjid;
  if last.usubjid ;
run;
ballardw
Super User

You may want to run this small example of merging sets with repeats of BY values in two (or more) sets to see what may happen with the values of other variables:

 

data ex1;
   input a b;
datalines;
1 1
1 2
1 3
1 4
;

data ex2;
   input a c;
datalines;
1 11
1 22
;

data merged;
   merge ex1 ex2;
   by a;
run;

The "resolution" depends on what you need for a result when multiple sets have multiple values of the BY variables. So you need describe what the resulting data is supposed to have in that situation.

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 8 replies
  • 906 views
  • 0 likes
  • 5 in conversation