bharath86
Obsidian | Level 7

Hi,

How can I get rid of the 'MERGE statement' note in the log? I understand there are duplicate BY values in the data, but is there an alternative way of doing this?

Please advise.

Thank you

845        data wan;
846        merge XM_Pdms_1 (drop=XMDY) XM_Pdms_1 (where=(XMDY ne .));
847        by USUBJID VISITNUM ;
848        run;

NOTE: MERGE statement has more than one data set with repeats of BY values.
NOTE: There were 19662 observations read from the data set WORK.XM_PDMS_1.
NOTE: There were 16950 observations read from the data set WORK.XM_PDMS_1.
      WHERE XMDY not = .;
NOTE: The data set WORK.WAN has 19662 observations and 25 variables.
NOTE: DATA statement used (Total process time):
      real time           0.24 seconds
      cpu time            0.20 seconds
      
ballardw
Super User

Before asking about suppressing a note in the log, did you look at the resulting data? Is it correct, and is it what you expect?

Almost 100% of the time that note means that the result is likely not what you want.

So the question is what should the result look like?

 

I suggest making small examples of the two data sets, with some records that duplicate the BY variables and some that do not, plus the result you intend, and sharing all three data sets. Otherwise we do not know what your intent is. The example data sets should include only one or two other variables, but the values of those variables need to be of sufficient variety to see how they are treated.
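As a rough sketch of what such a small example could look like: the data set names (left, right, merged), the variable A, and all values below are made up for illustration. Only USUBJID, VISITNUM and XMDY come from the original post.

```sas
/* Hypothetical toy data: three rows for S1/visit 1 on the left ... */
data left;
  input USUBJID $ VISITNUM A $;
  datalines;
S1 1 a1
S1 1 a2
S1 1 a3
S2 1 a4
;
run;

/* ... and two rows for the same BY group on the right (m:n case) */
data right;
  input USUBJID $ VISITNUM XMDY;
  datalines;
S1 1 10
S1 1 20
S2 1 30
;
run;

/* Both inputs repeat the BY values for S1/1, so this step writes the
   "more than one data set with repeats of BY values" note to the log */
data merged;
  merge left right;
  by USUBJID VISITNUM;
run;

proc print data=merged;
run;
```

Within a BY group, MERGE pairs observations by position and carries the last values of the shorter group forward, so the third S1 row is paired with the second XMDY value instead of forming every combination. Looking at output like this against the intended result usually shows quickly whether MERGE is doing what you want.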

The solution will likely involve Proc SQL and a Join. But there are several different joins and which to use depends on what the expected result looks like.

Kurt_Bremser
Super User

A data step MERGE is not the correct tool for handling m:n relationships; it is good for 1:1, 1:n and n:1.

Depending on your intentions, you should either

  • use SQL and build a cartesian join
  • deduplicate one or both datasets before doing the MERGE
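The two options above could be sketched roughly as follows, using the data set and variables from the question. The output names wan_sql and xmdy_first are hypothetical, and which XMDY value should survive deduplication is an assumption that only the original poster can confirm.

```sas
/* Option 1: SQL join. Within each USUBJID/VISITNUM group, every row of a
   is paired with every matching row of b, so the result can have more
   rows than the input. */
proc sql;
  create table wan_sql as
  select a.*, b.XMDY
  from XM_Pdms_1 (drop=XMDY) as a
       left join
       XM_Pdms_1 (where=(XMDY ne .)) as b
    on a.USUBJID = b.USUBJID
   and a.VISITNUM = b.VISITNUM;
quit;

/* Option 2: deduplicate the lookup side first, then MERGE.
   NODUPKEY keeps one row per BY group; assumption: any single
   non-missing XMDY per subject and visit is acceptable. */
proc sort data=XM_Pdms_1 (where=(XMDY ne .))
          out=xmdy_first (keep=USUBJID VISITNUM XMDY)
          nodupkey;
  by USUBJID VISITNUM;
run;

/* Assumes XM_Pdms_1 is already sorted by USUBJID VISITNUM,
   as the original MERGE requires */
data wan;
  merge XM_Pdms_1 (drop=XMDY) xmdy_first;
  by USUBJID VISITNUM;
run;
```

With the second option the MERGE becomes n:1, so the note about repeats of BY values should no longer appear. Whether the join or the deduplication is the right choice depends on what the result is meant to look like.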
