Hi:
Please don't overthink the question. The NOTE in the log about duplicate BY values is OK. You don't have to fix the data or adjust your code. Here's what I mean: in my own code, I did NOT use the RESULTS library. I just wrote to the WORK library so I could make sure I got the right answers before saving the program.
All you have to do is write your program and review the results. When I run the program, the correct answers are in the SAS log. Remember that if you drop ALL the variables that start with EX when you create the NOMATCH file, then you'll drop ALL the variables from input08a that start with EX, PLUS all the variables from input08b that start with EX. But you'll keep them in the MATCH file that comes out of the MERGE.
Here's what I mean:
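This is only a minimal sketch, not the exact program from the question. It assumes the two input files live in a library called CERT and writes test output to WORK.MATCH and WORK.NOMATCH. The CERT libref, the sorted copies, and the output names are my assumptions for illustration, so substitute whatever you actually use:

```sas
/* Assumption: input08a and input08b are in a library named CERT.   */
/* Sort copies into WORK so the BY statement in the MERGE is valid. */
proc sort data=cert.input08a out=work.a_sorted;
   by id;
run;

proc sort data=cert.input08b out=work.b_sorted;
   by id;
run;

data work.match
     work.nomatch (drop=ex:);   /* EX variables dropped from NOMATCH only */
   merge work.a_sorted (in=ina)
         work.b_sorted (in=inb);
   by id;
   if ina and inb then output work.match;   /* ID found in both files    */
   else output work.nomatch;                /* ID found in only one file */
run;
```

Because the DROP= option sits only on WORK.NOMATCH, the EX variables from both inputs still come through the MERGE and stay in WORK.MATCH.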
The number of observations or rows is also easy to figure out. There are 1200 rows or obs in input08a and 1202 rows in input08b. There are 2 rows where ID = 401 that are in input08b but not in input08a. Are there duplicates of the BY values? Yes, there are. But that scenario only resulted in a NOTE. And while it is a note that can adversely impact your output, in this case the duplicates match in both files until you get to ID=401. So the rows for 401 are the only non-matches in this scenario, which is why you have 1200 obs in the match file and only 2 obs in the nomatch file.
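If you'd rather confirm those counts with a query than read them off the log, something like the following would work. It is a sketch that assumes your DATA step wrote WORK.MATCH and WORK.NOMATCH and that the inputs are in a library named CERT (both names are assumptions; adjust them to your setup):

```sas
/* Row counts for the two output files (data set names are assumed). */
proc sql;
   select count(*) as match_rows   from work.match;
   select count(*) as nomatch_rows from work.nomatch;
quit;

/* Distinct ID values that appear in input08b but not in input08a. */
proc sql;
   select id from cert.input08b
   except
   select id from cert.input08a;
quit;
```

Going by the counts described above, the second query should come back with only ID 401.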
I hope this helps to explain this question a bit more.
Cynthia