Code:
PROC SORT DATA=ND NODUPS;
BY ITEM;
RUN;
PROC SORT DATA=DUPS;
BY ITEM;
RUN;
DATA COMBINE;
MERGE DUPS(IN=OK1) ND(IN=OK2);
BY ITEM;
IF OK1;
RUN;
The 'ND' dataset should not have any duplicate records (the log shows several were deleted). So why does the log show: "Note: MERGE statement has more than one data set with repeats of BY values."?
Also, keep in mind that NODUPS only deletes a duplicate observation when it immediately follows its twin in the sorted output. Depending on your input file, you may need a longer BY variable list to make sure duplicate observations sort next to each other; otherwise they will not be deleted.
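To illustrate the point about adjacency, here is a minimal sketch of two ways the sort could be written. ND and ITEM come from the post above; the output names ND_ALL and ND_KEY are hypothetical and only there so the original data set is not overwritten. The first step sorts on every variable so that fully identical observations are guaranteed to be adjacent. The second step assumes what you really want is one observation per ITEM: NODUPS compares whole observations, so ND can still hold several rows with the same ITEM if they differ in any other variable, and that is enough to trigger the MERGE note when DUPS also has repeats.

```sas
/* Sketch only: ND_ALL and ND_KEY are hypothetical output data set names. */

/* Option 1: remove fully identical observations.                     */
/* BY _ALL_ sorts on every variable, so identical rows become         */
/* adjacent and NODUPRECS (the option abbreviated as NODUPS above)    */
/* can drop them.                                                     */
PROC SORT DATA=ND OUT=ND_ALL NODUPRECS;
  BY _ALL_;
RUN;

/* Option 2: keep one observation per ITEM.                           */
/* NODUPKEY compares only the BY values, so the result has no         */
/* repeats of ITEM, whatever the other variables contain.             */
PROC SORT DATA=ND OUT=ND_KEY NODUPKEY;
  BY ITEM;
RUN;
```

If Option 2 matches your intent, merging DUPS with ND_KEY by ITEM should no longer produce the note, since only DUPS would have repeats of the BY value. Note that NODUPKEY keeps the first observation it meets for each ITEM, so check that the rows it discards are really ones you can lose.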