Repeat of BY values -- why?

New Contributor
Posts: 4

Code:
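/* Sort ND and drop observations that duplicate the previous record */
PROC SORT DATA=ND NODUPS;
BY ITEM;
RUN;
/* Sort DUPS by the same key */
PROC SORT DATA=DUPS;
BY ITEM;
RUN;
/* Match-merge on ITEM, keeping only observations that come from DUPS */
DATA COMBINE;
MERGE DUPS(IN=OK1) ND(IN=OK2);
BY ITEM;
IF OK1;
RUN;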

The ND data set should not have any duplicate records (the log shows several were deleted). So why does the log show: "NOTE: MERGE statement has more than one data set with repeats of BY values."?
Super Contributor
Posts: 359

Re: Repeat of BY values -- why?

NODUPS removes an observation only when the entire record is a duplicate. NODUPKEY would rid you of duplicate BY values.
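
A minimal sketch of that change, reusing the data set and variable names from the original post. The output name ND_UNIQUE is hypothetical; writing to a separate data set with OUT= simply leaves ND untouched.

```sas
/* Keep one observation per ITEM value. ND_UNIQUE is a hypothetical name. */
PROC SORT DATA=ND OUT=ND_UNIQUE NODUPKEY;
  BY ITEM;
RUN;

PROC SORT DATA=DUPS;
  BY ITEM;
RUN;

/* Only DUPS can repeat ITEM values now, so at most one input has repeats */
DATA COMBINE;
  MERGE DUPS(IN=OK1) ND_UNIQUE(IN=OK2);
  BY ITEM;
  IF OK1;
RUN;
```

Keep in mind that NODUPKEY retains just one observation per BY group, so any other variables on the discarded observations are lost.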
New Contributor
Posts: 4

Re: Repeat of BY values -- why?

DUH - I should know that! Thanks, you have turned this Friday the 13th into my lucky day...
Super Contributor
Posts: 3,174

Re: Repeat of BY values -- why?

Also, consider that in some instances (depending on your input file) your BY variable list must be sufficient to ensure that duplicate observations sort next to each other; otherwise NODUPS will not delete them.
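
One way to sketch this, assuming the goal is to drop whole-record duplicates from ND: sort on every variable so that identical observations are guaranteed to be adjacent. The output name ND_NODUPREC is hypothetical.

```sas
/* BY _ALL_ sorts on every variable, so identical records end up adjacent */
/* and NODUPRECS (the long form of NODUPS) can remove them.               */
PROC SORT DATA=ND OUT=ND_NODUPREC NODUPRECS;
  BY _ALL_;
RUN;

/* Re-sort by the merge key before using the result in a MERGE ... BY ITEM */
PROC SORT DATA=ND_NODUPREC;
  BY ITEM;
RUN;
```

This removes only exact duplicate records; ITEM values can still repeat afterwards if the other variables differ.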

Scott Barry
SBBWorks, Inc.
Valued Guide
Posts: 2,177

Re: Repeat of BY values -- why?

Check the length of the column/variable ITEM in each of your data sets.
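
A rough sketch of how to check, and one common way to make the lengths agree. It assumes ITEM is a character variable, and the length of 20 is a made-up value for illustration only.

```sas
/* Compare the Type and Len columns reported for ITEM in each listing */
PROC CONTENTS DATA=DUPS;
RUN;

PROC CONTENTS DATA=ND;
RUN;

/* Assumption: ITEM is character. $ 20 is a placeholder; use the longest */
/* length reported above. Declaring it before MERGE sets one length.     */
DATA COMBINE;
  LENGTH ITEM $ 20;
  MERGE DUPS(IN=OK1) ND(IN=OK2);
  BY ITEM;
  IF OK1;
RUN;
```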

2541 - Multiple lengths were specified for the BY variable xxxx by input data sets
http://support.sas.com/kb/2/541.html
SUGI 28: Danger: MERGE Ahead! Warning: BY Variable with Multiple Lengths!
http://www2.sas.com/proceedings/sugi28/098-28.pdf