zetter
Calcite | Level 5

Hi, I've got a table like this:

ID     fruit        colour
555    avocado      orange
111    orange       orange
111    cabbage      green
123    mango        green
333    strawberry   red
333    strawberry   red
555    berry        orange

I want SAS to look at all of the IDs above and flag each row whose ID has already appeared in an earlier row, like this:

ID     fruit        colour    duplicate?
555    avocado      Orange    No
111    orange       Orange    No
111    cabbage      green     Yes
123    mango        green     No
333    strawberry   red       No
333    strawberry   red       Yes
555    berry        orange    Yes

  Thanks.

Accepted Solutions
Hima
Obsidian | Level 7

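/* Read the sample data; :$10. keeps "strawberry" from being cut to 8 characters */
DATA HAVE;
INPUT ID fruit :$10. colour :$10.;
DATALINES;
555    avocado      orange
111    orange       orange
111    cabbage      green
123    mango        green
333    strawberry   red
333    strawberry   red
555    berry        orange
;
RUN;

/* BY-group processing requires the data to be sorted by ID */
PROC SORT DATA=HAVE;
BY ID;
RUN;

/* Every row after the first one in an ID group is a duplicate */
DATA WANT;
SET HAVE;
LENGTH DUPLICATE $3;
BY ID;
IF NOT FIRST.ID THEN DUPLICATE='Yes';
ELSE DUPLICATE='No';
RUN;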


5 REPLIES
stat_sas
Ammonite | Level 13

proc sort data=have;
by id;
run;

data want;
set have;
length duplicate $5.;
by id;
/* the first row of each ID group is the original; later rows repeat that ID */
duplicate='No';
if not first.id then duplicate='Yes';
run;

Haikuo
Onyx | Level 15

If you want to maintain the original data order:

data have;
     input ID (fruit colour) (:$10.);
     cards;
555    avocado      orange
111    orange       orange
111    cabbage      green
123    mango        green
333    strawberry   red
333    strawberry   red
555    berry        orange
;

data want;
     /* create the hash object once; it holds the IDs seen so far */
     if _n_=1 then
           do;
                dcl hash h();
                h.definekey('id');
                h.definedone();
           end;
     set have;
     length dup $3;
     /* check() returns 0 when the current ID is already in the hash */
     if h.check()=0 then
           dup='Yes';
     else
           do;
                rc=h.add();
                dup='No';
           end;
     drop rc;
run;

Haikuo


Tom
Super User

You are basically asking for NOT FIRST.ID.

data have ;
  input id fruit :$10. color :$10. ;
cards;
555 avocado orange
111 orange orange
111 cabbage green
123 mango green
333 strawberry red
333 strawberry red
555 berry orange
;
run;

proc sort; by id; run;

data want ;
  set have ;
  by id ;
  if first.id then dup='NO ';
  else dup='YES';
  put (_all_) (:);
run;

111 orange orange NO
111 cabbage green YES
123 mango green NO
333 strawberry red NO
333 strawberry red YES
555 avocado orange NO
555 berry orange YES


amats
Calcite | Level 5

Another way.

proc sort data=HAVE out=WANT1 dupout=WANT2 nodupkey;
  by ID;
run;

data WANT;
  set WANT1 WANT2 (in=_IN);
  by ID;
  if _IN then DUP = "Yes";
  else DUP = "No";
run;
