BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
yashpande
Obsidian | Level 7

Hi,

Is there a way to find out the observation number of duplicate record . I tried with _n_ but somehow am unable to find the exact row number.

Need some help in that

data source;

input a b c d;

cards;

1 2 3 4

5 6 7 8

9 10 11 12

1 2 3 4

9 10 11 12

5 6 7 8

;

run;

I want target as

abcdDup_row_num
12344
56786
91011125

I tried following , I know I need to combine these two queries but somehow is unable to find it

proc sql;

create table t1 as

   select monotonic() as row_number,a,b,c,d

   from source

;quit;

proc sql;

select max(row_number)  from

(

select count(*) as cnt, a,b,c,d, row_number from t1 group by a,b,c,d

)

;

quit;

Thanks in advance

1 ACCEPTED SOLUTION

Accepted Solutions
scdent
Obsidian | Level 7

Sorry !

data T1;

  set Source;

  row_number=_n_;

  run;

proc sort data=T1;

  by a b c d row_number;

  run;

data Result(rename=(row_number=dup_row_number));

  set T1;

  by a b c d row_number;

  if first.d=0 & last.d=1;

  run;

View solution in original post

4 REPLIES 4
scdent
Obsidian | Level 7

*Method 1;

proc sort data=Source out=A1;

  by a b c d;

  run;

data Result(rename=(Cnt=Dup_row_num));

  set A1;

  by a b c d;

  retain Cnt;

  if first.d then Cnt=0;

  Cnt+1;

  if last.d & 0<Cnt then output;

  run;

*Method 2;

proc sql;

  create table Result(where=(1<Dup_row_num))

  as select a, b, c, d, count(*) as Dup_row_num from Source group by a,b,c,d;

  quit;

yashpande
Obsidian | Level 7

No scdent, I want the obswervation number of duplicate record. If you look at my source. First observation is repeated at row number 4 thats what I need to put in variable Dup_row_num

scdent
Obsidian | Level 7

Sorry !

data T1;

  set Source;

  row_number=_n_;

  run;

proc sort data=T1;

  by a b c d row_number;

  run;

data Result(rename=(row_number=dup_row_number));

  set T1;

  by a b c d row_number;

  if first.d=0 & last.d=1;

  run;

Tom
Super User Tom
Super User

You need to get the grouping variables in your selected list of variables.

data t1;

  row_number+1;

  input a b c d;

cards;

1 2 3 4

5 6 7 8

9 10 11 12

1 2 3 4

9 10 11 12

5 6 7 8

;

run;

proc sql;

select a,b,c,d

     , min(row_number) as first_row

     , max(row_number) as last_dup_row

  from t1

  group by a,b,c,d

  having count(*) > 1

  order by a,b,c,d

;

quit;

SAS Innovate 2025: Register Now

Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 4 replies
  • 1261 views
  • 0 likes
  • 3 in conversation