DATA Step, Macro, Functions and more

Observation number of duplicate record

Accepted Solution Solved
Reply
Contributor
Posts: 46
Accepted Solution

Observation number of duplicate record

Hi,

Is there a way to find out the observation number of duplicate record . I tried with _n_ but somehow am unable to find the exact row number.

Need some help in that

data source;

input a b c d;

cards;

1 2 3 4

5 6 7 8

9 10 11 12

1 2 3 4

9 10 11 12

5 6 7 8

;

run;

I want target as

abcdDup_row_num
12344
56786
91011125

I tried following , I know I need to combine these two queries but somehow is unable to find it

proc sql;

create table t1 as

   select monotonic() as row_number,a,b,c,d

   from source

;quit;

proc sql;

select max(row_number)  from

(

select count(*) as cnt, a,b,c,d, row_number from t1 group by a,b,c,d

)

;

quit;

Thanks in advance


Accepted Solutions
Solution
‎07-18-2014 12:05 AM
New Contributor
Posts: 4

Re: Observation number of duplicate record

Posted in reply to yashpande

Sorry !

data T1;

  set Source;

  row_number=_n_;

  run;

proc sort data=T1;

  by a b c d row_number;

  run;

data Result(rename=(row_number=dup_row_number));

  set T1;

  by a b c d row_number;

  if first.d=0 & last.d=1;

  run;

View solution in original post


All Replies
New Contributor
Posts: 4

Re: Observation number of duplicate record

Posted in reply to yashpande

*Method 1;

proc sort data=Source out=A1;

  by a b c d;

  run;

data Result(rename=(Cnt=Dup_row_num));

  set A1;

  by a b c d;

  retain Cnt;

  if first.d then Cnt=0;

  Cnt+1;

  if last.d & 0<Cnt then output;

  run;

*Method 2;

proc sql;

  create table Result(where=(1<Dup_row_num))

  as select a, b, c, d, count(*) as Dup_row_num from Source group by a,b,c,d;

  quit;

Contributor
Posts: 46

Re: Observation number of duplicate record

No scdent, I want the obswervation number of duplicate record. If you look at my source. First observation is repeated at row number 4 thats what I need to put in variable Dup_row_num

Solution
‎07-18-2014 12:05 AM
New Contributor
Posts: 4

Re: Observation number of duplicate record

Posted in reply to yashpande

Sorry !

data T1;

  set Source;

  row_number=_n_;

  run;

proc sort data=T1;

  by a b c d row_number;

  run;

data Result(rename=(row_number=dup_row_number));

  set T1;

  by a b c d row_number;

  if first.d=0 & last.d=1;

  run;

Super User
Super User
Posts: 7,079

Re: Observation number of duplicate record

Posted in reply to yashpande

You need to get the grouping variables in your selected list of variables.

data t1;

  row_number+1;

  input a b c d;

cards;

1 2 3 4

5 6 7 8

9 10 11 12

1 2 3 4

9 10 11 12

5 6 7 8

;

run;

proc sql;

select a,b,c,d

     , min(row_number) as first_row

     , max(row_number) as last_dup_row

  from t1

  group by a,b,c,d

  having count(*) > 1

  order by a,b,c,d

;

quit;

🔒 This topic is solved and locked.

Need further help from the community? Please ask a new question.

Discussion stats
  • 4 replies
  • 216 views
  • 0 likes
  • 3 in conversation