BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
yashpande
Obsidian | Level 7

Hi,

Is there a way to find out the observation number of duplicate record . I tried with _n_ but somehow am unable to find the exact row number.

Need some help in that

data source;

input a b c d;

cards;

1 2 3 4

5 6 7 8

9 10 11 12

1 2 3 4

9 10 11 12

5 6 7 8

;

run;

I want target as

abcdDup_row_num
12344
56786
91011125

I tried following , I know I need to combine these two queries but somehow is unable to find it

proc sql;

create table t1 as

   select monotonic() as row_number,a,b,c,d

   from source

;quit;

proc sql;

select max(row_number)  from

(

select count(*) as cnt, a,b,c,d, row_number from t1 group by a,b,c,d

)

;

quit;

Thanks in advance

1 ACCEPTED SOLUTION

Accepted Solutions
scdent
Obsidian | Level 7

Sorry !

data T1;

  set Source;

  row_number=_n_;

  run;

proc sort data=T1;

  by a b c d row_number;

  run;

data Result(rename=(row_number=dup_row_number));

  set T1;

  by a b c d row_number;

  if first.d=0 & last.d=1;

  run;

View solution in original post

4 REPLIES 4
scdent
Obsidian | Level 7

*Method 1;

proc sort data=Source out=A1;

  by a b c d;

  run;

data Result(rename=(Cnt=Dup_row_num));

  set A1;

  by a b c d;

  retain Cnt;

  if first.d then Cnt=0;

  Cnt+1;

  if last.d & 0<Cnt then output;

  run;

*Method 2;

proc sql;

  create table Result(where=(1<Dup_row_num))

  as select a, b, c, d, count(*) as Dup_row_num from Source group by a,b,c,d;

  quit;

yashpande
Obsidian | Level 7

No scdent, I want the obswervation number of duplicate record. If you look at my source. First observation is repeated at row number 4 thats what I need to put in variable Dup_row_num

scdent
Obsidian | Level 7

Sorry !

data T1;

  set Source;

  row_number=_n_;

  run;

proc sort data=T1;

  by a b c d row_number;

  run;

data Result(rename=(row_number=dup_row_number));

  set T1;

  by a b c d row_number;

  if first.d=0 & last.d=1;

  run;

Tom
Super User Tom
Super User

You need to get the grouping variables in your selected list of variables.

data t1;

  row_number+1;

  input a b c d;

cards;

1 2 3 4

5 6 7 8

9 10 11 12

1 2 3 4

9 10 11 12

5 6 7 8

;

run;

proc sql;

select a,b,c,d

     , min(row_number) as first_row

     , max(row_number) as last_dup_row

  from t1

  group by a,b,c,d

  having count(*) > 1

  order by a,b,c,d

;

quit;

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 4 replies
  • 943 views
  • 0 likes
  • 3 in conversation