user24feb
Barite | Level 11


Hello!

I would like to filter out the duplicate records of data set A; that is, I would like to write only the observations with Nr >= 2 to a separate data set. However, If (EoF) & (Nr ge 2) Then H.Output(Dataset:"Duplicates"); doesn't work.

The following program works, but it also keeps the IDs that occur only once:

Data A;
  Do i=1 To 30;
    ID=Byte(Int(RanUni(1)*26)+65);
    Output;
  End;
Run;

Data _NULL_;
  Length Nr 3.;
  If _N_ eq 1 Then Do;
    Declare Hash H();
    H.DefineKey("ID");
    H.DefineData("ID", "i", "Nr");
    H.DefineDone();
  End;
  Set A End=EoF;
  If H.Find() ne 0 Then Do;
    Nr=1;
    H.Add();
  End;
  Else Do;
    Nr+1;
    H.Replace();
  End;
  If EoF Then H.Output(Dataset:"Duplicates");
Run;

My 2nd question is, how can I find duplicate records (not count them) of dataset "A" using a hash object?

Thanks & kind regards

Accepted Solutions
Ksharp
Super User

If I understood correctly what you mean:



 
Data A;
  Do i=1 To 30;
    ID=Byte(Int(RanUni(1)*26)+65);
    Output;
  End;
Run;
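data _null_;
  if _n_ eq 1 then do;
    if 0 then set a;            /* define the variables of A in the PDV at compile time */
    declare hash h();
    h.definekey('id');
    h.definedata('id','n');
    h.definedone();
  end;
  set a end=last;
  /* find() loads the stored count n for this ID; if the ID is new, start at 1 */
  if h.find()=0 then do; n+1; h.replace(); end;
  else do; n=1; h.replace(); end;
  if last then do;
    /* the WHERE= data set option filters the hash items as they are written */
    h.output(dataset:'singual(where=(n=1))');
    h.output(dataset:'duplicate(where=(n gt 1))');
  end;
run;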
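
For the second question (finding the duplicate records themselves rather than counting them), one possible sketch is below. It assumes that "duplicate record" means every observation whose ID has already appeared earlier in A; the names duplicate_rows and seen are made up for illustration. The hash object only remembers which IDs have been seen, and check() tests for the key without changing any variables.

```sas
data duplicate_rows;                 /* hypothetical output data set name */
  if _n_ eq 1 then do;
    declare hash seen();             /* hypothetical hash name */
    seen.definekey('ID');            /* key only; no counter is stored */
    seen.definedone();
  end;
  set A;
  if seen.check() = 0 then output;   /* ID seen before: this row is a repeat */
  else seen.add();                   /* first occurrence: remember the ID */
run;
```

With this approach the first occurrence of each ID is not written, only the second and later ones, and each output row keeps its original value of i.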

Xia Keshan


user24feb
Barite | Level 11

Yes, that's exactly what I meant. I didn't think to put a WHERE= option after the data set name. Many thanks!
