Hello!
I would like to filter the duplicate records of data set A; that is, I would like to write out only the observations with Nr >= 2. Using If (EoF) & (Nr ge 2) Then H.Output(Dataset:"Duplicates"); doesn't work.
The following program works, but it also keeps the IDs that occur only once:
Data A;
  Do i=1 To 30;
    ID=Byte(Int(RanUni(1)*26)+65);
    Output;
  End;
Run;

Data _NULL_;
  Length Nr 3.;
  If _N_ eq 1 Then Do;
    Declare Hash H();
    H.DefineKey("ID");
    H.DefineData("ID", "i", "Nr");
    H.DefineDone();
  End;
  Set A End=EoF;
  If H.Find() ne 0 Then Do;
    Nr=1;
    H.Add();
  End;
  Else Do;
    Nr+1;
    H.Replace();
  End;
  If EoF Then H.Output(Dataset:"Duplicates");
Run;
My 2nd question is, how can I find duplicate records (not count them) of dataset "A" using a hash object?
Thanks & kind regards
If I understood correctly what you mean:
Data A;
  Do i=1 To 30;
    ID=Byte(Int(RanUni(1)*26)+65);
    Output;
  End;
Run;

data _null_;
  if _n_ eq 1 then do;
    if 0 then set a;
    declare hash h();
    h.definekey('id');
    h.definedata('id','n');
    h.definedone();
  end;
  set a end=last;
  if h.find()=0 then do; n+1; h.replace(); end;
  else do; n=1; h.replace(); end;
  if last then do;
    h.output(dataset:'singual(where=(n=1))');
    h.output(dataset:'duplicate(where=(n gt 1))');
  end;
run;
Xia Keshan
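A note on why the condition in the question has no effect: H.Output always writes every item in the hash table, and a test such as (Nr ge 2) placed in front of it only looks at the value of Nr for the last observation read. Besides the WHERE= data set option, one possible alternative is to walk the table with a hash iterator and output only the items you want. The following is only a sketch under the question's own setup (data set A, variables ID, i, Nr); the names It and rc are introduced here for illustration.

```sas
/* Sketch: same counting logic as in the question, but the items are
   written one by one through a hash iterator instead of H.Output. */
Data Duplicates(Keep=ID i Nr);
  Length Nr 3;
  If _N_ eq 1 Then Do;
    Declare Hash H();
    H.DefineKey("ID");
    H.DefineData("ID", "i", "Nr");
    H.DefineDone();
    Declare HIter It("H");        /* It is an illustrative name */
  End;
  Set A End=EoF;
  If H.Find() ne 0 Then Do;       /* first time this ID is seen */
    Nr=1;
    H.Add();
  End;
  Else Do;                        /* ID already stored: raise its count */
    Nr+1;
    H.Replace();
  End;
  If EoF Then Do;
    rc=It.First();                /* rc is an illustrative name */
    Do While (rc eq 0);
      If Nr ge 2 Then Output;     /* keep only IDs that occur more than once */
      rc=It.Next();
    End;
  End;
Run;
```

Because the only Output statement sits inside the EoF block, nothing is written while A is being read; each retained row carries the ID, the i of its first occurrence, and its count.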
Yes, that's exactly what I meant. I didn't think to put a WHERE= option after the data set name. Many thanks!
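On the second question (finding the duplicate records rather than counting them), one possible approach is to store only the keys in the hash and test each incoming record with the Check method. This is a minimal sketch, assuming data set A from the question; the hash name Seen is made up for illustration.

```sas
/* Sketch: write every repeated occurrence of an ID to Duplicates.
   No counter is kept; the hash only remembers which IDs were seen. */
Data Duplicates;
  If _N_ eq 1 Then Do;
    Declare Hash Seen();             /* Seen is an illustrative name */
    Seen.DefineKey("ID");
    Seen.DefineDone();
  End;
  Set A;
  If Seen.Check() eq 0 Then Output;  /* ID already in the table: duplicate record */
  Else Seen.Add();                   /* first occurrence: remember the ID */
Run;
```

This writes the second and later occurrences of each ID with all their variables, and leaves the first occurrence out. If the first occurrence should be included as well, a second pass over A (or the counting approach above) would be needed.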