Ksharp
Super User
🙂
Sure. I have a boss, since I am a workman.
Tom
Super User

@BrahmanandaRao wrote:
data test;
input Empname $ ;
datalines;
ram
sita
ram
arjun
ram
sita
;
run;

An interviewer asked me how to remove duplicates without sorting.

Using the dataset above, he said not to change the order of the empnames, just remove the duplicates, using only a DATA step method.

 


In general a HASH (or some other method of remembering what values you have seen before) will do this.

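data want;
  if _n_=1 then do;
    /* Hash object that remembers the empname values seen so far */
    declare hash h();
    h.definekey('empname');
    h.definedone();
  end;
  set test;
  /* find() returns 0 when the key is already stored, nonzero when it is new */
  if h.find() then do;
    output;   /* first occurrence: keep the row */
    h.add();  /* remember this value */
  end;
run;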
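
As one example of "some other method of remembering", here is a minimal sketch that uses a temporary array instead of a hash. The dataset name want_array and the limit of 1000 distinct names are assumptions made for illustration only; the $8 length matches the default length that the INPUT statement in the question gives Empname.

```sas
data want_array;
  /* Assumption: at most 1000 distinct names, each up to 8 characters.
     Temporary arrays are retained across observations automatically. */
  array seen[1000] $8 _temporary_;
  set test;
  found = 0;
  /* Linear search through the names stored so far */
  do i = 1 to n_seen while (not found);
    if seen[i] = empname then found = 1;
  end;
  if not found then do;
    n_seen + 1;              /* sum statement: n_seen starts at 0 and is retained */
    seen[n_seen] = empname;  /* remember the new name */
    output;                  /* keep only the first occurrence */
  end;
  drop found i n_seen;
run;
```

With the six rows in the question this should keep ram, sita and arjun, in that order. The search is linear, so it only makes sense when the number of distinct values is small, and the step will stop with an array subscript error if there are more distinct names than the array can hold.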

But if the data has too many distinct values to fit in memory, then a HASH will not work (a hash object has to be held in memory), and neither will any other DATA step only method. In that case sorting is probably your best option, either directly using PROC SORT or implicitly using PROC SQL code. Just add a new variable to record the original order so it can be recreated.

data temp;
  row+1;
  set test;
run;
proc sql ;
create table want as
  select empname
  from temp
  group by empname
  having row=min(row)
  order by row
;
quit;
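
For the "directly using PROC SORT" route, a sketch along the same lines might look like the following. The dataset names numbered, first_seen and want_sorted are made up for this example; the idea of adding a row variable is the same as above.

```sas
/* Record the original order */
data numbered;
  row + 1;
  set test;
run;

/* Group equal names together, earliest row first within each name */
proc sort data=numbered out=first_seen;
  by empname row;
run;

/* Keep only the first (earliest) row of each name */
data first_seen;
  set first_seen;
  by empname;
  if first.empname;
run;

/* Restore the original order and drop the helper variable */
proc sort data=first_seen out=want_sorted(drop=row);
  by row;
run;
```

This does sort the data, so it does not meet the interviewer's "without sorting" condition; it is only the fallback for when the distinct values will not fit in memory.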

 
