Help using Base SAS procedures

Removing row duplicates with value in different variable

Contributor
Posts: 30
Hi all,

I have the dataset below:

data x;
input source $3. fare  destination $3.;
datalines;
mum 500 del
del 500 mum
kol 600 che
che 600 kol
;
run;

I want only one observation per source-destination pair, i.e. one observation covering both directions of a journey. For example, of mum to del and del to mum I need only one row, and likewise for the other pairs.

The output should look like this:

mum 500 del

kol 600 che

Thanks in advance for the help.


Accepted Solutions
Solution
‎12-15-2014 08:46 AM
Super User
Posts: 7,988

Re: Removing row duplicates with value in different variable

Posted in reply to naveen20jan

How about:


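data want;
  set x;
  length forwards backwards $200;
  /* Build the route key in both directions, e.g. "mum,del" and "del,mum". */
  forwards=catx(',',source,destination);
  backwards=catx(',',destination,source);
  /* Call LAG unconditionally so both queues are updated on every row. */
  prev_forwards=lag(forwards);
  prev_backwards=lag(backwards);
  /* Delete a row that repeats or reverses the route on the previous row. */
  if forwards=prev_forwards or forwards=prev_backwards then delete;
  drop forwards backwards prev_forwards prev_backwards;
run;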
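
The LAG comparison above only looks at the immediately preceding row, so it relies on the two directions of a journey sitting next to each other, as they do in the sample data. If the pairs can be scattered through the table, one possible alternative is a hash-object lookup. The following is a minimal sketch under that assumption, not a tested production step; the names key and seen are made up for illustration, and the key length of 7 assumes the 3-character codes from the question.

```sas
data want;
  set x;
  length key $7;                 /* 3 + comma + 3 characters (assumed code length) */
  if _n_ = 1 then do;
    declare hash seen();         /* hypothetical name: the routes already kept */
    seen.defineKey('key');
    seen.defineDone();
  end;
  /* Order-independent key: the alphabetically smaller endpoint goes first. */
  if source <= destination then key = catx(',', source, destination);
  else key = catx(',', destination, source);
  /* CHECK returns 0 when the key is already stored: drop the repeat. */
  if seen.check() = 0 then delete;
  seen.add();                    /* first time this route is seen: remember it */
  drop key;
run;
```

With the four sample rows this keeps mum 500 del and kol 600 che, in their original order, and needs no sorting.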



All Replies

Contributor
Posts: 30

Re: Removing row duplicates with value in different variable

Thanks RW9

Super User
Posts: 10,044

Re: Removing row duplicates with value in different variable

Posted in reply to naveen20jan

Is there any order you need to consider?

data x;
input source $3. fare  destination $3.;
datalines;
mum 500 del
del 500 mum
kol 600 che
che 600 kol
;
run;
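data x;
 set x;
 /* Copy the endpoints, then sort each pair alphabetically so that
    mum->del and del->mum share the same key (s=del, d=mum). */
 s=source;
 d=destination;
 call sortc(s,d);
run;
/* Keep one row for each sorted pair. */
proc sort data=x out=want nodupkey;
 by s d;
run;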

Xia Keshan

Contributor
Posts: 30

Re: Removing row duplicates with value in different variable

Hi Xia,

Thanks for the help. That's fine, we don't need any particular order.

Thanks

Respected Advisor
Posts: 3,156

Re: Removing row duplicates with value in different variable

Posted in reply to naveen20jan

If your incoming data:

1. already contain duplicates, and

2. do not have the rows for the same "from-to" pair clustered together (they could be anywhere in the table),

then use the solution by , or the following:

data x;
     input source $3. fare  destination $3.;
     datalines;
mum 500 del
del 500 mum
kol 600 che
che 600 kol
;
run;

proc sql;
     create table want (drop=grp n)  as
           select *, ifc(source <= destination, cats(source, destination), cats(destination,source)) as grp, monotonic() as n from x
                group by grp
                     having n=min(n)
     ;
quit;

Haikuo
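
The query above numbers the rows with MONOTONIC(), which as far as I know is not a documented or supported PROC SQL function. If that is a concern, the row number can be taken from _N_ in a DATA step first. This is a rough sketch of the same idea under that assumption; the dataset name numbered is made up for illustration, and the length of 6 for grp assumes the 3-character codes from the question.

```sas
/* Number the rows in a DATA step instead of calling MONOTONIC(). */
data numbered;                   /* hypothetical intermediate dataset */
  set x;
  length grp $6;                 /* two 3-character codes (assumed) */
  n = _n_;                       /* original row position */
  /* Order-independent key: the alphabetically smaller endpoint goes first. */
  if source <= destination then grp = cats(source, destination);
  else grp = cats(destination, source);
run;

proc sql;
  create table want(drop=grp n) as
    select *
    from numbered
    group by grp
    having n = min(n)            /* first row seen for each pair */
    order by n;                  /* put the kept rows back in input order */
quit;
```

As with the original query, PROC SQL remerges the MIN(n) summary with the detail rows, so expect a note about remerging in the log.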
