BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
kp19
Fluorite | Level 6

Hi,

 

I want to remove duplicates within same by group which have higher value of time difference variable.

The current data:

Obs ID    date1         time1   date2       time2     TimeDifference

1      1    03/08/20     11:00   03/08/20   10:55      5

2      1    03/08/20     11:00   03/08/20   10:15     45

3      1    03/08/20     11:00   03/08/20   9:30       90

i want to delete last two observations. I want to do this for multiple IDs with multiple such rows.

 

Thanks.

1 ACCEPTED SOLUTION

Accepted Solutions
PeterClemmensen
Tourmaline | Level 20

Here is one way. I added another ID for demonstration

 

data have;
input ID date1 :mmddyy8. time1 :time5. date2 :mmddyy8. time2 :time5. TimeDifference;
format date: mmddyy8. time1 time2 time5.;
datalines;
1 03/08/20 11:00 03/08/20 10:55 5 
1 03/08/20 11:00 03/08/20 10:15 45
1 03/08/20 11:00 03/08/20 9:30  90
2 03/08/20 11:00 03/08/20 10:55 5 
2 03/08/20 11:00 03/08/20 10:15 45
2 03/08/20 11:00 03/08/20 9:30  90
;

data want (drop=minTD);
    minTD = 1e9;
    do _N_=1 by 1 until (last.ID);
        set have;
        by ID;
        if TimeDifference < minTD then minTD = TimeDifference;
    end;
    do _N_=1 to _N_;
        set have;
        if minTD = TimeDifference then output;
    end;
run;

View solution in original post

6 REPLIES 6
PeterClemmensen
Tourmaline | Level 20

What if two obs have timedif=5?

kp19
Fluorite | Level 6

They don't have same TimeDiff. 

novinosrin
Tourmaline | Level 20

Hi @kp19  Do you mean this?

 


proc sql;
create table want as
select *
from have
group by id
having TimeDifference=min(TimeDifference);
quit;

Or you could keep it simple by sorting the dataset by id timedifference, and picking the first obs within a by group i.e. 

 

proc sort data=have;
by id timedifference;
run;

data want;
 set have;
 by id;
 if first.id;
run;
PeterClemmensen
Tourmaline | Level 20

Here is one way. I added another ID for demonstration

 

data have;
input ID date1 :mmddyy8. time1 :time5. date2 :mmddyy8. time2 :time5. TimeDifference;
format date: mmddyy8. time1 time2 time5.;
datalines;
1 03/08/20 11:00 03/08/20 10:55 5 
1 03/08/20 11:00 03/08/20 10:15 45
1 03/08/20 11:00 03/08/20 9:30  90
2 03/08/20 11:00 03/08/20 10:55 5 
2 03/08/20 11:00 03/08/20 10:15 45
2 03/08/20 11:00 03/08/20 9:30  90
;

data want (drop=minTD);
    minTD = 1e9;
    do _N_=1 by 1 until (last.ID);
        set have;
        by ID;
        if TimeDifference < minTD then minTD = TimeDifference;
    end;
    do _N_=1 to _N_;
        set have;
        if minTD = TimeDifference then output;
    end;
run;
kp19
Fluorite | Level 6

@PeterClemmensen This works well. Thanks a lot

SAS Innovate 2025: Register Now

Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!

How to connect to databases in SAS Viya

Need to connect to databases in SAS Viya? SAS’ David Ghan shows you two methods – via SAS/ACCESS LIBNAME and SAS Data Connector SASLIBS – in this video.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 6 replies
  • 1156 views
  • 1 like
  • 3 in conversation