Solved: How to select unique records using datetime variable

smunigala · Posted 10-31-2016 12:53 PM

Hi all,

I have a dataset looking at orthopedic surgery procedures over time. Date of procedure is recorded as DATETIME20. I have some duplicate entries for same procedure on same date where some entries show datetime exactly (for eg. 10OCT2016 04:25:10) and some show just the date and 0 for time (for eg. 10OCT2016 00:00:00). FOr those with duplicate entries, I need to pick up the ones with exact date time (10OCT2016 04:25:10) and delete one with no time on it (10OCT2016 00:00:00).

However, I have some observations with no duplicate entries on same date, but with date and 0 for time which I need to retain (for eg. 11OCT2016 00:00:00).

I would appreciate if I can get help in removing duplicate entries for same day with 0 time value but retaining unique observations with date and 0 time value.

ID variable: Reg_No

Date of Surgery: PROCEDURE_DATE (datetime 20.)

Here is the sample data from my study. I just want to remove the one I marked DELETE.

Reg_No PROCEDURE_DATE

1001 11OCT2016 00:00:00 --Retain

1001 12OCT2016 00:00:00 --Delete

1001 12OCT2016 01:20:01 --Retain

1001 13OCT2016 00:00:00 --Retain

1001 14OCT2016 02:30:10 --Retain

1002 13OCT2016 00:00:00 --Retain

1002 14OCT2016 22:01:00 --Retain

1002 15OCT2016 00:00:00 --Retain

1003 15OCT2016 00:00:00 --Retain

Thank you!

Sat

sbxkoenk · Posted 10-31-2016 01:37 PM

The below solution supports the situation where you have 2+ records for the same Reg_no and same day.

It will throw away the first occurrence if and only if the time = 00:00:00 while keeping all other records for that same Reg_no and same day.

data work.smunigala;
input Reg_No PROCEDURE_DATE_Char $ 15-32;
PROCEDURE_DATE=input(PROCEDURE_DATE_Char,datetime20.);
procedure_date_dp=datepart(PROCEDURE_DATE);
procedure_date_tp=timepart(PROCEDURE_DATE);
format PROCEDURE_DATE datetime20. procedure_date_dp date9. procedure_date_tp time8.;
datalines;
1001          11OCT2016 00:00:00
1001          12OCT2016 00:00:00
1001          12OCT2016 01:20:01
1001          13OCT2016 00:00:00
1001          14OCT2016 02:30:10
1002          13OCT2016 00:00:00
1002          14OCT2016 22:01:00
1002          15OCT2016 00:00:00
1003          15OCT2016 00:00:00
;
run;

proc sort data=work.smunigala NODUPKEY;
 by Reg_No procedure_date_dp procedure_date_tp;
run;

data work.smunigala1;
 set work.smunigala;
 by Reg_No procedure_date_dp procedure_date_tp;
 if first.procedure_date_dp then do;
  if     first.procedure_date_dp NE last.procedure_date_dp
     AND procedure_date_tp = '00:00:00't
   then delete;
  end;
run;
/* end of program */

Regards,

Koen

View solution in original post

ballardw · Posted 10-31-2016 01:21 PM

If the data is sorted by Reg_no and procedure_date then:

data temp;

set have;

dateonly=datepart (procedure_date);

run;

data want;

set temp;

by reg_no dateonly;

if last.dateonly;

run;

seems to do what you request.

smunigala · Posted 10-31-2016 01:27 PM

Hi,

Thanks for the quick reply. However, I am looking to remove duplicates only for the same day and not for different days. Can you check the sample data I specified? Each Reg_No has unique procedure dates (with and without time) which I want to retain. Only if a Reg_No has duplicate entries on same day, I need to pick up the procedure dates qith specified time.

Thanks,

Satish

ballardw · Posted 10-31-2016 03:19 PM

@smunigala wrote:

Hi,

Thanks for the quick reply. However, I am looking to remove duplicates only for the same day and not for different days. Can you check the sample data I specified? Each Reg_No has unique procedure dates (with and without time) which I want to retain. Only if a Reg_No has duplicate entries on same day, I need to pick up the procedure dates qith specified time.

Thanks,

Satish

Did you try the code?

Do you have records with more than 2 of the same date?

sbxkoenk · Posted 10-31-2016 01:37 PM

The below solution supports the situation where you have 2+ records for the same Reg_no and same day.

It will throw away the first occurrence if and only if the time = 00:00:00 while keeping all other records for that same Reg_no and same day.

data work.smunigala;
input Reg_No PROCEDURE_DATE_Char $ 15-32;
PROCEDURE_DATE=input(PROCEDURE_DATE_Char,datetime20.);
procedure_date_dp=datepart(PROCEDURE_DATE);
procedure_date_tp=timepart(PROCEDURE_DATE);
format PROCEDURE_DATE datetime20. procedure_date_dp date9. procedure_date_tp time8.;
datalines;
1001          11OCT2016 00:00:00
1001          12OCT2016 00:00:00
1001          12OCT2016 01:20:01
1001          13OCT2016 00:00:00
1001          14OCT2016 02:30:10
1002          13OCT2016 00:00:00
1002          14OCT2016 22:01:00
1002          15OCT2016 00:00:00
1003          15OCT2016 00:00:00
;
run;

proc sort data=work.smunigala NODUPKEY;
 by Reg_No procedure_date_dp procedure_date_tp;
run;

data work.smunigala1;
 set work.smunigala;
 by Reg_No procedure_date_dp procedure_date_tp;
 if first.procedure_date_dp then do;
  if     first.procedure_date_dp NE last.procedure_date_dp
     AND procedure_date_tp = '00:00:00't
   then delete;
  end;
run;
/* end of program */

Regards,

Koen

sbxkoenk · Posted 10-31-2016 01:43 PM

Quick addendum: if there is only one record for a Reg_no & day combination, it is kept in the output in any case (even if the time was 00:00:00).
Koen

jhlaramore · Posted 10-31-2016 02:04 PM

proc sql;
 create table NEWDATASETNAME(drop=count_flag time_flag) as 
 select *
 , case when count(reg_no)>1 then 1 else 0 end as count_flag
 , case when timepart(procedure_date)='00:00:00't then 1 else 0 end as time_flag
 from CURRENTDATASETNAME
 group by reg_no,datepart(procedure_date)
 having calculated count_flag + calculated time_flag < 2;
quit;

smunigala · Posted 11-10-2016 11:34 AM

Thank you all for helping me with this!

Sat

How to select unique records using datetime variable

Re: How to select unique records using datetime variable

Re: How to select unique records using datetime variable

Re: How to select unique records using datetime variable

Re: How to select unique records using datetime variable

Re: How to select unique records using datetime variable

Re: How to select unique records using datetime variable

Re: How to select unique records using datetime variable

Re: How to select unique records using datetime variable

Catch up on SAS Innovate 2026

Catch up on SAS Innovate 2026

SAS Training: Just a Click Away