Hi all,
I have a dataset in below format. I need to remove the duplicates by ID var1 and dt. The date variable is in varchar datatype.
The date has a problem with same date but different timing. I have to remove the minimum timings on same date. In the below sample
ID 123 has two height value on the same date but I need only one value and that should be the max time. In the same scenario ID 321 has in wt category.
data have; input ID var1$ val dt$21.; cards; 123 ht 145 2019-09-14 09:06:00.0 123 wt 78 2019-08-12 06:43:42.0 123 bp 24.8 2019-09-30 14:33:47.0 123 ht 146 2019-09-14 13:38:45.0 321 ht 173 2019-12-06 11:20:53.0 321 wt 85 2019-08-16 09:54:39.0 321 ht 173 2019-09-17 13:15:22.0 321 wt 86 2019-08-16 10:50:03.0 ;
Thanks in advance!
1. Convert date to a date time that can be sorted properly
2. Pick first record by each date
*convert to date/datetime;
data temp;
set have;
myDateTime = input(dt, anydtdtm.);
myDate = datepart(myDateTime);
myTime = timepart(myDateTime);
format myDatetime datetime. myDate date9. myTime time.;
run;
*sort data;
proc sort data=temp;
by ID var1 myDate descending myTime;
run;
*keep only record of interest;
data want;
set temp;
by ID var1 myDate descending myTime;
if first.myDate;
run;
@Sathish_jammy wrote:
Hi all,
I have a dataset in below format. I need to remove the duplicates by ID var1 and dt. The date variable is in varchar datatype.
The date has a problem with same date but different timing. I have to remove the minimum timings on same date. In the below sample
ID 123 has two height value on the same date but I need only one value and that should be the max time. In the same scenario ID 321 has in wt category.
data have; input ID var1$ val dt$21.; cards; 123 ht 145 2019-09-14 09:06:00.0 123 wt 78 2019-08-12 06:43:42.0 123 bp 24.8 2019-09-30 14:33:47.0 123 ht 146 2019-09-14 13:38:45.0 321 ht 173 2019-12-06 11:20:53.0 321 wt 85 2019-08-16 09:54:39.0 321 ht 173 2019-09-17 13:15:22.0 321 wt 86 2019-08-16 10:50:03.0 ;Thanks in advance!
1. Convert date to a date time that can be sorted properly
2. Pick first record by each date
*convert to date/datetime;
data temp;
set have;
myDateTime = input(dt, anydtdtm.);
myDate = datepart(myDateTime);
myTime = timepart(myDateTime);
format myDatetime datetime. myDate date9. myTime time.;
run;
*sort data;
proc sort data=temp;
by ID var1 myDate descending myTime;
run;
*keep only record of interest;
data want;
set temp;
by ID var1 myDate descending myTime;
if first.myDate;
run;
@Sathish_jammy wrote:
Hi all,
I have a dataset in below format. I need to remove the duplicates by ID var1 and dt. The date variable is in varchar datatype.
The date has a problem with same date but different timing. I have to remove the minimum timings on same date. In the below sample
ID 123 has two height value on the same date but I need only one value and that should be the max time. In the same scenario ID 321 has in wt category.
data have; input ID var1$ val dt$21.; cards; 123 ht 145 2019-09-14 09:06:00.0 123 wt 78 2019-08-12 06:43:42.0 123 bp 24.8 2019-09-30 14:33:47.0 123 ht 146 2019-09-14 13:38:45.0 321 ht 173 2019-12-06 11:20:53.0 321 wt 85 2019-08-16 09:54:39.0 321 ht 173 2019-09-17 13:15:22.0 321 wt 86 2019-08-16 10:50:03.0 ;Thanks in advance!
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.