Help using Base SAS procedures

dedeplicating on most recent date

Reply
N/A
Posts: 0

dedeplicating on most recent date

How do you deduplicate based on most recent date?

PROC SORT data=total;
by week ID;
run;
PROC FREQ DATA = total noprint;
by week;
table ID/ out = ID_DUPS (keep = week ID Count where = (Count > 1)) ;
run;
PROC PRINT DATA = ID_DUPS;
run;
Super Contributor
Posts: 359

Re: dedeplicating on most recent date

for the last week within an ID use:

PROC SORT data=total;
by ID week ;
run;

data total;
set total;
by ID week;
if last.ID;
run;
N/A
Posts: 0

Re: dedeplicating on most recent date

Thanks. How do you deduplicate selecting for the latest date within a week? There's a unique time/date stamp field within each week.
Super Contributor
Super Contributor
Posts: 3,174

Re: dedeplicating on most recent date

Separate the DATEPART (it's a function in SAS) to create a SAS DATE variable and use it with your BY processing.

Scott Barry
SBBWorks, Inc.
N/A
Posts: 0

Re: dedeplicating on most recent date

Thanks. Once I get SAS to create a SAS date, how do I ask SAS to identify the most recent date and select for it?
Super Contributor
Super Contributor
Posts: 3,174

Re: dedeplicating on most recent date

Refer to your other related post for info on BY GROUP processing.

Scott
N/A
Posts: 0

Re: dedeplicating on most recent date

Thanks!
N/A
Posts: 0

Re: dedeplicating on most recent date

Consolidated from related post:



You have a couple of options - one is to perform two sorts with ay BY ID DESCENDING DATE; on the first sort and then followed with a SORT NODUPKEY EQUALS and a BY ID; statement.

The other option is to again sort with ID DESCENDING DATE, and then use a SAS DATA step with a SET and a BY ID; statement -- and use BY GROUP processing with IF FIRST.ID processing to subset your input and only capture the first occurence of ID values.

Scott Barry

Suggested Google advanced search argument on this topic:

data step by group processing site:sas.com
Super User
Posts: 10,516

Re: dedeplicating on most recent date

> Thanks. How do you deduplicate selecting for the
> latest date within a week? There's a unique
> time/date stamp field within each week.

Depending on how you're using this:

Proc summary data=(input data) nway;
class id week;
var (variable with date/time stamp as a SAS datetime variable);
output out=(output data set name) max=;
run;

May want to have option (drop= _type_ _freq_) with the output data. The output dataset will have the largest date/time within each combination of ID and WEEK. As a side effect the output data will be sorted by ID and week but no sort is needed beforehand.
Respected Advisor
Posts: 3,899

Re: dedeplicating on most recent date

I hope this example code will point you into the right direction.

The main idea is to use intnx() to align your datestamps to the date of the last day of the week the datestamp value is from (Sunday) and use this additional variable "LastDayInWeek" for sorting.


data have;
id=1;
format DT datetime20.;
DT='19Dec2009 00:00:00'dt;
do while (DT lt '05Jan2010 00:00:00'dt);
output;
DT=sum(DT,ranuni(1)*86400);
end;
run;

proc sql;
create view VSortedByIDAndWeekAndDate as
select *, datepart(intnx('dtweek.2',DT,0,'E')) format=weekdatx. as LastDayInWeek
from have
order by id,LastDayInWeek,DT
;
quit;

data want;
set VSortedByIDAndWeekAndDate;
by id LastDayInWeek DT;
if last.LastDayInWeek then output;
run;

proc print data=want;
run;


HTH
Patrick
Ask a Question
Discussion stats
  • 9 replies
  • 570 views
  • 0 likes
  • 5 in conversation