Desktop productivity for business analysts and programmers

Bring back the first of two dates

Accepted Solution Solved
Reply
Occasional Contributor
Posts: 16
Accepted Solution

Bring back the first of two dates

Hi I have a large dataset consisting of millions of rows, I am trying to bring back the Ref  minimum of post_date and minimum of from_date. what is the best way this can be done? See example in table below.

 

 Table_Test  
RefPost_DateFrom_DateTo_Date
101/01/201727/12/201606/01/2017
101/01/201707/01/201717/01/2017
102/02/201718/01/201728/01/2017
201/02/201627/01/201601/02/2016
201/02/201602/02/201607/02/2016
203/04/201608/02/201613/02/2016
203/05/201614/02/201619/02/2016
310/04/201505/04/201515/04/2015
311/09/201506/09/201516/09/2015
404/02/201430/01/201409/02/2014
404/02/201610/02/201420/02/2014
401/01/201727/12/201606/01/2017

Accepted Solutions
Solution
‎04-25-2017 07:45 AM
Super User
Posts: 7,371

Re: Bring back the first of two dates

Method 1: SQL

proc sql;
create table want as
  select ref, min(post_date) as min_post_date, min(from_date) as min_from_date
  from have
  group by ref
;
quit;

Method 2: data step

proc sort data=have;
by ref;
run;
/* omit this step if have is already sorted, as in your example */
data want (keep=ref min_post_date min_from_date);
set have;
by ref;
retain min_post_date min_from_date;
format min_post_date min_from_date ddmmyy10.;
if first.ref
then do;
  min_post_date = '31dec9999'd;
  min_form_date = '31dec9999'd;
end;
min_post_date = min(min_post_date,post_date);
min_from_date = min(min_from_date,from_date);
if last.ref then output;
run;

Method 3: proc means

proc means data=have noprint;
by ref;
var post_date from_date;
output
  out=want (drop=_type_ _freq_)
  min(post_date)=min_post_date
  min(from_date)=min_from_date
;
run;
/* also needs sorting */

If the cardinality of ref is sufficiently low, you can use class instead of by in the proc means, and avoid sorting.

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers

View solution in original post


All Replies
Solution
‎04-25-2017 07:45 AM
Super User
Posts: 7,371

Re: Bring back the first of two dates

Method 1: SQL

proc sql;
create table want as
  select ref, min(post_date) as min_post_date, min(from_date) as min_from_date
  from have
  group by ref
;
quit;

Method 2: data step

proc sort data=have;
by ref;
run;
/* omit this step if have is already sorted, as in your example */
data want (keep=ref min_post_date min_from_date);
set have;
by ref;
retain min_post_date min_from_date;
format min_post_date min_from_date ddmmyy10.;
if first.ref
then do;
  min_post_date = '31dec9999'd;
  min_form_date = '31dec9999'd;
end;
min_post_date = min(min_post_date,post_date);
min_from_date = min(min_from_date,from_date);
if last.ref then output;
run;

Method 3: proc means

proc means data=have noprint;
by ref;
var post_date from_date;
output
  out=want (drop=_type_ _freq_)
  min(post_date)=min_post_date
  min(from_date)=min_from_date
;
run;
/* also needs sorting */

If the cardinality of ref is sufficiently low, you can use class instead of by in the proc means, and avoid sorting.

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
Occasional Contributor
Posts: 16

Re: Bring back the first of two dates

Thanks for your help Kurt, much appreciated

☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 2 replies
  • 180 views
  • 0 likes
  • 2 in conversation