BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
zdassu
Quartz | Level 8

Hi I have a large dataset consisting of millions of rows, I am trying to bring back the Ref  minimum of post_date and minimum of from_date. what is the best way this can be done? See example in table below.

 

 Table_Test  
RefPost_DateFrom_DateTo_Date
101/01/201727/12/201606/01/2017
101/01/201707/01/201717/01/2017
102/02/201718/01/201728/01/2017
201/02/201627/01/201601/02/2016
201/02/201602/02/201607/02/2016
203/04/201608/02/201613/02/2016
203/05/201614/02/201619/02/2016
310/04/201505/04/201515/04/2015
311/09/201506/09/201516/09/2015
404/02/201430/01/201409/02/2014
404/02/201610/02/201420/02/2014
401/01/201727/12/201606/01/2017
1 ACCEPTED SOLUTION

Accepted Solutions
Kurt_Bremser
Super User

Method 1: SQL

proc sql;
create table want as
  select ref, min(post_date) as min_post_date, min(from_date) as min_from_date
  from have
  group by ref
;
quit;

Method 2: data step

proc sort data=have;
by ref;
run;
/* omit this step if have is already sorted, as in your example */
data want (keep=ref min_post_date min_from_date);
set have;
by ref;
retain min_post_date min_from_date;
format min_post_date min_from_date ddmmyy10.;
if first.ref
then do;
  min_post_date = '31dec9999'd;
  min_form_date = '31dec9999'd;
end;
min_post_date = min(min_post_date,post_date);
min_from_date = min(min_from_date,from_date);
if last.ref then output;
run;

Method 3: proc means

proc means data=have noprint;
by ref;
var post_date from_date;
output
  out=want (drop=_type_ _freq_)
  min(post_date)=min_post_date
  min(from_date)=min_from_date
;
run;
/* also needs sorting */

If the cardinality of ref is sufficiently low, you can use class instead of by in the proc means, and avoid sorting.

View solution in original post

2 REPLIES 2
Kurt_Bremser
Super User

Method 1: SQL

proc sql;
create table want as
  select ref, min(post_date) as min_post_date, min(from_date) as min_from_date
  from have
  group by ref
;
quit;

Method 2: data step

proc sort data=have;
by ref;
run;
/* omit this step if have is already sorted, as in your example */
data want (keep=ref min_post_date min_from_date);
set have;
by ref;
retain min_post_date min_from_date;
format min_post_date min_from_date ddmmyy10.;
if first.ref
then do;
  min_post_date = '31dec9999'd;
  min_form_date = '31dec9999'd;
end;
min_post_date = min(min_post_date,post_date);
min_from_date = min(min_from_date,from_date);
if last.ref then output;
run;

Method 3: proc means

proc means data=have noprint;
by ref;
var post_date from_date;
output
  out=want (drop=_type_ _freq_)
  min(post_date)=min_post_date
  min(from_date)=min_from_date
;
run;
/* also needs sorting */

If the cardinality of ref is sufficiently low, you can use class instead of by in the proc means, and avoid sorting.

zdassu
Quartz | Level 8

Thanks for your help Kurt, much appreciated

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

SAS Enterprise Guide vs. SAS Studio

What’s the difference between SAS Enterprise Guide and SAS Studio? How are they similar? Just ask SAS’ Danny Modlin.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 2 replies
  • 605 views
  • 0 likes
  • 2 in conversation