Solved
Contributor
Posts: 39

# Keeping the latest by group?

/*Hi SAS Forum,

I have posted this same question in another forum but due to urgency I posted it here too (apologies).

I have over million records dataset like this.

In this, same cust_num (that is 1) together with same acct_num (that is 111) occur three times

and make three by-groups.

And there is a single other by group related to acct 222.*/

data have;

input cust_num acct_num date score_ID;

cards;

1 111 20150324 40000

1 111 20150324 50000

1 111 20150324 60000

1 111 20150324 70000

1 111 20150324 80000

1 111 20150402 40000

1 111 20150402 50000

1 111 20150402 60000

1 111 20150402 70000

1 111 20150402 80000

1 111 20150425 40000

1 111 20150425 50000

1 111 20150425 60000

1 111 20150425 70000

1 111 20150425 80000

2 222 20180302 70000

2 222 20180302 90000

;

run;

/*Q: I need to keep only the latest by group.

Answer data set should be this because 20150425 is the latest date out of the three by groups for acct 111.

And three is a single other by group for acct 222.

1 111 20150425 40000

1 111 20150425 50000

1 111 20150425 60000

1 111 20150425 70000

1 111 20150425 80000

2 222 20180302 70000

2 222 20180302 90000

Could you please help me to do the coding. We should not use any hard coding like

if date = 20150425;

because I do not know what are the latest dates in other million records

Mirisa

*/

Accepted Solutions
Solution
‎04-25-2018 11:21 PM
Super User
Posts: 2,049

## Re: Keeping the latest by group?

[ Edited ]
``````data have;

input cust_num acct_num date :yymmdd10. score_ID;
format date yymmdd10.;
cards;
1 111 20150324 40000
1 111 20150324 50000
1 111 20150324 60000
1 111 20150324 70000
1 111 20150324 80000
1 111 20150402 40000
1 111 20150402 50000
1 111 20150402 60000
1 111 20150402 70000
1 111 20150402 80000
1 111 20150425 40000
1 111 20150425 50000
1 111 20150425 60000
1 111 20150425 70000
1 111 20150425 80000
2 222 20180302 70000
2 222 20180302 90000
;

run;

proc sql;
create table want as
select *
from have
group by cust_num, acct_num
having date=max(date);
quit;``````

All Replies
Solution
‎04-25-2018 11:21 PM
Super User
Posts: 2,049

## Re: Keeping the latest by group?

[ Edited ]
``````data have;

input cust_num acct_num date :yymmdd10. score_ID;
format date yymmdd10.;
cards;
1 111 20150324 40000
1 111 20150324 50000
1 111 20150324 60000
1 111 20150324 70000
1 111 20150324 80000
1 111 20150402 40000
1 111 20150402 50000
1 111 20150402 60000
1 111 20150402 70000
1 111 20150402 80000
1 111 20150425 40000
1 111 20150425 50000
1 111 20150425 60000
1 111 20150425 70000
1 111 20150425 80000
2 222 20180302 70000
2 222 20180302 90000
;

run;

proc sql;
create table want as
select *
from have
group by cust_num, acct_num
having date=max(date);
quit;``````
☑ This topic is solved.