Help using Base SAS procedures

Get last observation from group

Reply
Super Contributor
Posts: 305

Get last observation from group

Dear all,

For the following sample dataset I need to retrieve the last value of c within each group (variable a) from the latest b.

data have;

input a $1-4 b c;

datalines;

M31 1 2

M31 3 9

M31 2 15

M98 5 4

M98 3 6

M98 4 2

;

run;

Therefore the output shall be M31 3 9 and M98 5 4.

I know the classic way of sort and last but I am curious if there is any other method whivh may run faster (my dataset is big...)

10x

Trusted Advisor
Posts: 1,204

Re: Get last observation from group

proc sql;

select * from have

group by a

having b=max(b);

quit;

Valued Guide
Posts: 858

Re: Get last observation from group

That's the only other way I would think to do it.  I'm curious to hear how the performance changed from sort / last.

Super User
Posts: 17,852

Re: Get last observation from group

I think there's a proc summary solution here but I've never been good at optimization problems.

Super User
Posts: 17,852

Re: Get last observation from group

proc summary data=have noprint;

class a;

id c;

var b;

output out=want max=;

run;

proc print data=want;

run;

Trusted Advisor
Posts: 1,131

Re: Get last observation from group

I agree with you Mark, sort followed by last. could also be a good choice.

data have;

input a $1-4 b c;

datalines;

M31 1 2

M31 3 9

M31 2 15

M98 5 4

M98 3 6

M98 4 2

;

run;

proc sort data=have;

by a b;

run;

data want;

set have;

by a b;

if last.a;

run;

Thanks,

Jag

Thanks,
Jag
Super User
Super User
Posts: 7,407

Re: Get last observation from group

Well, you don't need to sort the data first to get last.  Proc SQL obviously is quite good, however you may run into memory limits depending on the size of your data.  I would suggest adding in a last flag e.g:

data your_data;

     set your_data;

     by a;

     retain max_num max_num_res;

     if first.a then do;

          max_num=b;

          max_res=c;

     end;

     else do;

          if b > max_num then do;

               max_num=b;

               max_num_res=c;

          end;

     end;

     if last.a then output;

run;

Then you should have one row per group, with max_num and max_num_res holding the data you want.  You will have to test performance of course against SQL and Hash.

Respected Advisor
Posts: 3,124

Re: Get last observation from group

Some level of SORT will always be involved in this case, including SQL,Hash (implicit) or even the one offered by (explicit) Here is one example using Hash:

data have;

     input a $1-4 b c;

     datalines;

M31 1 2

M31 3 9

M31 2 15

M98 5 4

M98 3 6

M98 4 2

;

run;

data want;

     declare hash h(ordered:'a');

     h.definekey('b');

     h.definedata('b','c');

     h.definedone();

     declare hiter hi('h');

     do until (last.a);

           set have;

           by a notsorted;

           rc=h.add();

     end;

     rc=hi.last();

     output;

     drop rc;

run;

Ask a Question
Discussion stats
  • 7 replies
  • 3009 views
  • 9 likes
  • 7 in conversation