Dear all,
For the following sample dataset I need to retrieve the last value of c within each group (variable a) from the latest b.
data have;
input a $1-4 b c;
datalines;
M31 1 2
M31 3 9
M31 2 15
M98 5 4
M98 3 6
M98 4 2
;
run;
Therefore the output shall be M31 3 9 and M98 5 4.
I know the classic way of sort and last but I am curious if there is any other method whivh may run faster (my dataset is big...)
10x
proc sql;
select * from have
group by a
having b=max(b);
quit;
That's the only other way I would think to do it. I'm curious to hear how the performance changed from sort / last.
I think there's a proc summary solution here but I've never been good at optimization problems.
proc summary data=have noprint;
class a;
id c;
var b;
output out=want max=;
run;
proc print data=want;
run;
I agree with you Mark, sort followed by last. could also be a good choice.
data have;
input a $1-4 b c;
datalines;
M31 1 2
M31 3 9
M31 2 15
M98 5 4
M98 3 6
M98 4 2
;
run;
proc sort data=have;
by a b;
run;
data want;
set have;
by a b;
if last.a;
run;
Thanks,
Jag
Well, you don't need to sort the data first to get last. Proc SQL obviously is quite good, however you may run into memory limits depending on the size of your data. I would suggest adding in a last flag e.g:
data your_data;
set your_data;
by a;
retain max_num max_num_res;
if first.a then do;
max_num=b;
max_res=c;
end;
else do;
if b > max_num then do;
max_num=b;
max_num_res=c;
end;
end;
if last.a then output;
run;
Then you should have one row per group, with max_num and max_num_res holding the data you want. You will have to test performance of course against SQL and Hash.
Some level of SORT will always be involved in this case, including SQL,Hash (implicit) or even the one offered by (explicit) Here is one example using Hash:
data have;
input a $1-4 b c;
datalines;
M31 1 2
M31 3 9
M31 2 15
M98 5 4
M98 3 6
M98 4 2
;
run;
data want;
declare hash h(ordered:'a');
h.definekey('b');
h.definedata('b','c');
h.definedone();
declare hiter hi('h');
do until (last.a);
set have;
by a notsorted;
rc=h.add();
end;
rc=hi.last();
output;
drop rc;
run;
Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.
Register today!Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.