I have a dataset that is set up as so:
Var1 Var2 Var3
1 A 1 January 2021
1 A 1 January 2021
1 A 5 January 2021
1 A 5 January 2021
1 B 31 Oct 2020
1 B 2 Nov 2020
My question is how can I ONLY keep the rows associated with the most recent values of Var2? That is, I want all the rows associated with 5 Jan 2021 for Var2=A, and all the rows associated with 2 Nov 2020 for Var2=B, etc.
A classic use-case for REMERGE and HAVING clause combine to subset-
proc sql;
create table want as
select *
from have
group by var1, var2
having max(var3)=var3;
quit;
A classic use-case for REMERGE and HAVING clause combine to subset-
proc sql;
create table want as
select *
from have
group by var1, var2
having max(var3)=var3;
quit;
Thank you!!
data have; input Var1 Var2 $ Var3 $40.; cards; 1 A 1 January 2021 1 A 1 January 2021 1 A 5 January 2021 1 A 5 January 2021 1 B 31 Oct 2020 1 B 2 Nov 2020 ; data want; do until(last.var3); set have; by var1 var2 var3 notsorted; if last.var2 then last=1; end; do until(last.var3); set have; by var1 var2 var3 notsorted; if last then output; end; drop last; run;
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.