I have a dataset that is set up as so:
Var1 Var2 Var3
1 A 1 January 2021
1 A 1 January 2021
1 A 5 January 2021
1 A 5 January 2021
1 B 31 Oct 2020
1 B 2 Nov 2020
My question is how can I ONLY keep the rows associated with the most recent values of Var2? That is, I want all the rows associated with 5 Jan 2021 for Var2=A, and all the rows associated with 2 Nov 2020 for Var2=B, etc.
A classic use-case for REMERGE and HAVING clause combine to subset-
proc sql;
create table want as
select *
from have
group by var1, var2
having max(var3)=var3;
quit;
A classic use-case for REMERGE and HAVING clause combine to subset-
proc sql;
create table want as
select *
from have
group by var1, var2
having max(var3)=var3;
quit;
Thank you!!
data have; input Var1 Var2 $ Var3 $40.; cards; 1 A 1 January 2021 1 A 1 January 2021 1 A 5 January 2021 1 A 5 January 2021 1 B 31 Oct 2020 1 B 2 Nov 2020 ; data want; do until(last.var3); set have; by var1 var2 var3 notsorted; if last.var2 then last=1; end; do until(last.var3); set have; by var1 var2 var3 notsorted; if last then output; end; drop last; run;
April 27 – 30 | Gaylord Texan | Grapevine, Texas
Walk in ready to learn. Walk out ready to deliver. This is the data and AI conference you can't afford to miss.
Register now and lock in 2025 pricing—just $495!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.