Hi,
I have a dataset as below
ID | Col1 | Col2 | Mon | Year |
101 | X | Y | 1 | 2021 |
101 | X | Y | 2 | 2021 |
101 | X | Y | 2 | 2021 |
101 | X | Y | 3 | 2021 |
101 | X | Z | 4 | 2021 |
101 | Z | Y | 4 | 2021 |
101 | X | Y | 5 | 2021 |
102 | X | Y | 4 | 2021 |
102 | Z | Y | 4 | 2021 |
102 | A | B | 5 | 2021 |
I want the latest col1 and col2 values based on the latest month and year to be populated. Like below.
ID | Col1 | Col2 | Col1_2 | Col2_2 | Mon | Year |
101 | X | Y | X | Y | 1 | 2021 |
101 | X | Y | X | Y | 2 | 2021 |
101 | X | Y | X | Y | 2 | 2021 |
101 | X | Y | X | Y | 3 | 2021 |
101 | X | Z | X | Y | 4 | 2021 |
101 | Z | Y | X | Y | 4 | 2021 |
101 | X | Y | X | Y | 5 | 2021 |
102 | X | Y | A | B | 4 | 2021 |
102 | Z | Y | A | B | 4 | 2021 |
102 | A | B | A | B | 5 | 2021 |
Any suggestions. Thanks in advance
Welcome to the communities!
Given that your dataset is already sorted, a double DO loop does this quite nicely:
data want;
do until (last.id);
set have;
by id;
end;
col1_2 = col1;
col2_2 = col2;
do until (last.id);
set have;
by id;
output;
end;
run;
Welcome to the communities!
Given that your dataset is already sorted, a double DO loop does this quite nicely:
data want;
do until (last.id);
set have;
by id;
end;
col1_2 = col1;
col2_2 = col2;
do until (last.id);
set have;
by id;
output;
end;
run;
Using a SQL:
proc sql;
create table want as
select
l.*,
r.col1_2,
r.col2_2
from
have l
left join
( select id, col1 as col1_2, col2 as col2_2
from have
group by id, year
having max(mon)=mon
) r
on l.id=r.id
order by id, year, mon
;
quit;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.