Hi all,
I have several dates for IDs and i need to choose the last and second last dates. I know how to get the last date but i am having trouble trying to find the second last. For example for ID#2: 4/20/2020 is the last date and the date previous to the last date is 2/17/2020.
Have:
ID | d_date |
1 | 11/1/2019 |
1 | 5/3/2020 |
2 | 8/31/2019 |
2 | 9/2/2019 |
2 | 2/17/2020 |
2 | 4/20/2020 |
2 | 5/18/2020 |
2 | 12/17/2019 |
2 | 10/13/2019 |
3 | 7/10/2020 |
3 | 4/3/2020 |
3 | 12/27/2019 |
4 | 9/23/2019 |
4 | 9/23/2019 |
4 | 6/21/2020 |
4 | 9/20/2020 |
4 | 2/24/2020 |
Data Want:
ID | D_last | D_secondLast |
1 | 5/3/2020 | 11/1/2019 |
2 | 4/20/2020 | 2/17/2020 |
3 | 7/10/2020 | 4/3/2020 |
4 | 9/20/2020 | 6/21/2020 |
Hi @newsas007 It appears the last date for ID2 is 5/18/2020. Can you please review your expected result. Otherwise it's simple-
data have;
input ID d_date :mmddyy10.;
format d_date mmddyy10.;
cards;
1 11/1/2019
1 5/3/2020
2 8/31/2019
2 9/2/2019
2 2/17/2020
2 4/20/2020
2 5/18/2020
2 12/17/2019
2 10/13/2019
3 7/10/2020
3 4/3/2020
3 12/27/2019
4 9/23/2019
4 9/23/2019
4 6/21/2020
4 9/20/2020
4 2/24/2020
;
data Want;
do _n_=1 by 1 until(last.id);
set have;
by id;
array t(99999) _temporary_;
t(_n_)=d_date;
end;
d_last=max(of t(*));
d_2nd_last=largest(2,of t(*));
call missing(of t(*));
format d_last d_2nd_last mmddyy10.;
drop d_date;
run;
One way: sort by Id and DESCENDING date. The the first and second are the ones you want.
Run a counter that is reset for each ID.
While the counter is less than or equal to 2 do what you need and then output at count=2 (or last of the ID in case an ID only has one).
Hi @newsas007 It appears the last date for ID2 is 5/18/2020. Can you please review your expected result. Otherwise it's simple-
data have;
input ID d_date :mmddyy10.;
format d_date mmddyy10.;
cards;
1 11/1/2019
1 5/3/2020
2 8/31/2019
2 9/2/2019
2 2/17/2020
2 4/20/2020
2 5/18/2020
2 12/17/2019
2 10/13/2019
3 7/10/2020
3 4/3/2020
3 12/27/2019
4 9/23/2019
4 9/23/2019
4 6/21/2020
4 9/20/2020
4 2/24/2020
;
data Want;
do _n_=1 by 1 until(last.id);
set have;
by id;
array t(99999) _temporary_;
t(_n_)=d_date;
end;
d_last=max(of t(*));
d_2nd_last=largest(2,of t(*));
call missing(of t(*));
format d_last d_2nd_last mmddyy10.;
drop d_date;
run;
This is a case where the use of arrays in which you carefully name two of the array elements, plus a call sortn statement, can simplify:
data have;
input ID d_date :mmddyy10.;
format d_date mmddyy10.;
cards;
1 11/1/2019
1 5/3/2020
2 8/31/2019
2 9/2/2019
2 2/17/2020
2 4/20/2020
2 5/18/2020
2 12/17/2019
2 10/13/2019
3 7/10/2020
3 4/3/2020
3 12/27/2019
4 9/23/2019
4 9/23/2019
4 6/21/2020
4 9/20/2020
4 2/24/2020
;
data want (keep=id d_last d_second_last);
do i=1 by 1 until (last.id);
set have ;
by id ;
format d_last d_second_last mmddyy10.;
array dats{20} d1-d18 d_second_last d_last;
dats{i}=d_date;
end;
call sortn(of dats{*});
run;
Just give the array a dimension (20 in the above case) that is large enough to accommodate the most populated id. And when you name the variables belonging to the array, make the last two names d_second_last and d_last, preceded by two fewer dummy names than the array size, d1-d18 here.
The call sortn will put the largest two date values in the final two elements of the array, conveniently pre-named as the desired result variables. No further assignment required.
I put the format statement and the array statement in the given locations just to force the sas compiler to make ID the leftmost variable followed by D_LAST and D_SECOND_LAST. You could put it anywhere. And you could put the array statement anywhere prior to the "dats{i}=" assignment statement.
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.