Hi,
I have 2 datasets with 'initial' date and several recorded dates for each ID. i want to be able to merge the dataset with the closest initial and recorded dates.
Data Have1:
ID | Initial |
1 | 1/31/2021 |
2 | 2/3/2021 |
3 | 1/4/2020 |
4 | 3/4/2019 |
5 | 3/5/2020 |
6 | 4/5/2019 |
data have1;
infile datalines dlm=',';
input ID Initial :mmddyy10.;
format Initial mmddyy10.;
datalines;
1,1/31/2021
2,2/3/2021
3,1/4/2020
4,3/4/2019
5,3/5/2020
6,4/5/2019
;
Data Have 2:
ID | Recorded |
1 | 1/28/2021 |
1 | 2/5/2020 |
1 | 5/20/2021 |
1 | 2/21/2019 |
2 | 5/15/2021 |
2 | 2/18/2021 |
2 | 2/15/2021 |
3 | 3/2/2019 |
3 | 1/1/2020 |
3 | 5/5/2020 |
3 | 4/5/2021 |
4 | 5/2/2021 |
4 | 4/25/2020 |
4 | 3/20/2019 |
4 | 2/6/2020 |
4 | 6/10/2020 |
5 | 2/25/2020 |
5 | 3/18/2020 |
6 | 5/5/2019 |
data have2;
infile datalines dlm=',';
input ID Recorded :mmddyy10.;
format Recorded mmddyy10.;
datalines;
1,1/28/2021
1,2/5/2020
1,5/20/2021
1,2/21/2019
2,5/15/2021
2,2/18/2021
2,2/15/2021
3,3/2/2019
3,1/1/2020
3,5/5/2020
3,4/5/2021
4,5/2/2021
4,4/25/2020
4,3/20/2019
4,2/6/2020
4,6/10/2020
5,2/25/2020
5,3/18/2020
6,5/5/2019
;
Data Want:
ID | Initial | Closest_Recorded |
1 | 1/31/2021 | 1/28/2021 |
2 | 2/3/2021 | 2/15/2021 |
3 | 1/4/2020 | 1/1/2020 |
4 | 3/4/2019 | 3/20/2019 |
5 | 3/5/2020 | 2/25/2020 |
6 | 4/5/2019 | 5/5/2019 |
how about this code?
proc transpose data=have2 out=have3(drop=_name_);
by id;
var recorded;
run;
data want;
merge have1 have3;
by id;
array col{*} col:;
_dif=abs(DATDIF(initial,col{1},'ACT/ACT'));
_pos=1;
do i=2 to dim(col);
if col{i} ne . then do;
_tmp=abs(DATDIF(initial,col{i},'ACT/ACT'));
if _tmp<_dif then do;
_dif=_tmp;
_pos=i;
end;
end;
end;
Recorded =col{_pos};
format Recorded mmddyy10.;
drop col: _: i;
run;
It could be done with SQL by taking the minimum of the absolute difference between dates and grouping by ID and is a lot less code than the array solution.
proc sql number;
select h1.id,initial, recorded as closest_recorded, abs(recorded-initial) as diff
from have1 as h1
left join have2 as h2
on h1.id=h2.id
group by h1.id, initial
having diff=min(diff);
quit;
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.