- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Posted 09-07-2021 11:37 AM
(1554 views)
Hi,
I have 2 datasets with 'initial' date and several recorded dates for each ID. i want to be able to merge the dataset with the closest initial and recorded dates.
Data Have1:
ID | Initial |
1 | 1/31/2021 |
2 | 2/3/2021 |
3 | 1/4/2020 |
4 | 3/4/2019 |
5 | 3/5/2020 |
6 | 4/5/2019 |
data have1;
infile datalines dlm=',';
input ID Initial :mmddyy10.;
format Initial mmddyy10.;
datalines;
1,1/31/2021
2,2/3/2021
3,1/4/2020
4,3/4/2019
5,3/5/2020
6,4/5/2019
;
Data Have 2:
ID | Recorded |
1 | 1/28/2021 |
1 | 2/5/2020 |
1 | 5/20/2021 |
1 | 2/21/2019 |
2 | 5/15/2021 |
2 | 2/18/2021 |
2 | 2/15/2021 |
3 | 3/2/2019 |
3 | 1/1/2020 |
3 | 5/5/2020 |
3 | 4/5/2021 |
4 | 5/2/2021 |
4 | 4/25/2020 |
4 | 3/20/2019 |
4 | 2/6/2020 |
4 | 6/10/2020 |
5 | 2/25/2020 |
5 | 3/18/2020 |
6 | 5/5/2019 |
data have2;
infile datalines dlm=',';
input ID Recorded :mmddyy10.;
format Recorded mmddyy10.;
datalines;
1,1/28/2021
1,2/5/2020
1,5/20/2021
1,2/21/2019
2,5/15/2021
2,2/18/2021
2,2/15/2021
3,3/2/2019
3,1/1/2020
3,5/5/2020
3,4/5/2021
4,5/2/2021
4,4/25/2020
4,3/20/2019
4,2/6/2020
4,6/10/2020
5,2/25/2020
5,3/18/2020
6,5/5/2019
;
Data Want:
ID | Initial | Closest_Recorded |
1 | 1/31/2021 | 1/28/2021 |
2 | 2/3/2021 | 2/15/2021 |
3 | 1/4/2020 | 1/1/2020 |
4 | 3/4/2019 | 3/20/2019 |
5 | 3/5/2020 | 2/25/2020 |
6 | 4/5/2019 | 5/5/2019 |
2 REPLIES 2
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
how about this code?
proc transpose data=have2 out=have3(drop=_name_);
by id;
var recorded;
run;
data want;
merge have1 have3;
by id;
array col{*} col:;
_dif=abs(DATDIF(initial,col{1},'ACT/ACT'));
_pos=1;
do i=2 to dim(col);
if col{i} ne . then do;
_tmp=abs(DATDIF(initial,col{i},'ACT/ACT'));
if _tmp<_dif then do;
_dif=_tmp;
_pos=i;
end;
end;
end;
Recorded =col{_pos};
format Recorded mmddyy10.;
drop col: _: i;
run;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
It could be done with SQL by taking the minimum of the absolute difference between dates and grouping by ID and is a lot less code than the array solution.
proc sql number;
select h1.id,initial, recorded as closest_recorded, abs(recorded-initial) as diff
from have1 as h1
left join have2 as h2
on h1.id=h2.id
group by h1.id, initial
having diff=min(diff);
quit;