I have a wide dataset with 6 dates and 6 types. Each type number corresponds to the date number. I need to compare each date and type to each other (2 at a time) and find the earliest date that these requirements are met:
Criteria
If both types = P, the difference between the dates must be 17 or more.
All other combinations must be 24 or more days apart
I have to find the earliest date match. So in my example, my "have" dataset has all 6 entries. My want dataset has the earliest two dates that meet my criteria which are 2 & 4. I can calculate this manually but can't figure out how to do it in SAS. I would love to have this in some sort of iterative macro program instead of taking up hundreds of lines of code.
My logic on paper
1. Calculate # of days between each admin_date (all permutations)
2. Ignore any that are not at least 17 days apart
3. Among those that are >= 17 and <24, check to see if both types are P. If so, take the combination with the earliest 2nd date.
4. If not, check those that are >= 24. Take the combination with the earliest 2nd date.
My calculations for Day Differences Between:
1 & 2: 1 day (ignore, less than 17)
1 & 3: 7 days (ignore, less than 17)
1 & 4: 18 days (ignore, not both P)
1 & 5: 20 days (ignore, not both P)
1 & 6: 34 days (consider, ge 24)
2 & 3: 6 days (ignore, less than 17)
2 & 4: 17 days (consider, ge 17, both P)
2 & 5: 19 days (consider, ge 17, both P)
2 & 6: 33 days (consider, ge 24)
and so forth
Among all of the sets that met the criteria: 1 and 6, 2 and 4, 2 and 5, 2 and 6. 4 is the earliest so that's what I want.
data have; input person $ admin_date1 : ?? mmddyy10. admin_date2 : ??mmddyy10. admin_date3 : ?? mmddyy10. admin_date4 : ?? mmddyy10. admin_date5 : ?? mmddyy10. admin_date6 : ?? mmddyy10. type1 $ type2 $ type3 $ type4 $ type5 $ type6 $; format admin_date1 mmddyy10. admin_date2 mmddyy10. admin_date3 mmddyy10. admin_date4 mmddyy10. admin_date5 mmddyy10. admin_date6 mmddyy10.; datalines; JohnDoe 01/12/2021 01/13/2021 01/19/2021 01/30/2021 02/01/2021 02/15/2021 M P M P P J ; run; data want; input person $ admin_date1 : ?? mmddyy10. admin_date2 : ??mmddyy10. type1 $ type2 $ ; format admin_date1 mmddyy10. admin_date2 mmddyy10. ; datalines; JohnDoe 01/13/2021 01/30/2021 P P ; run;
This is clunky but what I figured out on my own and works.
data have;
input person $
admin_date_1 : ?? mmddyy10.
admin_date_2 : ??mmddyy10.
admin_date_3 : ?? mmddyy10.
admin_date_4 : ?? mmddyy10.
admin_date_5 : ?? mmddyy10.
admin_date_6 : ?? mmddyy10.
type_1 $
type_2 $
type_3 $
type_4 $
type_5 $
type_6 $;
format admin_date_1 mmddyy10.
admin_date_2 mmddyy10.
admin_date_3 mmddyy10.
admin_date_4 mmddyy10.
admin_date_5 mmddyy10.
admin_date_6 mmddyy10.;
datalines;
JohnDoe 01/12/2021 01/13/2021 01/19/2021 01/30/2021 02/01/2021 02/15/2021 M P M P P J
;
run;
*in order for this to work correctly, you have to order the DAYS_BT_DOSE fields in ascending order like below in the format; %macro test; data want (keep=person new:); set have; format L_DAYS_BT_DOSE_1_2 L_DAYS_BT_DOSE_1_3 L_DAYS_BT_DOSE_2_3 L_DAYS_BT_DOSE_1_4 L_DAYS_BT_DOSE_2_4 L_DAYS_BT_DOSE_3_4 L_DAYS_BT_DOSE_1_5 L_DAYS_BT_DOSE_2_5 L_DAYS_BT_DOSE_3_5 L_DAYS_BT_DOSE_4_5 L_DAYS_BT_DOSE_1_6 L_DAYS_BT_DOSE_2_6 L_DAYS_BT_DOSE_3_6 L_DAYS_BT_DOSE_4_6 L_DAYS_BT_DOSE_5_6 F_DAYS_BT_DOSE_1_2 F_DAYS_BT_DOSE_1_3 F_DAYS_BT_DOSE_2_3 F_DAYS_BT_DOSE_1_4 F_DAYS_BT_DOSE_2_4 F_DAYS_BT_DOSE_3_4 F_DAYS_BT_DOSE_1_5 F_DAYS_BT_DOSE_2_5 F_DAYS_BT_DOSE_3_5 F_DAYS_BT_DOSE_4_5 F_DAYS_BT_DOSE_1_6 F_DAYS_BT_DOSE_2_6 F_DAYS_BT_DOSE_3_6 F_DAYS_BT_DOSE_4_6 F_DAYS_BT_DOSE_5_6 NEW_ADMIN_DT_1 NEW_ADMIN_DT_2 MMDDYY10. ; %do i = 1 %to 5; %let nxt = %eval(&i. + 1); %do b=&nxt %to 6; if TYPE_&i = 'P' and TYPE_&b = 'P' and intck('day',ADMIN_DATE_&i,ADMIN_DATE_&b) ge 17 then do; L_DAYS_BT_DOSE_&i._&b = ADMIN_DATE_&b; F_DAYS_BT_DOSE_&i._&b = ADMIN_DATE_&i; end; else if TYPE_&i in ('P','M','J') and TYPE_&b in ('P','M','J') and intck('day',ADMIN_DATE_&i,ADMIN_DATE_&b) ge 24 then do; L_DAYS_BT_DOSE_&i._&b = ADMIN_DATE_&b; F_DAYS_BT_DOSE_&i._&b = ADMIN_DATE_&i; end; %end; %end; NEW_ADMIN_DT_1 = COALESCE(of F_DAYS_BT_DOSE:); NEW_ADMIN_DT_2 = COALESCE(of L_DAYS_BT_DOSE:); run; %mend test; *execute macro; %test;
data test1;
set have;
array a admin_date1-admin_date6;
array b admin_date1-admin_date6 ;
array x $ type1-type6;
array y $ type1-type6;
array v(5,6) $10 a1b1-a1b6 a2b1-a2b6 a3b1-a3b6 a4b1-a4b6 a5b1-a5b6;
do i=1 to 5;
do j=i+1 to 6;
/* case 1: If both types = P, the difference between the dates must be 17 or more.
case 2: All other combinations must be 24 or more days apart*/
if (x(i)='P' and y(j)='P' and b(j)-a(i)>=17)
or ((x(i)^='P' or y(j)^='P') and b(j)-a(i)>=24) then
do;
v(i,j)='Y'; output;
end;
end;
end;
run;
data want;
set test1;
format admin_date_1-admin_date_6 mmddyy10.;
array a admin_date1-admin_date6;
array b admin_date_1-admin_date_6 ;
array x $ type1-type6;
array y $ type_1-type_6;
array v(5,6) $10 a1b1-a1b6 a2b1-a2b6 a3b1-a3b6 a4b1-a4b6 a5b1-a5b6;
do n=1 to 5;
do m=n+1 to 6;
if v(n,m)='Y' then do; b(n)=a(n); y(m)=x(m); end;
end;
end;
keep person i j admin_date_1-admin_date_6 type_1-type_6;
run;
This is clunky but what I figured out on my own and works.
data have;
input person $
admin_date_1 : ?? mmddyy10.
admin_date_2 : ??mmddyy10.
admin_date_3 : ?? mmddyy10.
admin_date_4 : ?? mmddyy10.
admin_date_5 : ?? mmddyy10.
admin_date_6 : ?? mmddyy10.
type_1 $
type_2 $
type_3 $
type_4 $
type_5 $
type_6 $;
format admin_date_1 mmddyy10.
admin_date_2 mmddyy10.
admin_date_3 mmddyy10.
admin_date_4 mmddyy10.
admin_date_5 mmddyy10.
admin_date_6 mmddyy10.;
datalines;
JohnDoe 01/12/2021 01/13/2021 01/19/2021 01/30/2021 02/01/2021 02/15/2021 M P M P P J
;
run;
*in order for this to work correctly, you have to order the DAYS_BT_DOSE fields in ascending order like below in the format; %macro test; data want (keep=person new:); set have; format L_DAYS_BT_DOSE_1_2 L_DAYS_BT_DOSE_1_3 L_DAYS_BT_DOSE_2_3 L_DAYS_BT_DOSE_1_4 L_DAYS_BT_DOSE_2_4 L_DAYS_BT_DOSE_3_4 L_DAYS_BT_DOSE_1_5 L_DAYS_BT_DOSE_2_5 L_DAYS_BT_DOSE_3_5 L_DAYS_BT_DOSE_4_5 L_DAYS_BT_DOSE_1_6 L_DAYS_BT_DOSE_2_6 L_DAYS_BT_DOSE_3_6 L_DAYS_BT_DOSE_4_6 L_DAYS_BT_DOSE_5_6 F_DAYS_BT_DOSE_1_2 F_DAYS_BT_DOSE_1_3 F_DAYS_BT_DOSE_2_3 F_DAYS_BT_DOSE_1_4 F_DAYS_BT_DOSE_2_4 F_DAYS_BT_DOSE_3_4 F_DAYS_BT_DOSE_1_5 F_DAYS_BT_DOSE_2_5 F_DAYS_BT_DOSE_3_5 F_DAYS_BT_DOSE_4_5 F_DAYS_BT_DOSE_1_6 F_DAYS_BT_DOSE_2_6 F_DAYS_BT_DOSE_3_6 F_DAYS_BT_DOSE_4_6 F_DAYS_BT_DOSE_5_6 NEW_ADMIN_DT_1 NEW_ADMIN_DT_2 MMDDYY10. ; %do i = 1 %to 5; %let nxt = %eval(&i. + 1); %do b=&nxt %to 6; if TYPE_&i = 'P' and TYPE_&b = 'P' and intck('day',ADMIN_DATE_&i,ADMIN_DATE_&b) ge 17 then do; L_DAYS_BT_DOSE_&i._&b = ADMIN_DATE_&b; F_DAYS_BT_DOSE_&i._&b = ADMIN_DATE_&i; end; else if TYPE_&i in ('P','M','J') and TYPE_&b in ('P','M','J') and intck('day',ADMIN_DATE_&i,ADMIN_DATE_&b) ge 24 then do; L_DAYS_BT_DOSE_&i._&b = ADMIN_DATE_&b; F_DAYS_BT_DOSE_&i._&b = ADMIN_DATE_&i; end; %end; %end; NEW_ADMIN_DT_1 = COALESCE(of F_DAYS_BT_DOSE:); NEW_ADMIN_DT_2 = COALESCE(of L_DAYS_BT_DOSE:); run; %mend test; *execute macro; %test;
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.