Hi,
I have a database of 150 cancer patients. I calculated the expected disease assessment dates that occurs every 28 days for the next 3 years. The site performed disease assessment on expected dates but missed a few dates. i have my input in 2 different tables. Table 1: calculated disease assessment dates for 3 years. Table 2 : Site assessment dates. My output should capture the missing expected disease assessment dates/ incorrect assessment dates. If the date matches, it should give a blank column, else it should give the missing/incorrect date. I am trying out various techniques using loops and arrays but I end up getting huge output. Output should be restricted to one row per subject.
Table 1:
Subject pd_1 pd_2 pd_3 pd_4 ... pd_36
100 21NOV2013 19DEC2013 16JAN2014 3FEB2014
Table 2:
Subject Response_Date
100 2013-11-21
100 2013-12-19
100 2014-01-16
100 2014-02-14
100 2014-03-13
Output:
Subject pd_1 pd_2 pd_3 pd_4
100 03Feb2014
OK. if you like Hash Table.
data Table1; input Subject ( pd_1 pd_2 pd_3 pd_4 ) (:date9.); format pd_: date9.; cards; 100 21NOV2013 19DEC2013 16JAN2014 3FEB2014 ; run; data Table2; input Subject Response_Date : yymmdd10.; format Response_Date date9.; cards; 100 2013-11-21 100 2013-12-19 100 2014-01-16 100 2014-02-14 100 2014-03-13 ; run; data want(drop=i Response_Date); if _n_ eq 1 then do; if 0 then set table2; declare hash ha(dataset:'table2',hashexp:16); ha.definekey('Subject','Response_Date'); ha.definedone(); end; set table1; array x{*} pd_:; do i=1 to dim(x); Response_Date=x{i}; if ha.check()=0 then call missing(x{i}); end; run;
Xia Keshan
Thanks keshan. This was very helpful. I finally got what I expected. Also you have introduced me to new world of "Hash". This data step functionality is very versatile. Is there a possibility if I can add a window period of +/- 7 days to get the match. If my response date is within a +/- 7 days of pd_ dates, then it should not give me a record.
Sure . That is possible .
data Table1; input Subject ( pd_1 pd_2 pd_3 pd_4 ) (:date9.); format pd_: date9.; cards; 100 21NOV2013 19DEC2013 16JAN2014 3FEB2014 ; run; data Table2; input Subject Response_Date : yymmdd10.; format Response_Date date9.; cards; 100 2013-11-29 100 2013-12-19 100 2014-01-16 100 2014-02-14 100 2014-03-13 ; run; data want(drop=i Response_Date date); if _n_ eq 1 then do; if 0 then set table2; declare hash ha(dataset:'table2',hashexp:16); ha.definekey('Subject','Response_Date'); ha.definedone(); end; set table1; array x{*} pd_:; do i=1 to dim(x); do date=x{i}-7 to x{i}+7 ; Response_Date=date; if ha.check()=0 then call missing(x{i}); end; end; run;
Xia Keshan
Hope it helps. Please dont mind if it is not the way you want. Thanks
Thanks sandra. I will try this out and get back to you.
Hi,
A third possibility:
proc transpose data=table2 out=temp; /* assumes sorted */
by subject;
var response_date;
run;
data want (drop=i j col1-col5 _name_);
merge table1 temp;
by subject;
array pd_{5};
array col{5};
do i=1 to 5;
do j=1 to 5;
if pd_{i}=col{j} then pd_{i}=.;
end;
end;
format pd_1-pd_5 date9.;
run;
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Need courses to help you with SAS Life Sciences Analytics Framework, SAS Health Cohort Builder, or other topics? Check out the Health and Life Sciences learning path for all of the offerings.