Hi,
I have a database of 150 cancer patients. I calculated the expected disease assessment dates that occurs every 28 days for the next 3 years. The site performed disease assessment on expected dates but missed a few dates. i have my input in 2 different tables. Table 1: calculated disease assessment dates for 3 years. Table 2 : Site assessment dates. My output should capture the missing expected disease assessment dates/ incorrect assessment dates. If the date matches, it should give a blank column, else it should give the missing/incorrect date. I am trying out various techniques using loops and arrays but I end up getting huge output. Output should be restricted to one row per subject.
Table 1:
Subject pd_1 pd_2 pd_3 pd_4 ... pd_36
100 21NOV2013 19DEC2013 16JAN2014 3FEB2014
Table 2:
Subject Response_Date
100 2013-11-21
100 2013-12-19
100 2014-01-16
100 2014-02-14
100 2014-03-13
Output:
Subject pd_1 pd_2 pd_3 pd_4
100 03Feb2014
OK. if you like Hash Table.
data Table1; input Subject ( pd_1 pd_2 pd_3 pd_4 ) (:date9.); format pd_: date9.; cards; 100 21NOV2013 19DEC2013 16JAN2014 3FEB2014 ; run; data Table2; input Subject Response_Date : yymmdd10.; format Response_Date date9.; cards; 100 2013-11-21 100 2013-12-19 100 2014-01-16 100 2014-02-14 100 2014-03-13 ; run; data want(drop=i Response_Date); if _n_ eq 1 then do; if 0 then set table2; declare hash ha(dataset:'table2',hashexp:16); ha.definekey('Subject','Response_Date'); ha.definedone(); end; set table1; array x{*} pd_:; do i=1 to dim(x); Response_Date=x{i}; if ha.check()=0 then call missing(x{i}); end; run;
Xia Keshan
Thanks keshan. This was very helpful. I finally got what I expected. Also you have introduced me to new world of "Hash". This data step functionality is very versatile. Is there a possibility if I can add a window period of +/- 7 days to get the match. If my response date is within a +/- 7 days of pd_ dates, then it should not give me a record.
Sure . That is possible .
data Table1; input Subject ( pd_1 pd_2 pd_3 pd_4 ) (:date9.); format pd_: date9.; cards; 100 21NOV2013 19DEC2013 16JAN2014 3FEB2014 ; run; data Table2; input Subject Response_Date : yymmdd10.; format Response_Date date9.; cards; 100 2013-11-29 100 2013-12-19 100 2014-01-16 100 2014-02-14 100 2014-03-13 ; run; data want(drop=i Response_Date date); if _n_ eq 1 then do; if 0 then set table2; declare hash ha(dataset:'table2',hashexp:16); ha.definekey('Subject','Response_Date'); ha.definedone(); end; set table1; array x{*} pd_:; do i=1 to dim(x); do date=x{i}-7 to x{i}+7 ; Response_Date=date; if ha.check()=0 then call missing(x{i}); end; end; run;
Xia Keshan
Hope it helps. Please dont mind if it is not the way you want. Thanks
Thanks sandra. I will try this out and get back to you.
Hi,
A third possibility:
proc transpose data=table2 out=temp; /* assumes sorted */
by subject;
var response_date;
run;
data want (drop=i j col1-col5 _name_);
merge table1 temp;
by subject;
array pd_{5};
array col{5};
do i=1 to 5;
do j=1 to 5;
if pd_{i}=col{j} then pd_{i}=.;
end;
end;
format pd_1-pd_5 date9.;
run;
Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.
Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.
Join us for two new fee-based courses: Administrative Healthcare Data and SAS via Live Web Monday-Thursday, April 24-27 from 1:00 to 4:30 PM ET each day. And Administrative Healthcare Data and SAS: Hands-On Programming Workshop via Live Web on Friday, April 28 from 9:00 AM to 5:00 PM ET.