BookmarkSubscribeRSS Feed
banoor
Calcite | Level 5

Hi,

I have a database of 150 cancer patients. I calculated the expected disease assessment dates that occurs every 28 days for the next 3 years. The site performed disease assessment on expected dates but missed a few dates. i have my input in 2 different tables. Table 1: calculated disease assessment dates for 3 years. Table 2 : Site assessment dates. My output should capture the missing expected disease assessment dates/ incorrect assessment dates. If the date matches, it should give a blank column, else it should give the missing/incorrect date. I am trying out various techniques using loops and arrays but I end up getting huge output. Output should be restricted to one row per subject.

Table 1:

Subject       pd_1               pd_2                 pd_3                      pd_4                 ... pd_36

100             21NOV2013   19DEC2013       16JAN2014        3FEB2014

Table 2:

Subject           Response_Date

100                    2013-11-21

100                    2013-12-19

100                    2014-01-16

100                    2014-02-14

100                    2014-03-13

Output:

Subject     pd_1     pd_2     pd_3     pd_4

100                                                  03Feb2014

6 REPLIES 6
Ksharp
Super User

OK. if you like Hash Table.

data Table1;
input Subject    (   pd_1               pd_2                 pd_3                      pd_4  ) (:date9.);
format pd_: date9.;
cards; 
100             21NOV2013   19DEC2013       16JAN2014        3FEB2014
;
run;
data Table2;
input Subject           Response_Date : yymmdd10.;
format Response_Date date9.;
cards;
100                    2013-11-21
100                    2013-12-19
100                    2014-01-16
100                    2014-02-14
100                    2014-03-13
;
run;
data want(drop=i Response_Date);
if _n_ eq 1 then do;
 if 0 then set table2;
 declare hash ha(dataset:'table2',hashexp:16);
  ha.definekey('Subject','Response_Date');
  ha.definedone();
end;
 set table1;
 array x{*} pd_:;
 do i=1 to dim(x);
  Response_Date=x{i}; 
  if ha.check()=0 then call missing(x{i});
 end;
run;


Xia Keshan

banoor
Calcite | Level 5

Thanks keshan. This was very helpful. I finally got what I expected. Also you have introduced me to new world of "Hash". This data step functionality is very versatile. Is there a possibility if I can add a window period of +/- 7 days to get the match.  If my response date is within a +/- 7 days of pd_ dates, then it should not give me a record.

Ksharp
Super User

Sure . That is possible .

data Table1;
input Subject    (   pd_1               pd_2                 pd_3                      pd_4  ) (:date9.);
format pd_: date9.;
cards; 
100             21NOV2013   19DEC2013       16JAN2014        3FEB2014
;
run;
data Table2;
input Subject           Response_Date : yymmdd10.;
format Response_Date date9.;
cards;
100                    2013-11-29
100                    2013-12-19
100                    2014-01-16
100                    2014-02-14
100                    2014-03-13
;
run;
data want(drop=i Response_Date date);
if _n_ eq 1 then do;
 if 0 then set table2;
 declare hash ha(dataset:'table2',hashexp:16);
  ha.definekey('Subject','Response_Date');
  ha.definedone();
end;
 set table1;
 array x{*} pd_:;
 do i=1 to dim(x);
  do date=x{i}-7 to x{i}+7      ;
   Response_Date=date; 
   if ha.check()=0 then call missing(x{i});
  end;
 end;
run;



Xia Keshan

sandra17
Calcite | Level 5

Hope it helps. Please dont mind if it is not the way you want. Thanks

banoor
Calcite | Level 5

Thanks sandra. I will try this out and get back to you.

RW9
Diamond | Level 26 RW9
Diamond | Level 26

Hi,

A third possibility:

proc transpose data=table2 out=temp;  /* assumes sorted */
  by subject;
  var response_date;
run;

data want (drop=i j col1-col5 _name_);
  merge table1 temp;
  by subject;
  array pd_{5};
  array col{5};
  do i=1 to 5;
    do j=1 to 5;
      if pd_{i}=col{j} then pd_{i}=.;
    end;
  end;
  format pd_1-pd_5 date9.;
run;

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

New Learning Events in April

 

Join us for two new fee-based courses: Administrative Healthcare Data and SAS via Live Web Monday-Thursday, April 24-27 from 1:00 to 4:30 PM ET each day. And Administrative Healthcare Data and SAS: Hands-On Programming Workshop via Live Web on Friday, April 28 from 9:00 AM to 5:00 PM ET.

LEARN MORE

Discussion stats
  • 6 replies
  • 3808 views
  • 6 likes
  • 4 in conversation