This is a non-prx way to get to some of this. Doesn't handle all cases as shown but easy enough to add conditions.
data have;
orig='HAS COMPLETED THE TRIAL. DATE OF LAST SCAN SHOWING STABLE DISEASE- 30/10/2020';
output;
orig='TRIAL COMPLETED. LAST CT SCAN SHOWING STABLE DISEASE WAS 05/NOV/2020';
output;
orig='COMPLETION OF 2 YEARS TREATMENT- CT 21/04/21 SHOWS STABLE DISEASE';
output;
run;
data want;
set have;
length found_date $20.;
retain date found_date;
date=.;
found_date='';
format date date9.;
do i=1 to countw(orig);
word=scan(orig, i, " ");
if find(word, "/")>0 then
do;
date=input(word, anydtdte.);
found_date=word;
end;
end;
drop i word;
run;
@jenim514 wrote:
Hello,
I need some suggestions on how I can extract a date in nn/nn/yyyy format. I use 'nn' rather than mm/dd because the data is entered in such a way where sometimes its mm/dd/yyyy or dd/mm/yyyy etc... but date always presents in some format using back slashes. I think prx may be the only was to do this, but I'm not familiar with it at all.
For example:
'HAS COMPLETED THE TRIAL. DATE OF LAST SCAN SHOWING STABLE DISEASE- 30/10/2020'
or
TRIAL COMPLETED. LAST CT SCAN SHOWING STABLE DISEASE WAS 05/NOV/2020
or
COMPLETION OF 2 YEARS TREATMENT- CT 21/04/21 SHOWS STABLE DISEASE
I found this code that searches for dates embeded in text, and it worked great, but only output '1' ...for yes a date exisits. I feel like it could be modified to output an actual date to a new variablee, but not sure where to start there.
Any help is greatly appreciated!!
date_search =
prxparse("m*january|february|march|april|june|july|august|september|october
|november|december|jan\.|feb\.|mar\.|apr\.|may\.|jun\.|jul\.|aug\.|sep\.|oc
t\.|nov\.|dec\.|, 19|, 20|\d-
|\djan|\dfeb|\dmar|\dapr|\dmay|\djun|\djul|\daug|\dsep|\doct|\dnov|\ddec|ja
n,|feb,|mar,|apr,|may,|jun,|jul,|aug,|sep,|oct,|nov,|dec,|jan \d|feb \d|mar
\d|apr \d|may \d|jun \d|jul \d|aug \d|sep \d|oct \d|nov \d|dec
\d|jan\d|feb\d|mar\d|apr\d|may\d|jun\d|jul\d|aug\d|sep\d|oct\d|nov\d|dec\d|
\d/|\d\d \d\d \d\d|\d:\d|\d\.\d\d\.\d|\d\.\d\.\d|jan-|feb-|mar-|apr-|may-
|jun-|jul-|aug-|sep-|oct-|nov-|dec-*oi");
if prxmatch(date_search,EOTDTHR )>0 then output;
... View more