Hello,
I need some suggestions on how I can extract a date in nn/nn/yyyy format. I use 'nn' rather than mm/dd because the data is entered in such a way where sometimes its mm/dd/yyyy or dd/mm/yyyy etc... but date always presents in some format using back slashes. I think prx may be the only was to do this, but I'm not familiar with it at all.
For example:
'HAS COMPLETED THE TRIAL. DATE OF LAST SCAN SHOWING STABLE DISEASE- 30/10/2020'
or
TRIAL COMPLETED. LAST CT SCAN SHOWING STABLE DISEASE WAS 05/NOV/2020
or
COMPLETION OF 2 YEARS TREATMENT- CT 21/04/21 SHOWS STABLE DISEASE
I found this code that searches for dates embeded in text, and it worked great, but only output '1' ...for yes a date exisits. I feel like it could be modified to output an actual date to a new variablee, but not sure where to start there.
Any help is greatly appreciated!!
date_search =
prxparse("m*january|february|march|april|june|july|august|september|october
|november|december|jan\.|feb\.|mar\.|apr\.|may\.|jun\.|jul\.|aug\.|sep\.|oc
t\.|nov\.|dec\.|, 19|, 20|\d-
|\djan|\dfeb|\dmar|\dapr|\dmay|\djun|\djul|\daug|\dsep|\doct|\dnov|\ddec|ja
n,|feb,|mar,|apr,|may,|jun,|jul,|aug,|sep,|oct,|nov,|dec,|jan \d|feb \d|mar
\d|apr \d|may \d|jun \d|jul \d|aug \d|sep \d|oct \d|nov \d|dec
\d|jan\d|feb\d|mar\d|apr\d|may\d|jun\d|jul\d|aug\d|sep\d|oct\d|nov\d|dec\d|
\d/|\d\d \d\d \d\d|\d:\d|\d\.\d\d\.\d|\d\.\d\.\d|jan-|feb-|mar-|apr-|may-
|jun-|jul-|aug-|sep-|oct-|nov-|dec-*oi");
if prxmatch(date_search,EOTDTHR )>0 then output;
This is a non-prx way to get to some of this. Doesn't handle all cases as shown but easy enough to add conditions.
data have;
orig='HAS COMPLETED THE TRIAL. DATE OF LAST SCAN SHOWING STABLE DISEASE- 30/10/2020';
output;
orig='TRIAL COMPLETED. LAST CT SCAN SHOWING STABLE DISEASE WAS 05/NOV/2020';
output;
orig='COMPLETION OF 2 YEARS TREATMENT- CT 21/04/21 SHOWS STABLE DISEASE';
output;
run;
data want;
set have;
length found_date $20.;
retain date found_date;
date=.;
found_date='';
format date date9.;
do i=1 to countw(orig);
word=scan(orig, i, " ");
if find(word, "/")>0 then
do;
date=input(word, anydtdte.);
found_date=word;
end;
end;
drop i word;
run;
@jenim514 wrote:
Hello,
I need some suggestions on how I can extract a date in nn/nn/yyyy format. I use 'nn' rather than mm/dd because the data is entered in such a way where sometimes its mm/dd/yyyy or dd/mm/yyyy etc... but date always presents in some format using back slashes. I think prx may be the only was to do this, but I'm not familiar with it at all.
For example:
'HAS COMPLETED THE TRIAL. DATE OF LAST SCAN SHOWING STABLE DISEASE- 30/10/2020'
or
TRIAL COMPLETED. LAST CT SCAN SHOWING STABLE DISEASE WAS 05/NOV/2020
or
COMPLETION OF 2 YEARS TREATMENT- CT 21/04/21 SHOWS STABLE DISEASE
I found this code that searches for dates embeded in text, and it worked great, but only output '1' ...for yes a date exisits. I feel like it could be modified to output an actual date to a new variablee, but not sure where to start there.
Any help is greatly appreciated!!
date_search = prxparse("m*january|february|march|april|june|july|august|september|october |november|december|jan\.|feb\.|mar\.|apr\.|may\.|jun\.|jul\.|aug\.|sep\.|oc t\.|nov\.|dec\.|, 19|, 20|\d- |\djan|\dfeb|\dmar|\dapr|\dmay|\djun|\djul|\daug|\dsep|\doct|\dnov|\ddec|ja n,|feb,|mar,|apr,|may,|jun,|jul,|aug,|sep,|oct,|nov,|dec,|jan \d|feb \d|mar \d|apr \d|may \d|jun \d|jul \d|aug \d|sep \d|oct \d|nov \d|dec \d|jan\d|feb\d|mar\d|apr\d|may\d|jun\d|jul\d|aug\d|sep\d|oct\d|nov\d|dec\d| \d/|\d\d \d\d \d\d|\d:\d|\d\.\d\d\.\d|\d\.\d\.\d|jan-|feb-|mar-|apr-|may- |jun-|jul-|aug-|sep-|oct-|nov-|dec-*oi"); if prxmatch(date_search,EOTDTHR )>0 then output;
This is a non-prx way to get to some of this. Doesn't handle all cases as shown but easy enough to add conditions.
data have;
orig='HAS COMPLETED THE TRIAL. DATE OF LAST SCAN SHOWING STABLE DISEASE- 30/10/2020';
output;
orig='TRIAL COMPLETED. LAST CT SCAN SHOWING STABLE DISEASE WAS 05/NOV/2020';
output;
orig='COMPLETION OF 2 YEARS TREATMENT- CT 21/04/21 SHOWS STABLE DISEASE';
output;
run;
data want;
set have;
length found_date $20.;
retain date found_date;
date=.;
found_date='';
format date date9.;
do i=1 to countw(orig);
word=scan(orig, i, " ");
if find(word, "/")>0 then
do;
date=input(word, anydtdte.);
found_date=word;
end;
end;
drop i word;
run;
@jenim514 wrote:
Hello,
I need some suggestions on how I can extract a date in nn/nn/yyyy format. I use 'nn' rather than mm/dd because the data is entered in such a way where sometimes its mm/dd/yyyy or dd/mm/yyyy etc... but date always presents in some format using back slashes. I think prx may be the only was to do this, but I'm not familiar with it at all.
For example:
'HAS COMPLETED THE TRIAL. DATE OF LAST SCAN SHOWING STABLE DISEASE- 30/10/2020'
or
TRIAL COMPLETED. LAST CT SCAN SHOWING STABLE DISEASE WAS 05/NOV/2020
or
COMPLETION OF 2 YEARS TREATMENT- CT 21/04/21 SHOWS STABLE DISEASE
I found this code that searches for dates embeded in text, and it worked great, but only output '1' ...for yes a date exisits. I feel like it could be modified to output an actual date to a new variablee, but not sure where to start there.
Any help is greatly appreciated!!
date_search = prxparse("m*january|february|march|april|june|july|august|september|october |november|december|jan\.|feb\.|mar\.|apr\.|may\.|jun\.|jul\.|aug\.|sep\.|oc t\.|nov\.|dec\.|, 19|, 20|\d- |\djan|\dfeb|\dmar|\dapr|\dmay|\djun|\djul|\daug|\dsep|\doct|\dnov|\ddec|ja n,|feb,|mar,|apr,|may,|jun,|jul,|aug,|sep,|oct,|nov,|dec,|jan \d|feb \d|mar \d|apr \d|may \d|jun \d|jul \d|aug \d|sep \d|oct \d|nov \d|dec \d|jan\d|feb\d|mar\d|apr\d|may\d|jun\d|jul\d|aug\d|sep\d|oct\d|nov\d|dec\d| \d/|\d\d \d\d \d\d|\d:\d|\d\.\d\d\.\d|\d\.\d\.\d|jan-|feb-|mar-|apr-|may- |jun-|jul-|aug-|sep-|oct-|nov-|dec-*oi"); if prxmatch(date_search,EOTDTHR )>0 then output;
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.