Hi there,
I have a column containing really long test and I am struggling to extract the date portion in the string. The date format is yyyymmdd.
An example of the string:
Schoolrecords_class1B_20190809_2209.doc
The character strings vary and I therefore cannot specify a position of the string.
But the format is always 8 digits, correct?
Ok. Try this
data want;
string = 'Schoolrecords_class1B_20190809_2209.doc';
c = prxchange('s/.*_(\d{8})_.*/$1/', -1, string);
date = input(c, yymmdd8.);
format date yymmdd10.;
run;
Small change
data want;
string = 'TST_NoRecs_BBBGRADE12BEFE20210701_220019.doc';
c = prxchange('s/.*(\d{8}).*/$1/', -1, string);
date = input(c, yymmdd8.);
format date yymmdd10.;
run;
Changed the regular expression, so it doesn't catch 12345678 etc.
data want;
string = 'TST_NoRecs_ BBBGRADE22010508_24-T123456789.doc';
c = prxchange('s/.*([12]\d{3}[01]\d[0123]\d).*/$1/', -1, string);
date = input(c, yymmdd8.);
format date yymmdd10.;
run;
Glad to help 🙂
Can you describe the exact logic of the string / date you want to extract?
With scan function you can pick third "word" from original string.
data want;
set have;
Attrib Datevar format=yymmdd10.;
dateVar = input(Scan(DocName,3,'_'),yymmdd10.);
run;
data have;
input have $80.;
pid=prxparse('/\d{8}/');
call prxsubstr(pid,have,p,l);
if p then want=substr(have,p,l);
drop pid p l;
cards;
Schoolrecords_class1B_20190809_2209.doc
;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.