Hello,
I have a text string variable tha reads:
"PLEASE LOAD TABLE 1394, SAME RATE AS TABLE 276."
Like this variable, there are other instances of unstructured data in this field.
My working thought is to create keywords for the column to extract from the text.
In this example, TABLE is the keyword. I would like to extract "Table 1394" into one variable and "Table 276" into another.
I've worked with Index, Scan, and find with varied results, but I am finding myself getting further away from my desired result. 😞
if find(lowcase(txtstring),'table') >0 then table='table';
do i=1 to countw(txtstring," ");
if scan(lowcase(txtstring),i," ")=table then do;
after=strip(scan(txtstring,i+1," "));
end;
end;
output; end;
this produces only the first iteration of table:
table after
table 276
Any guidance is greatly appreciated.
Like this?
data HAVE;
TEXT="PLEASE LOAD TABLE 1394, SAME RATE AS TABLE 276. "; output;
TEXT="PLEASE LOAD TABLE 139, SAME table 2 AS TABLE 27."; output;
data WANT;
set HAVE;
if _N_=1 then REGEX+prxparse('/(TABLE \d*)/i');
START = 1;
STOP = length(TEXT);
call prxnext(REGEX, START, STOP, TEXT, POS, LEN);
do while (POS);
STR = substr(TEXT,POS,LEN);
output;
call prxnext(REGEX, START, STOP, TEXT, POS, LEN);
end;
keep STR;
run;
proc print data=WANT;
run;
Obs | STR |
---|---|
1 | TABLE 1394 |
2 | TABLE 276 |
3 | TABLE 139 |
4 | table 2 |
5 | TABLE 27 |
Like this?
data HAVE;
TEXT="PLEASE LOAD TABLE 1394, SAME RATE AS TABLE 276. "; output;
TEXT="PLEASE LOAD TABLE 139, SAME table 2 AS TABLE 27."; output;
data WANT;
set HAVE;
if _N_=1 then REGEX+prxparse('/(TABLE \d*)/i');
START = 1;
STOP = length(TEXT);
call prxnext(REGEX, START, STOP, TEXT, POS, LEN);
do while (POS);
STR = substr(TEXT,POS,LEN);
output;
call prxnext(REGEX, START, STOP, TEXT, POS, LEN);
end;
keep STR;
run;
proc print data=WANT;
run;
Obs | STR |
---|---|
1 | TABLE 1394 |
2 | TABLE 276 |
3 | TABLE 139 |
4 | table 2 |
5 | TABLE 27 |
You needn't do a DO loop.
If say you want to also capture CHAIR 99, you can use:
REGEX+prxparse('/(TABLE \d*|CHAIR \d*)/i');
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.