BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Ksmit
Obsidian | Level 7

Hello,

I have a text string variable tha reads:

"PLEASE LOAD TABLE 1394, SAME RATE AS TABLE 276."

Like this variable, there are other instances of unstructured data in this field.

 

My working thought is to create keywords for the column to extract from the text.

 

In this example, TABLE is the keyword. I would like to extract "Table 1394" into one variable and "Table 276" into another.

I've worked with Index, Scan, and find with varied results, but I am finding myself getting further away from my desired result. 😞

 

if find(lowcase(txtstring),'table') >0 then table='table';
do i=1 to countw(txtstring," ");
if scan(lowcase(txtstring),i," ")=table then do;
after=strip(scan(txtstring,i+1," "));
end;
end;
output; end;

 

this produces only the first iteration of table:

table after
table 276

 

Any guidance is greatly appreciated.

1 ACCEPTED SOLUTION

Accepted Solutions
ChrisNZ
Tourmaline | Level 20

Like this?

data HAVE;
  TEXT="PLEASE LOAD TABLE 1394, SAME RATE AS TABLE 276.  "; output;
  TEXT="PLEASE LOAD TABLE 139, SAME table 2  AS TABLE 27."; output;
data WANT;
  set HAVE;
  if _N_=1 then REGEX+prxparse('/(TABLE \d*)/i');
  START = 1;
  STOP  = length(TEXT);
  call prxnext(REGEX, START, STOP, TEXT, POS, LEN);
  do while (POS);
    STR   = substr(TEXT,POS,LEN);
    output;
    call prxnext(REGEX, START, STOP, TEXT, POS, LEN);
  end;
  keep STR;
run;
proc print data=WANT;
run;

Obs STR
1 TABLE 1394
2 TABLE 276
3 TABLE 139
4 table 2
5 TABLE 27

 

View solution in original post

4 REPLIES 4
ChrisNZ
Tourmaline | Level 20

Like this?

data HAVE;
  TEXT="PLEASE LOAD TABLE 1394, SAME RATE AS TABLE 276.  "; output;
  TEXT="PLEASE LOAD TABLE 139, SAME table 2  AS TABLE 27."; output;
data WANT;
  set HAVE;
  if _N_=1 then REGEX+prxparse('/(TABLE \d*)/i');
  START = 1;
  STOP  = length(TEXT);
  call prxnext(REGEX, START, STOP, TEXT, POS, LEN);
  do while (POS);
    STR   = substr(TEXT,POS,LEN);
    output;
    call prxnext(REGEX, START, STOP, TEXT, POS, LEN);
  end;
  keep STR;
run;
proc print data=WANT;
run;

Obs STR
1 TABLE 1394
2 TABLE 276
3 TABLE 139
4 table 2
5 TABLE 27

 

Ksmit
Obsidian | Level 7
Much appreciated Chris. This worked like a charm.

I am thinking about putting this in a do loop for as many keywords as I have (presently 6) this way if any of the words populate in each iteration it will create a counter column per keyword.

Thanks for starting me off.
ChrisNZ
Tourmaline | Level 20

You needn't do a DO loop.

If say you want to also capture CHAIR 99, you can use:

REGEX+prxparse('/(TABLE \d*|CHAIR \d*)/i');

 

Ksmit
Obsidian | Level 7
EVEN Better! Thanks for this follow up as it got me exactly what I needed.

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 4 replies
  • 961 views
  • 1 like
  • 2 in conversation