Extract a sentence containing a keyword

Solved
Regular Contributor
Posts: 161

Dear All:

I am wondering how to extract a whole sentence from a cell like the following when I only want the sentence containing "expects" and "anticipates". In the following case, I only want the sentence "For the full 2014 year, the company expects to spend less than an aggregate of $1.0 million on maintenance capital expenditures and expansion capital expenditures."

Adjusted EBITDA attributable to partners was $8,537,000. Capital expenditures were $83,000 against $590,000 a year ago. Distributable cash flow attributable to partners was $8,342,000. For the full 2014 year, the company expects to spend less than an aggregate of $1.0 million on maintenance capital expenditures and expansion capital expenditures.

Accepted Solutions
Solution
‎08-19-2014 11:21 AM
Super User
Posts: 10,770

Re: Extract a sentence containing a keyword

What does your real data look like?

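```
data have;
length text $ 500;
/* Build the sample text in two pieces so no string literal spans a line break */
text='Adjusted EBITDA attributable to partners was $8,537,000. Capital expenditures were $83,000 against $590,000 a year ago. Distributable cash flow attributable to partners was $8,342,000. '
||'For the full 2014 year, the company expects to spend less than an aggregate of $1.0 million on maintenance capital expenditures and expansion capital expenditures.';
run;

data want;
set have;
length found $ 500;
/* A sentence: non-period characters, or periods followed by a digit (as in $1.0), ending in a period */
pid=prxparse('/([^\.]|\.(?=\d))+\./o');
start=1;stop=length(text);
/* Walk through the text one sentence at a time */
call prxnext(pid,start,stop,text,p,l);
do while(p>0);
found=strip(substr(text,p,l));
/* Keep only sentences that contain the keyword, ignoring case */
if find(found,'expects','i') then output;
call prxnext(pid,start,stop,text,p,l);
end;
run;
```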

Xia Keshan

All Replies
Regular Contributor
Posts: 161

Re: Extract a sentence containing a keyword

Hi, Ksharp:

Many thanks for your example. I'll see if I can develop it further.

I was hoping to use SAS to make text mining a bit easier. I have a variable called "text" that contains a lot of paragraphs. From the example you gave, it seems regular expressions are unavoidable.

I have a list of hit words like "expect", "anticipate", and "forecast". When any of these keywords is present, I want to extract the sentence containing it for further processing.
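For a list of hit words like this, one possible approach is to keep the sentence-splitting pattern from the accepted solution and test each sentence against a second, case-insensitive pattern that lists the keyword stems as alternatives. The sketch below is only an illustration: the sample text, the stems `expect|anticipat|forecast`, and the variable names `sent_id`, `kw_id`, `pos`, and `len` are assumptions, not part of the original answer.

```sas
/* Sketch only: sample text and keyword stems are illustrative assumptions */
data have;
  length text $ 500;
  text = 'Capital expenditures were $83,000 against $590,000 a year ago. '
      || 'For the full 2014 year, the company expects to spend less than $1.0 million. '
      || 'Management anticipates higher volumes next year.';
run;

data want;
  set have;
  length found $ 500;
  /* Compile both patterns once and keep their ids across rows */
  retain sent_id kw_id;
  if _n_ = 1 then do;
    /* Sentence: non-period characters, or periods followed by a digit, ending in a period */
    sent_id = prxparse('/([^.]|\.(?=\d))+\./');
    /* Any keyword stem that starts at a word boundary, ignoring case */
    kw_id = prxparse('/\b(expect|anticipat|forecast)/i');
  end;
  start = 1;
  stop = length(text);
  call prxnext(sent_id, start, stop, text, pos, len);
  do while (pos > 0);
    found = strip(substr(text, pos, len));
    /* Output the sentence if it contains at least one keyword stem */
    if prxmatch(kw_id, found) then output;
    call prxnext(sent_id, start, stop, text, pos, len);
  end;
  keep found;
run;
```

Matching on stems also picks up variants such as "expected" or "forecasts". Note that the sentence pattern treats every period not followed by a digit as a sentence end, so abbreviations like "Inc." would still split a sentence early.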

Is SAS Text Miner more efficient for this kind of task?

Super User
Posts: 10,770

Re: Extract a sentence containing a keyword

I don't know. I have never used SAS Text Miner. And with my SAS skills, I don't think I need SAS Text Miner, which would cost me a lot of money. Just kidding.

Xia Keshan

Regular Contributor
Posts: 161

Re: Extract a sentence containing a keyword

It seems all I need to do is invest some time in Perl regular expressions in SAS. Thanks, Ksharp. This works great!

Posts: 1,270

Re: Extract a sentence containing a keyword

Hi,

Your required sentence contains only "expects", but your condition requires both "expects" and "anticipates".
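The difference between the two readings can be checked directly. The following is a minimal sketch, assuming a shortened version of the target sentence and hypothetical flag names (`has_expects`, `has_anticipates`); it tests the same sentence under an "either keyword" rule and a "both keywords" rule.

```sas
/* Sketch only: the shortened sentence and the flag names are illustrative */
data _null_;
  sentence = 'For the full 2014 year, the company expects to spend less than $1.0 million.';
  /* FIND returns the position of the match, or 0 when there is none */
  has_expects     = find(sentence, 'expects', 'i') > 0;
  has_anticipates = find(sentence, 'anticipates', 'i') > 0;
  either = has_expects or has_anticipates;
  both   = has_expects and has_anticipates;
  /* Writes either=1 both=0 to the log */
  put either= both=;
run;
```

So the sentence is kept under the "either" rule, which is what the accepted solution effectively applies for a single keyword, and dropped under a strict "both" rule.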

🔒 This topic is solved and locked.