Extract a sentence containing a keyword

Regular Contributor
Posts: 161

Dear All:

Just wondering how to extract a whole sentence from a cell like the following, when I only want the sentence containing "expects" and "anticipates"? In the following case, I only want the sentence "For the full 2014 year, the company expects to spend less than an aggregate of $1.0 million on maintenance capital expenditures and expansion capital expenditures."

Adjusted EBITDA attributable to partners was $8,537,000. Capital expenditures were $83,000 against $590,000 a year ago. Distributable cash flow attributable to partners was $8,342,000.

For the full 2014 year, the company expects to spend less than an aggregate of $1.0 million on maintenance capital expenditures and expansion capital expenditures.

Accepted Solutions
Solution
08-19-2014 11:21 AM
Super User
Posts: 10,047

Re: Extract a sentence containing a keyword

Posted in reply to caveman529

What does your real data look like?

data have;
text='Adjusted EBITDA attributable to partners was $8,537,000. Capital expenditures were $83,000 against $590,000 a year ago. Distributable cash flow attributable to partners was $8,342,000. 
For the full 2014 year, the company expects to spend less than an aggregate of $1.0 million on maintenance capital expenditures and expansion capital expenditures.';
run;

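data want;
 set have;
 /* A sentence is a run of characters ending in a period. A period followed by a digit (as in $1.0) stays inside the sentence. */
 /* The o option compiles the pattern once. */
 pid=prxparse('/([^\.]|\.(?=\d))+\./o');
 start=1;stop=length(text);
 /* Find the first sentence: p is its position and l its length (p=0 when none is left). */
 call prxnext(pid,start,stop,text,p,l);
 do while(p>0);
  /* strip removes the blank that follows the previous sentence's period. */
  found=strip(substr(text,p,l));
  /* Keep the sentence if it contains the keyword, ignoring case. */
  if find(found,'expects','i') then output;
  call prxnext(pid,start,stop,text,p,l);
 end;
run;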

Xia Keshan


All Replies

Regular Contributor
Posts: 161

Re: Extract a sentence containing a keyword

Hi, Ksharp:

Many thanks for your example. I'll see if I can build on it.

I was hoping to use SAS to make text mining a bit easier. I have a variable called "text" that contains many paragraphs. But from the example you gave, it looks like regular expressions are unavoidable.

I have a list of hit words such as "expect", "anticipate" and "forecast". When any of these keywords is present, I want to extract the sentence containing it for further processing.

Is SAS Text Miner more efficient for this?
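
For a list of hit words like the one described above, one possible extension of the accepted solution is to compile the keywords into a single case-insensitive alternation and test each extracted sentence with prxmatch. This is only a sketch: the names sent_id and kw_id, the word stems, and the shortened sample text are assumptions made for illustration, not part of the original thread.

```sas
data have;
  length text $500;
  text='Adjusted EBITDA attributable to partners was $8,537,000. '
    || 'For the full 2014 year, the company expects to spend less than $1.0 million. '
    || 'Management anticipates no change in distributions.';
run;

data want;
  set have;
  length found $500;
  retain sent_id kw_id;
  if _n_=1 then do;
    /* Sentence pattern taken from the accepted solution */
    sent_id=prxparse('/([^\.]|\.(?=\d))+\./');
    /* Hypothetical hit list as word stems; \b anchors the start of a word */
    kw_id=prxparse('/\b(expect|anticipat|forecast)/i');
  end;
  start=1;
  stop=length(text);
  call prxnext(sent_id,start,stop,text,p,l);
  do while(p>0);
    found=strip(substr(text,p,l));
    /* prxmatch returns the match position, or 0 when no keyword is present */
    if prxmatch(kw_id,found) then output;
    call prxnext(sent_id,start,stop,text,p,l);
  end;
  drop sent_id kw_id start stop p l;
run;
```

Using stems such as "expect" also catches "expects" and "expected". If that is too broad, the alternation can list the exact word forms instead.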

Super User
Posts: 10,047

Re: Extract a sentence containing a keyword

Posted in reply to caveman529

I don't know. I have never used SAS Text Miner, and with my SAS skills I don't think I need it, since it would cost me a lot of money. Just kidding. :)

Xia Keshan

Regular Contributor
Posts: 161

Re: Extract a sentence containing a keyword

Seems like all I need to do is invest in learning the Perl regular expressions in SAS. Thanks, Ksharp. This works great!

Trusted Advisor
Posts: 1,231

Re: Extract a sentence containing a keyword

Posted in reply to caveman529

Hi,

The sentence you want contains only "expects", but your condition requires both "expects" and "anticipates".
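
The difference between requiring both keywords and requiring either one can be sketched with the find function used in the accepted solution. The dataset name check, the flag variables, and the sample sentences are made up for illustration.

```sas
data check;
  infile datalines truncover;
  length found $200;
  input found $char200.;
  /* find returns the position of the keyword, or 0 if it is absent */
  has_expects     = (find(found,'expects','i') > 0);
  has_anticipates = (find(found,'anticipates','i') > 0);
  /* "either" keeps a sentence with at least one keyword */
  either = (has_expects or has_anticipates);
  /* "both" keeps a sentence only when the two keywords appear together */
  both   = (has_expects and has_anticipates);
  datalines;
The company expects to spend less than $1.0 million.
The company expects growth and anticipates higher costs.
Capital expenditures were $83,000.
;
run;
```

Filtering on either corresponds to the sentence the original question asks for, while filtering on both would drop it, because that sentence contains only "expects".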
