Text mining and content categorization

Extract paragraphs based on certain keywords from a Word document

New Contributor
Posts: 2

Hi,

I have a 100-page Word document in which each heading is highlighted in bold and underlined. I want to extract the paragraphs that contain certain keywords, along with the heading each paragraph falls under.

Frequent Contributor
Posts: 130

Re: Extract paragraphs based on certain keywords from a Word document

Break the single document up into several hundred files in a folder, with just one paragraph per file. Import that corpus with the Text Import node in SAS Enterprise Miner (Text Miner), then process it with a Text Topic node, creating one or more user-defined topics based on your keywords. The exported SAS data set will have a row for each paragraph, with an interval-measure topic score and a binary presence/absence flag added for each topic.
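The splitting step could be done outside SAS. Here is a minimal Python sketch, not a tested production script. It assumes the file is a .docx, and that headings are bold and underlined through direct run formatting rather than a named style. It reads the document XML with the standard library only. The function names (read_paragraphs, split_by_heading, write_paragraph_files) and the para_NNNN.txt file naming are made up for illustration. Each output file holds one paragraph with its heading on the first line, and an optional keyword list keeps only the matching paragraphs.

```python
import zipfile
import xml.etree.ElementTree as ET
from pathlib import Path

# WordprocessingML namespace used inside word/document.xml
W = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def _is_on(run, tag):
    """True if the run's direct formatting switches on `tag` ('b' or 'u')."""
    rpr = run.find(W + "rPr")
    if rpr is None:
        return False
    el = rpr.find(W + tag)
    if el is None:
        return False
    return el.get(W + "val") not in ("0", "false", "none")


def read_paragraphs(docx_path):
    """Yield (text, is_heading) for every non-empty paragraph.

    Assumption: a heading is a paragraph whose text-bearing runs are all
    bold and underlined. Formatting inherited from styles is not detected.
    """
    with zipfile.ZipFile(docx_path) as zf:
        root = ET.fromstring(zf.read("word/document.xml"))
    for p in root.iter(W + "p"):
        runs = []
        for r in p.iter(W + "r"):
            run_text = "".join(t.text or "" for t in r.iter(W + "t"))
            if run_text.strip():
                runs.append((r, run_text))
        text = "".join(run_text for _, run_text in runs).strip()
        if not text:
            continue
        is_heading = all(_is_on(r, "b") and _is_on(r, "u") for r, _ in runs)
        yield text, is_heading


def split_by_heading(docx_path):
    """Return (heading, paragraph) pairs; heading is '' before the first one."""
    heading = ""
    pairs = []
    for text, is_heading in read_paragraphs(docx_path):
        if is_heading:
            heading = text
        else:
            pairs.append((heading, text))
    return pairs


def write_paragraph_files(docx_path, out_dir, keywords=None):
    """Write one text file per paragraph; filter by keywords if given."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    wanted = [k.lower() for k in keywords] if keywords else None
    written = []
    for i, (heading, para) in enumerate(split_by_heading(docx_path), start=1):
        if wanted is not None and not any(k in para.lower() for k in wanted):
            continue
        path = out / f"para_{i:04d}.txt"
        path.write_text(f"{heading}\n\n{para}\n", encoding="utf-8")
        written.append(path)
    return written
```

Calling write_paragraph_files("report.docx", "corpus") with no keywords would produce the one-paragraph-per-file folder for the Text Import node. Passing a keyword list does a plain case-insensitive substring match, which is much cruder than the topic scores from the Text Topic node.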