sateh
Fluorite | Level 6

How can I create a concept that extracts the first 15 words of each paragraph of a document in SAS Visual Text Analytics?

2 REPLIES
HarrySnart
SAS Employee

Hi @sateh ,

 

It would help if you gave some more information about the structure of the text. Depending on how complex the data is, this simple example may give you a useful starting point:

 

[Image: HarrySnart_0-1674747409619.png]

 

Thanks

Harry

HarrySnart
SAS Employee
Note that here I'm using the LITI syntax to work with sentences only. When using the sandbox rules, you can't use the PARA structure in a CONCEPT_RULE. If you have multiple sentences in each paragraph, you may want to nest the sentence logic inside the paragraph logic. Likewise, my simple example has no punctuation, so you may want to pre-process the text to remove punctuation or write something like a REGEX rule.
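If pre-processing outside Visual Text Analytics is an option, the punctuation removal and the 15-word cut could be sketched in Python along these lines. This is not LITI and not the rule from the example above. It is only a rough illustration, and it assumes that paragraphs are separated by blank lines. The function name `first_words_per_paragraph` is made up for this sketch.

```python
import re


def first_words_per_paragraph(text, n=15):
    """Return the first n words of each non-empty paragraph in text."""
    # Assumption: paragraphs are separated by one or more blank lines.
    paragraphs = re.split(r"\n\s*\n", text)
    results = []
    for para in paragraphs:
        # Replace punctuation with spaces so only word characters remain.
        # Note: this also splits contractions, e.g. "don't" becomes "don" and "t".
        cleaned = re.sub(r"[^\w\s]", " ", para)
        words = cleaned.split()
        if words:
            results.append(" ".join(words[:n]))
    return results


sample = (
    "The quick brown fox jumps over the lazy dog. It ran away!\n"
    "\n"
    "Second paragraph, with commas; and more."
)
print(first_words_per_paragraph(sample, n=5))
# ['The quick brown fox jumps', 'Second paragraph with commas and']
```

The cleaned output could then be loaded as the document text, so that a simple sentence-level rule no longer has to deal with punctuation.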