Hi,

I have a large block of text that I am trying to extract sentences from. The sentences that I'm interested in extracting begin with the phrase "failed" and end with a period. However, sometimes the large text includes several instances of a "failed to" phrase, and the code I'm using now captures only the first instance rather than each one.

This is an example of the large block of text I am working with:

    Based on document review and interview, it was determined John failed to properly put away all materials used during construction. This could result in damage to the work place and possible injury to co-works. It was also noted that Dave failed to secure the ladder at the end of his shift. Additionally, Deborah failed to properly shut down her computer before leaving for the day.

Below is the code I have been using. Another possible start phrase is "it was determined" and another possible end phrase is "Findings", but I'm primarily concerned with extracting between "failed" and the first period.

    data test3;
        set test2;

        /* Position of the first occurrence of each phrase (0 if absent) */
        failed     = index(text,'failed');
        determined = index(text,'it was determined');
        findings   = index(text,'Findings');

        /* "Findings" is present: extract up to "Findings" */
        if findings ne 0 then do;
            if failed ne 0 then do;
                tmp = substr(text,failed+0);
                put tmp;
                pos2 = index(tmp,"Findings");
                Extract1 = substr(tmp,1,pos2-1);
                put Extract1;
            end;
            else if failed = 0 then do;
                tmp2 = substr(text,determined+0);
                put tmp2;
                pos4 = index(tmp2,"Findings");
                Extract2 = substr(tmp2,1,pos4-1);
                put Extract2;
            end;
        end;

        /* "Findings" is absent: extract up to the first period */
        if findings = 0 then do;
            if failed ne 0 then do;
                tmp = substr(text,failed+0);
                put tmp;
                pos2 = index(tmp,'.');
                Extract1 = substr(tmp,1,pos2-1);
                put Extract1;
            end;
            else if failed = 0 then do;
                tmp2 = substr(text,determined+0);
                put tmp2;
                pos4 = index(tmp2,'.');
                Extract2 = substr(tmp2,1,pos4-1);
                put Extract2;
            end;
        end;

        keep text Extract1 Extract2;
    run;

Thank you for any help!
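
For reference, here is a minimal sketch of one possible way to capture every "failed ... period" instance instead of only the first. It assumes the same input data set test2 and variable text as in the code above. The variable names start, period, instance and Extract, and the $500 length, are illustrative choices, not part of the original code. It handles only the "failed" to first-period case and writes one output row per match, so rows with no "failed" produce no output.

```sas
data test3;
    set test2;
    length Extract $500;   /* assumed maximum sentence length */

    start    = 1;          /* position where the next search begins */
    instance = 0;          /* running count of matches in this row  */

    do while (start <= length(text));
        /* Next occurrence of "failed" at or after start */
        failed = find(text, 'failed', start);
        if failed = 0 then leave;

        /* First period after that occurrence; if none, run to end of text */
        period = find(text, '.', failed);
        if period = 0 then period = length(text) + 1;

        /* Text from "failed" up to, but not including, the period */
        Extract  = substr(text, failed, period - failed);
        instance = instance + 1;
        output;            /* one row per extracted sentence */

        /* Resume searching just past the period */
        start = period + 1;
    end;

    keep text instance Extract;
run;
```

On the example text above, this should give three rows, one each for the John, Dave and Deborah sentences. A period inside an abbreviation (for example "Dr.") would end the extract early, so real data may need a stricter end condition.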