05-09-2012 09:19 PM
I am trying to clean up a free-text field where our sales reps enter notes on how their sales calls went, what they discussed, etc. I'm able to remove most of the garbage, but some of our reps first write their comments in Word and then copy them into the text field. This brings over all kinds of RTF code that I want to omit from the data, and I can't think of a clean way of removing it all because sometimes it precedes their comments, sometimes it envelops them, etc.
Does anyone know if there are functions that handle RTF code? Here is what I have so far:
Within SAS Text Miner, the Text Import node (or the %tmfilter macro) can strip out any RTF-specific code and then output a copy of the document as a simple text file. I would try this as the standard way to create a text document from an RTF document.
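
If Text Miner is not available, a regular-expression pass is one possible fallback for RTF fragments that precede or surround a comment. The sketch below is in Python purely to illustrate the patterns; `strip_rtf_markup` is a hypothetical helper, not a SAS or Text Miner function. It assumes the pasted markup uses the Windows-1252 code page and that destination groups such as the font table nest at most one level deep. It is not a full RTF parser, and it will also mangle legitimate backslashes in the comments (for example Windows file paths). The same Perl-style patterns could probably be adapted to PRXCHANGE in a DATA step, but that is untested here.

```python
import re

# Hypothetical helper for illustration only; not part of SAS or Text Miner.

# Header groups (font table, color table, ...) whose contents are never comment text.
# Assumption: these groups contain at most one level of nested braces.
_DESTINATION = re.compile(
    r"\{\\(?:fonttbl|colortbl|stylesheet|info)(?:[^{}]|\{[^{}]*\})*\}"
)
# Hex escapes such as \'e9 (assumed to be Windows-1252 bytes).
_HEX_ESCAPE = re.compile(r"\\'([0-9a-fA-F]{2})")
# Control words that stand for whitespace in the visible text.
_BREAK = re.compile(r"\\(?:par|line|tab)\b ?")
# Any other control word: backslash, letters, optional numeric argument, optional space.
_CONTROL_WORD = re.compile(r"\\[a-zA-Z]+-?\d* ?")


def strip_rtf_markup(text):
    """Return text with common RTF markup removed (best effort)."""
    text = _DESTINATION.sub("", text)
    text = _HEX_ESCAPE.sub(
        lambda m: bytes.fromhex(m.group(1)).decode("cp1252", errors="replace"),
        text,
    )
    text = _BREAK.sub(" ", text)
    text = _CONTROL_WORD.sub("", text)
    text = re.sub(r"[{}]", "", text)          # drop group delimiters
    return re.sub(r"\s+", " ", text).strip()  # collapse leftover whitespace


if __name__ == "__main__":
    sample = (r"{\rtf1\ansi{\fonttbl{\f0\fswiss Arial;}}"
              r"\f0\fs20 Met with buyer.\par Discussed pricing.}")
    print(strip_rtf_markup(sample))
```

Because each pattern is applied to the whole string, it does not matter whether the markup comes before the comment or wraps around it. Comments that contain no RTF pass through unchanged apart from whitespace normalization.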