How to extract the correct information from the rtf text

Contributor
Posts: 32

I have a text variable like the following. How can I extract the useful information that I need? Thanks.

 

{\rtf1\ansi\ansicpg1252\uc0\deff1{\fonttbl {\f0\fswiss\fcharset0\fprq2 Arial;} {\f1\froman\fcharset0\fprq2 Times New Roman;} {\f2\froman\fcharset2\fprq2 Symbol;}} {\colortbl;\red0\green0\blue0;\red255\green255\blue255;\red255\green0\blue0;} {\stylesheet{\s0\f1\fs20 Normal;}{\*\cs10\additive Default Paragraph Font;}{\s16\f0\fs24 [Normal];}{\s17\sa120\f1\fs20\sbasedon0 Body Text;}{\s18\qc\sb360\sa240\f0\fs20\b\sbasedon0\snext0 heading 1;}} {\*\listtable{\list\listtemplateid1\listsimple{\listlevel\levelnfc0\leveljc0\levelfollow0\levelstartat1\levelspace0\levelindent0{\leveltext\'02\'00\'2e;}{\levelnumbers\'01;}\f1\fs24\b0\i0\strike0\ul0\cf0\cb0\fi-360\li821}{\listname ;}\listid1}} {\*\listoverridetable{\listoverride\listid1\listoverridecount0\ls1}} {\*\generator TX_RTF32 9.1.312.500;} \deftab1134\paperw12240\paperh15840\margl2592\margt3888\margr1138\margb1138\pard\s17\sl480\slmult1\tx720\tx1440\tx2160\tx2880\tx3600\tx4320\tx5040\tx5760\tx6480\tx7200\tx7920\tx8640\tx9360\tx10080\plain\f1\fs24\cf3 <CERTIFIED MAIL #> <HAND DELIVERY> \par\pard\s17\tx720\tx1440\tx2160\tx2880\tx3600\tx4320\tx5040\tx5760\tx6480\tx7200\tx7920\tx8640\tx9360\tx10080\plain\f1\fs24 [Today in Words()]\par\par [Admin. 1st Name (Frank)()] [Admin. Last name (Smith)()], Administrator\parhe facility received an \loch\f1\hich\f1 \'93A\'94 during its survey,...

 

Super User
Posts: 11,343

Re: How to extract the correct information from the rtf text

If you mean the content other than the RTF codes, the best approach might be to save the file as plain text, which removes all of the formatting and appearance information.

Then read the resulting text file.

Contributor
Posts: 32

Re: How to extract the correct information from the rtf text

Thanks. I saved it as .rtf (a Word document). However, I do not know how to remove those strange symbols and letters. When I open the file, the RTF codes are still there. Any suggestions will be greatly appreciated.

Super User
Posts: 19,877

Re: How to extract the correct information from the rtf text

Save as txt instead of RTF. 

Super User
Posts: 11,343

Re: How to extract the correct information from the rtf text

RTF is not plain text, as you can see.

Contributor
Posts: 32

Re: How to extract the correct information from the rtf text

Thanks. I tried to save it as a .txt file. When I open it again, it shows exactly what I saved there. I am wondering whether I need some code to extract the right information, or anything else?

Super User
Posts: 19,877

Re: How to extract the correct information from the rtf text

You opened the file in Word and then saved it as a txt file? That should have stripped the RTF tags.
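
If re-saving from Word is not practical (for example, when the RTF sits in a text variable rather than in a file), the codes can also be stripped programmatically. Below is a minimal sketch in Python, not SAS, and not a full RTF parser. The function name `strip_rtf`, the `SKIP_DESTINATIONS` list, and the default `cp1252` code page (taken from the `\ansicpg1252` in the posted text) are assumptions made for illustration. It drops control words, skips metadata groups such as the font table and stylesheet, turns `\par` into a line break, and decodes `\'hh` escapes.

```python
import re

# Groups that hold metadata rather than body text (a partial, assumed list).
SKIP_DESTINATIONS = {
    "fonttbl", "colortbl", "stylesheet", "info",
    "listtable", "listoverridetable", "generator", "pict",
}

# Control symbols that stand for a literal character.
SYMBOLS = {"\\": "\\", "{": "{", "}": "}", "~": "\u00a0"}

TOKEN = re.compile(
    r"\\([a-zA-Z]+)(-?\d+)? ?"   # control word, optional number, optional delimiter space
    r"|\\'([0-9a-fA-F]{2})"      # hex-escaped byte such as \'93
    r"|\\([^a-zA-Z])"            # control symbol such as \* or \~
    r"|([{}])"                   # group open / close
    r"|[\r\n]+"                  # raw line breaks carry no meaning in RTF
    r"|(.)"                      # any other character is body text
)


def strip_rtf(rtf, codepage="cp1252"):
    """Return the visible text of an RTF string (simplified sketch)."""
    out = []
    stack = []        # skip state saved at each "{"
    skipping = False  # True while inside a metadata group
    for match in TOKEN.finditer(rtf):
        word, _param, hexbyte, symbol, brace, char = match.groups()
        if brace == "{":
            stack.append(skipping)
        elif brace == "}":
            if stack:
                skipping = stack.pop()
        elif symbol is not None:
            if symbol == "*":
                # \* marks an optional destination; ignore the whole group
                skipping = True
            elif not skipping:
                out.append(SYMBOLS.get(symbol, ""))
        elif word is not None:
            if word in SKIP_DESTINATIONS:
                skipping = True
            elif not skipping:
                if word in ("par", "line"):
                    out.append("\n")
                elif word == "tab":
                    out.append("\t")
        elif hexbyte is not None:
            if not skipping:
                out.append(bytes.fromhex(hexbyte).decode(codepage, errors="replace"))
        elif char is not None and not skipping:
            out.append(char)
    return "".join(out)
```

Calling `strip_rtf` on the posted value should leave only the letter text, with placeholders such as `[Today in Words()]` intact. The cleaned result can then be written to a plain text file and read back into SAS. Unicode `\u` escapes and embedded objects are not handled in this sketch.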
