<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: How to remove RTF-formatting in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/How-to-remove-RTF-formatting/m-p/474414#M121902</link>
    <description>&lt;P&gt;You really are not going to get anywhere with this, I am afraid. There are no simple methods for parsing an RTF file into something usable. Many have tried, I have tried, and all got some way and then gave up.&lt;/P&gt;
&lt;P&gt;The file itself is a markup language, and there are loads of tags:&lt;/P&gt;
&lt;P&gt;&lt;A href="https://www.microsoft.com/en-us/download/details.aspx?id=10725" target="_blank"&gt;https://www.microsoft.com/en-us/download/details.aspx?id=10725&lt;/A&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;That is the latest spec. You could write a parser that takes each tag and finds its closing tag (if there is one), and Perl may help somewhat, but it is a big undertaking. Even output generated from SAS, which is fairly basic as RTF goes, can differ considerably between systems.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I have also looked at what &lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/11562"&gt;@Kurt_Bremser&lt;/a&gt; mentioned, using another program to convert to another file format, and there are ways to get HTML or text output. However, unless it's a very simple file, even that really isn't much help. Tabular output, for instance, which a lot of SAS output is, doesn't carry any indication of position. RTF is literally one page at a time, cell by cell. So you first need to parse the header blocks, then go page by page, extract the information, and then put it all together.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I would go back to the source data; it is the best and, with limited time, the only feasible method.&lt;/P&gt;</description>
    <pubDate>Fri, 29 Jun 2018 12:38:30 GMT</pubDate>
    <dc:creator>RW9</dc:creator>
    <dc:date>2018-06-29T12:38:30Z</dc:date>
    <item>
      <title>How to remove RTF-formatting</title>
      <link>https://communities.sas.com/t5/SAS-Programming/How-to-remove-RTF-formatting/m-p/474398#M121892</link>
      <description>&lt;P&gt;Hi all&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I am working on SAS 9.4 M3 (SYSVLONG4 = 9.04.01M3P06242015) and have encountered input data that contains RTF formatting.&lt;/P&gt;
&lt;P&gt;It comes in quite a few different structures, as it can originate through several different channels, so as far as I can see there are no quick and dirty fixes.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;My data can look something like this:&lt;/P&gt;
&lt;P&gt;{\rtf1\fbidis\ansi\deff0{\fonttbl{\f0\fswiss\fprq2\fcharset0 Arial;}{\f1\fswiss\fprq2\fcharset0 Calibri;}}&lt;BR /&gt;{\colortbl ;\red0\green0\blue0;}&lt;BR /&gt;\viewkind4\uc1 d\ltrpar\cf1\lang1030\f0\fs22 *REAL TEXT* &lt;BR /&gt; d\ltrpar\sa200\sl276\slmult1\cf0\f1 *REAL TEXT*&lt;BR /&gt; d\ltrpar\cf1\f0 &lt;BR /&gt;}&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Where *REAL TEXT* indicates what I am really interested in.&lt;BR /&gt;&lt;BR /&gt;Are any of you familiar with a SAS function (e.g. user-written) or SAS macro that can actually strip the RTF formatting?&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Best,&lt;/P&gt;
&lt;P&gt;Sander Ehmsen, Denmark.&lt;/P&gt;</description>
      <pubDate>Fri, 29 Jun 2018 11:12:51 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/How-to-remove-RTF-formatting/m-p/474398#M121892</guid>
      <dc:creator>SanderEhmsen</dc:creator>
      <dc:date>2018-06-29T11:12:51Z</dc:date>
    </item>
    <item>
      <title>Re: How to remove RTF-formatting</title>
      <link>https://communities.sas.com/t5/SAS-Programming/How-to-remove-RTF-formatting/m-p/474399#M121893</link>
      <description>&lt;P&gt;RTF is an output format; I would refuse to write a program that parses RTF.&lt;/P&gt;
&lt;P&gt;It might be possible to remove the formatting with some regular expressions, but I don't know enough about RTF to suggest something that actually does the job.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Does the interesting part always start after the first blank?&lt;/P&gt;</description>
      <pubDate>Fri, 29 Jun 2018 11:23:00 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/How-to-remove-RTF-formatting/m-p/474399#M121893</guid>
      <dc:creator>andreas_lds</dc:creator>
      <dc:date>2018-06-29T11:23:00Z</dc:date>
    </item>
    <item>
      <title>Re: How to remove RTF-formatting</title>
      <link>https://communities.sas.com/t5/SAS-Programming/How-to-remove-RTF-formatting/m-p/474403#M121895</link>
      <description>&lt;P&gt;Depending on your environment, use VBA (Windows with MS Office) or shell scripting with OpenOffice (all open platforms) to load the RTF file and save it as .txt.&lt;/P&gt;</description>
      <pubDate>Fri, 29 Jun 2018 11:39:37 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/How-to-remove-RTF-formatting/m-p/474403#M121895</guid>
      <dc:creator>Kurt_Bremser</dc:creator>
      <dc:date>2018-06-29T11:39:37Z</dc:date>
    </item>
    <item>
      <title>Re: How to remove RTF-formatting</title>
      <link>https://communities.sas.com/t5/SAS-Programming/How-to-remove-RTF-formatting/m-p/474414#M121902</link>
      <description>&lt;P&gt;You really are not going to get anywhere with this, I am afraid. There are no simple methods for parsing an RTF file into something usable. Many have tried, I have tried, and all got some way and then gave up.&lt;/P&gt;
&lt;P&gt;The file itself is a markup language, and there are loads of tags:&lt;/P&gt;
&lt;P&gt;&lt;A href="https://www.microsoft.com/en-us/download/details.aspx?id=10725" target="_blank"&gt;https://www.microsoft.com/en-us/download/details.aspx?id=10725&lt;/A&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;That is the latest spec. You could write a parser that takes each tag and finds its closing tag (if there is one), and Perl may help somewhat, but it is a big undertaking. Even output generated from SAS, which is fairly basic as RTF goes, can differ considerably between systems.&lt;/P&gt;
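&lt;P&gt;As a rough illustration of what such a parser involves, here is a minimal sketch in Python (not SAS, and nowhere near a full implementation of the spec). It assumes simple, well-formed input in the Windows-1252 code page. The names strip_rtf, TOKEN and SKIPPED_DESTINATIONS, the list of skipped destinations, and the sample string are all made up for this illustration. It does not handle \u Unicode escapes, tables, fields or embedded objects, which is where the real effort goes.&lt;/P&gt;

```python
import re

# Assumption: destinations whose contents are tables or metadata, not text.
SKIPPED_DESTINATIONS = {"fonttbl", "colortbl", "stylesheet", "info", "pict"}

# One token: control word, hex escape, control symbol, brace, or plain text.
TOKEN = re.compile(
    r"\\([a-zA-Z]+)(-?\d+)? ?"   # 1, 2: control word, optional numeric parameter
    r"|\\'([0-9a-fA-F]{2})"      # 3: hex-escaped character such as \'f8
    r"|\\([^a-zA-Z'])"           # 4: control symbol such as \{ or \*
    r"|([{}])"                   # 5: group open or close
    r"|([^\\{}]+)"               # 6: run of plain text
)


def strip_rtf(rtf):
    """Return the visible text of a simple RTF string (best effort only)."""
    out = []
    skip_stack = []    # one saved flag per open group
    skipping = False   # True while inside a skipped destination
    for match in TOKEN.finditer(rtf):
        word, _param, hexcode, symbol, brace, text = match.groups()
        if brace == "{":
            skip_stack.append(skipping)      # remember the enclosing state
        elif brace == "}":
            if skip_stack:
                skipping = skip_stack.pop()  # restore the enclosing state
        elif skipping:
            continue
        elif word:
            if word in SKIPPED_DESTINATIONS:
                skipping = True
            elif word in ("par", "line"):
                out.append("\n")
            elif word == "tab":
                out.append("\t")
            # every other control word is treated as formatting and dropped
        elif hexcode:
            # Assumption: the document uses the Windows-1252 code page.
            out.append(bytes.fromhex(hexcode).decode("cp1252", errors="replace"))
        elif symbol:
            if symbol == "*":
                skipping = True              # ignorable destination marker
            elif symbol in "\\{}":
                out.append(symbol)           # escaped literal character
            elif symbol == "~":
                out.append(" ")              # non-breaking space
        elif text:
            # raw line breaks in the file are not part of the text
            out.append(text.replace("\r", "").replace("\n", ""))
    return "".join(out)


# Hypothetical sample, loosely modelled on the data in the question.
sample = (
    r"{\rtf1\ansi\deff0{\fonttbl{\f0\fswiss\fcharset0 Arial;}}"
    r"{\colortbl ;\red0\green0\blue0;}"
    r"\pard\cf1\f0\fs22 First line\par "
    r"\pard Second \{line\}\par }"
)
print(strip_rtf(sample))
```

&lt;P&gt;On the sample this leaves the two lines of text and nothing else. Anything beyond this kind of input (tables, page headers, output from different producers) needs far more handling, which is why I call it a big undertaking.&lt;/P&gt;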
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I have also looked at what &lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/11562"&gt;@Kurt_Bremser&lt;/a&gt; mentioned, using another program to convert to another file format, and there are ways to get HTML or text output. However, unless it's a very simple file, even that really isn't much help. Tabular output, for instance, which a lot of SAS output is, doesn't carry any indication of position. RTF is literally one page at a time, cell by cell. So you first need to parse the header blocks, then go page by page, extract the information, and then put it all together.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I would go back to the source data; it is the best and, with limited time, the only feasible method.&lt;/P&gt;</description>
      <pubDate>Fri, 29 Jun 2018 12:38:30 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/How-to-remove-RTF-formatting/m-p/474414#M121902</guid>
      <dc:creator>RW9</dc:creator>
      <dc:date>2018-06-29T12:38:30Z</dc:date>
    </item>
    <item>
      <title>Re: How to remove RTF-formatting</title>
      <link>https://communities.sas.com/t5/SAS-Programming/How-to-remove-RTF-formatting/m-p/474820#M122059</link>
      <description>&lt;P&gt;Thank you for your suggestion.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Our SAS will soon run on a Linux platform, and my Data Custodians have refused to implement an RTF parser on that platform.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;So according to them it is not feasible.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Best,&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Sander.&lt;/P&gt;</description>
      <pubDate>Mon, 02 Jul 2018 06:34:50 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/How-to-remove-RTF-formatting/m-p/474820#M122059</guid>
      <dc:creator>SanderEhmsen</dc:creator>
      <dc:date>2018-07-02T06:34:50Z</dc:date>
    </item>
    <item>
      <title>Re: How to remove RTF-formatting</title>
      <link>https://communities.sas.com/t5/SAS-Programming/How-to-remove-RTF-formatting/m-p/474821#M122060</link>
      <description>Refusing is probably not an acceptable solution here. I have tried to manually find patterns in the RTF code, like finding the first blank. I can get something like 90% right by this method, but the last 10% ends up miserably :-).</description>
      <pubDate>Mon, 02 Jul 2018 06:36:30 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/How-to-remove-RTF-formatting/m-p/474821#M122060</guid>
      <dc:creator>SanderEhmsen</dc:creator>
      <dc:date>2018-07-02T06:36:30Z</dc:date>
    </item>
    <item>
      <title>Re: How to remove RTF-formatting</title>
      <link>https://communities.sas.com/t5/SAS-Programming/How-to-remove-RTF-formatting/m-p/474822#M122061</link>
      <description>Thank you very much for your reply. &lt;BR /&gt;&lt;BR /&gt;I have contacted my data provider, and maybe they can strip it at their end. &lt;BR /&gt;&lt;BR /&gt;I might use some tranwrd() calls to remove the most common RTF codes. It will not get me all the way, but it might be better for my end users.</description>
      <pubDate>Mon, 02 Jul 2018 06:41:33 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/How-to-remove-RTF-formatting/m-p/474822#M122061</guid>
      <dc:creator>SanderEhmsen</dc:creator>
      <dc:date>2018-07-02T06:41:33Z</dc:date>
    </item>
    <item>
      <title>Re: How to remove RTF-formatting</title>
      <link>https://communities.sas.com/t5/SAS-Programming/How-to-remove-RTF-formatting/m-p/474823#M122062</link>
      <description>&lt;BLOCKQUOTE&gt;&lt;HR /&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/2293"&gt;@SanderEhmsen&lt;/a&gt;&amp;nbsp;wrote:&lt;BR /&gt;
&lt;P&gt;Thank you for your suggestion.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Our SAS will soon run on a Linux platform, and my Data Custodians have refused to implement an RTF parser on that platform.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;So according to them it is not feasible.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Best,&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Sander.&lt;/P&gt;
&lt;HR /&gt;&lt;/BLOCKQUOTE&gt;
&lt;P&gt;Tell them to look up "Mordac the Preventer".&lt;/P&gt;</description>
      <pubDate>Mon, 02 Jul 2018 06:49:35 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/How-to-remove-RTF-formatting/m-p/474823#M122062</guid>
      <dc:creator>Kurt_Bremser</dc:creator>
      <dc:date>2018-07-02T06:49:35Z</dc:date>
    </item>
    <item>
      <title>Re: How to remove RTF-formatting</title>
      <link>https://communities.sas.com/t5/SAS-Programming/How-to-remove-RTF-formatting/m-p/474829#M122063</link>
      <description>&lt;P&gt;Yes, they must have some raw data they used to generate the RTF, so that is the best method.&lt;/P&gt;</description>
      <pubDate>Mon, 02 Jul 2018 07:27:08 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/How-to-remove-RTF-formatting/m-p/474829#M122063</guid>
      <dc:creator>RW9</dc:creator>
      <dc:date>2018-07-02T07:27:08Z</dc:date>
    </item>
  </channel>
</rss>

