Hello, everyone
I have four variables whose values start with {\rtf1\... markup, and I need to find some keywords in them to generate a report. The contents look like this:
"{\rtf1\ansi\deff0\deftab720{\fonttbl{\f0\fswiss MS Sans Serif;}{\f1\froman\fcharset2 Symbol;}{\f2\froman Times New Roman;}{\f3\froman\fprq2 Times New Roman;}{\f4\fswiss MS Shell Dlg;}{\f5\froman Times New Roman;}{\f6\fswiss\fprq2 System;}}
{\colortbl\red0\green0\blue0;\red255\green0\blue0;}
\deflang1033\pard\plain\f5\fs20
}"
"{\rtf1\ansi\ansicpg1252\deff0\deftab720{\fonttbl{\f0\fswiss MS Sans Serif;}{\f1\froman\fcharset2 Symbol;}{\f2\froman Times New Roman;}{\f3\froman Times New Roman;}{\f4\fswiss\fprq2 System;}{\f5\froman\fprq2 Times New Roman;}}
{\colortbl\red0\green0\blue0;}
\deflang1033\pard\plain\f3\fs20
}"
"{\rtf1\ansi\ansicpg1252\deff0\deflang1033{\fonttbl{\f0\fnil\fcharset0 Times New Roman;}}
\viewkind4\uc1\pard\f0\fs20\par
}"
I do not know what these symbols mean. I need to convert this markup to plain text so that I can search for the keywords I need. Any suggestions or hints would be much appreciated! Thank you.
Looks like you've read in some Word RTF documents. How did you get there in the first place?
I would first try to extract just the text with non-SAS tools, and only then use SAS for further processing. How to do this depends on your environment.
You could, for example, use a VB script to extract the text. Apache Tika also does a really great job: https://tika.apache.org/download.html
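To illustrate the "extract the text first" idea, here is a minimal Python sketch of an RTF-to-text stripper. It is not what Tika does internally and it is not a full RTF parser. The function name `rtf_to_text`, the `SKIP_DESTINATIONS` list, and the sample string are all made up for illustration. It assumes Windows-1252 for `\'hh` escapes (the samples declare `\ansicpg1252`) and it ignores `\u` Unicode escapes.

```python
import re

# Destinations whose contents are metadata, not document text (illustrative, not exhaustive).
SKIP_DESTINATIONS = {"fonttbl", "colortbl", "stylesheet", "info", "pict",
                     "header", "footer", "generator"}

TOKEN = re.compile(
    r"\\([a-zA-Z]+)(-?\d+)? ?"   # control word, optional number, optional delimiting space
    r"|\\'([0-9a-fA-F]{2})"      # hex-escaped byte such as \'e9
    r"|\\(.)"                    # control symbol such as \{ \} \\ \*
    r"|([{}])"                   # group delimiters
    r"|([^\\{}]+)",              # run of plain text
    re.DOTALL,
)

def rtf_to_text(rtf):
    """Return the visible text of an RTF string (simplified sketch)."""
    out = []
    stack = []     # "skip" flags of the enclosing groups
    skip = False   # True while inside a metadata destination
    for m in TOKEN.finditer(rtf):
        word, _arg, hexbyte, symbol, brace, text = m.groups()
        if brace == "{":
            stack.append(skip)            # remember state for when the group closes
        elif brace == "}":
            skip = stack.pop() if stack else False
        elif word:
            if word in SKIP_DESTINATIONS:
                skip = True
            elif not skip and word in ("par", "line"):
                out.append("\n")
            elif not skip and word == "tab":
                out.append("\t")
        elif symbol:
            if symbol == "*":             # \* marks an ignorable destination
                skip = True
            elif not skip and symbol in "{}\\":
                out.append(symbol)        # escaped literal brace or backslash
        elif hexbyte:
            if not skip:
                # Assumption: bytes are Windows-1252 encoded.
                out.append(bytes.fromhex(hexbyte).decode("cp1252", errors="replace"))
        elif text and not skip:
            # Raw line breaks in RTF source are not part of the text.
            out.append(text.replace("\r", "").replace("\n", ""))
    return "".join(out)

# Made-up sample value; real data would come from the database column.
sample = (r"{\rtf1\ansi{\fonttbl{\f0 Arial;}}\f0\fs20 Patient reports \b mild\b0  pain."
          r"\par Follow-up in 2 weeks.\par}")
plain = rtf_to_text(sample)
has_keyword = "follow-up" in plain.lower()
```

Once the markup is gone, the keyword search is an ordinary substring or regex test on `plain`. Run through this sketch, the three values quoted in the question appear to contain only font and colour tables plus formatting commands, so they would come out empty or as a single line break.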
Hello, Patrick
I just use SAS with an ODBC connection to get the data (it's an Oracle database). My connection code is shown below:
libname exports oracle
    path = XXX
    dbprompt = no
    uid = &username.
    password = &pswd.
    schema = XXX
;
I also tried connecting to the data with Excel and Access, and I got the same text every time. I will check the link you provided soon. Thank you!
Oh... I see. So that's stored in a CLOB in Oracle. That's gonna be tricky.
I've never been in your situation, so I can't speak from experience. Just throwing out some thoughts:
- Everything I proposed in my last post assumed that you have direct access to the RTF document as a file, but that's not the case.
- You would need to read the CLOB into multiple rows in SAS, as a SAS character variable can hold at most 32,767 bytes. It's possible to do but needs some extra coding.
- There must be a reason that someone stores the RTFs in Oracle. If you're just after something like the number of hits for a search term, then maybe Oracle Text is available and you could run your queries in-database and just get the result back. I've never used Oracle Text, so I'm not sure whether or how it could be called from a remote SAS process.
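The "multiple rows" idea in the second bullet can be sketched as follows. This is a rough Python illustration rather than SAS or Oracle code. It assumes the long value has already been pulled out as a string and will be stored as UTF-8; the function name `split_for_sas` is hypothetical. Each piece is numbered so the rows can be put back in order later.

```python
SAS_MAX_CHAR_BYTES = 32767  # maximum length of a SAS character variable, in bytes

def split_for_sas(text, limit=SAS_MAX_CHAR_BYTES):
    """Yield (sequence_number, chunk) pairs, each chunk at most `limit` bytes in UTF-8."""
    data = text.encode("utf-8")
    seq, start = 1, 0
    while start < len(data):
        end = min(len(data), start + limit)
        # Do not cut inside a multi-byte character: back up while the byte
        # at the cut point is a UTF-8 continuation byte (bit pattern 10xxxxxx).
        while end < len(data) and (data[end] & 0xC0) == 0x80:
            end -= 1
        yield seq, data[start:end].decode("utf-8")
        seq, start = seq + 1, end

# One long value becomes several numbered rows.
rows = list(split_for_sas("a" * 70000))
row_lengths = [len(chunk) for _, chunk in rows]
```

A keyword can straddle the boundary between two rows, so the pieces should be rejoined, or overlapped by at least the length of the longest keyword, before searching.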
What I would try first:
Make things work directly in-database first (using SQL Developer and Oracle Text). Only once that works, try calling it from a SAS session.