BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
sam369
Obsidian | Level 7


Hi ALL,

I have Qucik Question, Like i  have final output in RTF. So what i want to do is retrive that data into sas data set?

For Example I have a data in My Rtf Is like this

   ATCTERM/PTERMS                                                                                   trt1              trt2                total 

Subjects with >=1 Concomitant Med                                                        100 ( 99.6)      900 ( 98.8)     1000 ( 99.2)                    

                                                                                                                                         

3-OXOANDROSTEN (4) DERIVATIVES                                                        8 (  0.8)        1 (  0.2)             9 (  0.5)                 TESTOSTERONE                                                                                      4 (  0.8)        7 (  0.2)              5 (  0.5)                      

                                                                                                                                         

ACE INHIBITORS AND CALCIUM CHANNEL BLOCKERS                            7(  1.2)        2 (  0.4)              9 (  0.8)                  

   AMLODIPINE BESYLATE W/BENAZEPRIL HYDROCHLOR.                     9 (  1.2)        2 (  0.4)              8 (  0.8)                 

                                                                                                                                         

ACE INHIBITORS AND DIURETICS                                                            63(  1.2)        0 (  0.0)               63 (  0.6)                   

   BI PREDONIUM                                                                                    2 7(  0.4)        0 (  0.0)                2 (  0.2)                     

   CIBADREX                                                                                           10(  0.2)        0 (  0.0)                 1 (  0.1)                      

   SALUTEC                                                                                            14 (  0.2)        0 (  0.0)                 1 (  0.1)                      

   ZESTORETIC                                                                                         2 (  0.4)        0 (  0.0)                 2 (  0.2)                     

                                                                                                                                         

ACE INHIBITORS, PLAIN                                                                      104 ( 20.6)      115 ( 22.7)           219 ( 21.6)                      

   BENAZEPRIL                                                                                       8 (  1.6)        6 (  1.2)                14 (  1.4)                      

   CAPTOPRIL                                                                                         5 (  1.0)        2 (  0.4)                  7 (  0.7)                       

   CILAZAPRIL                                                                                         1 (  0.2)        0 (  0.0)                  1 (  0.1)                       

   ENALAPRIL                                                                                       20 (  4.0)       32 (  6.3)                52 (  5.1)     

Thanks in Advance to you All

Thanks

Sam

1 ACCEPTED SOLUTION

Accepted Solutions
MikeZdeb
Rhodochrosite | Level 12

hi ... Tom's suggestion is a very good one

if you want to leave it as an RTF file, you could try the following (you did not give much guidance as to what you meant about the contents of the eventual data set, so I made some guesses using the attached RTF file) ... since I know next to nothing about the PRX functions, there are likely less cumbersome ways to extract the data ...


* use DDE to place the RTF file in the clipboard;

filename word DDE 'winword|system' notab;

data _null_;

file word;

put '[FileOpen .Name = "' "z:\test.rtf" '"]';

put "[EditSelectAll]";

put "[EditCopy]";

put '[FileClose]';

run;

filename x clipbrd;

* read clipboard contents with a data step;

data x;

length info $50;

rc = prxparse("/ \d+/");

infile x end=done;

do until(done);

  input;

  if find(_infile_,'(' ) then do;

  trt1a = input(left(scan(_infile_,-6,'() ')),12.);

  trt1b = input(left(scan(_infile_,-5,'() ')),12.);

  trt2a = input(left(scan(_infile_,-4,'() ')),12.);

  trt2b = input(left(scan(_infile_,-3,'() ')),12.);

  tota  = input(left(scan(_infile_,-2,'() ')),12.);

  totb  = input(left(scan(_infile_,-1,'() ')),12.);

  call prxsubstr(rc,_infile_,start);

  info = left(substrn(_infile_,1,start));

  output;

  end;

end;

drop start rc;

run;


filename x clear;


info                                            trt1a    trt1b    trt2a    trt2b    tota    totb

Subjects with >=1 Concomitant Med                100      99.6    900      98.8    1000    99.2

3-OXOANDROSTEN (4) DERIVATIVES                    8      0.8      1      0.2      9    0.5

TESTOSTERONE                                      4      0.8      7      0.2      5    0.5

ACE INHIBITORS AND CALCIUM CHANNEL BLOCKERS        7      1.2      2      0.4      9    0.8

AMLODIPINE BESYLATE W/BENAZEPRIL HYDROCHLOR.      9      1.2      2      0.4      8    0.8

ACE INHIBITORS AND DIURETICS                      63      1.2      0      0.0      63    0.6

BI PREDONIUM                                      27      0.4      0      0.0      2    0.2

CIBADREX                                          10      0.2      0      0.0      1    0.1

SALUTEC                                          14      0.2      0      0.0      1    0.1

ZESTORETIC                                        2      0.4      0      0.0      2    0.2

ACE INHIBITORS, PLAIN                            104      20.6    115      22.7    219    21.6

BENAZEPRIL                                        8      1.6      6      1.2      14    1.4

CAPTOPRIL                                          5      1.0      2      0.4      7    0.7

CILAZAPRIL                                        1      0.2      0      0.0      1    0.1

ENALAPRIL                                        20      4.0      32      6.3      52    5.1


View solution in original post

13 REPLIES 13
Tom
Super User Tom
Super User

Save the RTF file as a normal text file then read the data using a DATA step.

Cynthia_sas
SAS Super FREQ

Hi:

Tom is right, unless you REALLY understand the RTF control strings, it will be almost impossible to read the RTF file directly with SAS. My other idea is to open the RTF file in Word, then cut the table from Word and paste into Excel and then save as an Excel table  and read the Excel version of the table into SAS.

However, in another career, I worked for lawyers. Let me share with you what one of my attorneys would have said. That file is not the original source data. That file is a constructed REPORT based on detailed data. If you read the REPORT into a SAS dataset, then you have saved the report information into a SAS file -- but you could not go to court and show EXACTLY how you calculated the summary report from the detail data -- you could only say, I had this report -- I do not have the original source data. Then, everything you did, moving forward, from what you read out of the RTF file is suspect, because you can never replicate that original report from the original data. You can never answer a question like: How would the results be different if you eliminated the people with asthma from the TRT1 group. You don't know from your RTF file who actually might have had asthma in the TRT1 group.

Just another perspective.

cynthia

MikeZdeb
Rhodochrosite | Level 12

hi ... Tom's suggestion is a very good one

if you want to leave it as an RTF file, you could try the following (you did not give much guidance as to what you meant about the contents of the eventual data set, so I made some guesses using the attached RTF file) ... since I know next to nothing about the PRX functions, there are likely less cumbersome ways to extract the data ...


* use DDE to place the RTF file in the clipboard;

filename word DDE 'winword|system' notab;

data _null_;

file word;

put '[FileOpen .Name = "' "z:\test.rtf" '"]';

put "[EditSelectAll]";

put "[EditCopy]";

put '[FileClose]';

run;

filename x clipbrd;

* read clipboard contents with a data step;

data x;

length info $50;

rc = prxparse("/ \d+/");

infile x end=done;

do until(done);

  input;

  if find(_infile_,'(' ) then do;

  trt1a = input(left(scan(_infile_,-6,'() ')),12.);

  trt1b = input(left(scan(_infile_,-5,'() ')),12.);

  trt2a = input(left(scan(_infile_,-4,'() ')),12.);

  trt2b = input(left(scan(_infile_,-3,'() ')),12.);

  tota  = input(left(scan(_infile_,-2,'() ')),12.);

  totb  = input(left(scan(_infile_,-1,'() ')),12.);

  call prxsubstr(rc,_infile_,start);

  info = left(substrn(_infile_,1,start));

  output;

  end;

end;

drop start rc;

run;


filename x clear;


info                                            trt1a    trt1b    trt2a    trt2b    tota    totb

Subjects with >=1 Concomitant Med                100      99.6    900      98.8    1000    99.2

3-OXOANDROSTEN (4) DERIVATIVES                    8      0.8      1      0.2      9    0.5

TESTOSTERONE                                      4      0.8      7      0.2      5    0.5

ACE INHIBITORS AND CALCIUM CHANNEL BLOCKERS        7      1.2      2      0.4      9    0.8

AMLODIPINE BESYLATE W/BENAZEPRIL HYDROCHLOR.      9      1.2      2      0.4      8    0.8

ACE INHIBITORS AND DIURETICS                      63      1.2      0      0.0      63    0.6

BI PREDONIUM                                      27      0.4      0      0.0      2    0.2

CIBADREX                                          10      0.2      0      0.0      1    0.1

SALUTEC                                          14      0.2      0      0.0      1    0.1

ZESTORETIC                                        2      0.4      0      0.0      2    0.2

ACE INHIBITORS, PLAIN                            104      20.6    115      22.7    219    21.6

BENAZEPRIL                                        8      1.6      6      1.2      14    1.4

CAPTOPRIL                                          5      1.0      2      0.4      7    0.7

CILAZAPRIL                                        1      0.2      0      0.0      1    0.1

ENALAPRIL                                        20      4.0      32      6.3      52    5.1


PGStats
Opal | Level 21

Hi Mike, This approach is new to me. I guess the idea is to get Word to do the translation from RFT to text for you as one of the versions that it will save on the clipboard. Then SAS CLIPBRD engine accesses the text version. Right?

PG

PG
MikeZdeb
Rhodochrosite | Level 12

Hi ... that's correct, figured it was an easy way to do RTF-2-text. Once it's text, then you get to use INPUT; and the contents of _INFILE_.  I have no idea as to how efficient this is or what  the limitations of SELECT/COPY might be (but puzzle solving is fun).

PGStats
Opal | Level 21

Clever as always! Thanks for sharing. - PG

PG
Ksharp
Super User

Mike ,

I want to know where you get it from ?

From Art.T 's paper at SGF2012 ?

Just Guess .

Ksharp

MikeZdeb
Rhodochrosite | Level 12

hi ... no, just from ...


http://support.sas.com/documentation/cdl/en/lrdict/64316/HTML/default/viewer.htm#a002571877.htm

and from working with DDE and SAS with Excel and Word in the past


in searching around today,  I see another good paper ... "Importing Data from Microsoft Word into SAS"

http://www.pharmasug.org/download/papers/CC18.pdf

with a mention of ...

data _null_;

      file word;

      put '[FileOpen .Name = "' "C:\PharmaSUG2009\Example.doc" '"]';

      put "[EditSelectAll]";

      put "[EditCopy]";

      put '[FileClose]';

   run;

but no discussion of the clipboard access method (rather, it's use SAS to cut from Word, paste into Excel ... that's also an interesting application, just not what's needed here)

ps  I haven't seen Art's paper, but if I had gotten the idea from Art, I would have said ... hey, I got this idea this neat paper by Art Tabachneck  and others ...


"Copy and Paste Almost Anything" ... http://support.sas.com/resources/papers/proceedings12/238-2012.pdf

art297
Opal | Level 21

@Ksharp: for future reference .. nothing of mine needs to be credited to me and virtually everything I write stems from the ideas of others and, honestly, I often don't remember where the original ideas came from.  Just that I don't think they have gotten sufficient publicity, thus could use some assistance in that regard.

Mike and I know each other quite well and he does a much better job than I do in crediting others.  I definitely didn't discover the clipbrd method .. that was done by some unknown (to me) SAS Institute employee.

Further, very little help would get conveyed on the forum or SAS-L if everyone had to credit the source where they first got the idea.  It would add a level of additional work that would force most, if not all of us, to stop posting all together.

@Mike: Thanks for posting the link to the paper.  While not really relevant to this thread, it does show an interesting use of the clipbrd method and features I would like to see incorporated in PROC IMPORT.  If anyone is interested, the code and powerpoint are available at:

http://www.sascommunity.org/wiki/Copy_and_Paste_Almost_Anything

Duong
Obsidian | Level 7

Hi all

This question becoming more common for those who work in the Pharmaceutical Industry.

There are a number of papers on this topic:

http://www2.sas.com/proceedings/sugi31/066-31.pdf

http://www.pharmasug.org/download/papers/CC18.pdf

http://www.pharmasug.org/cd/papers/TT/TT13.pdf

I disagree Cynthia statement's above - "almost impossible to read the RTF file directly with SAS".

In fact extracting an RTF table into SAS is quite simple. I have written a paper about this.

http://www.lexjansen.com/phuse/2008/po/po03.pdf

That is by using an RTF regular expression you can easily strip out the RTF control words. Also

Cynthia, in clinical trial reporting it is very typical that one programmer do the table and another QC it.

And because the table could be re-ran many times, programmers preferred a programmatic way of

doing the QC (refer to Hagondoorn paper above)

Tom suggestion to save the RTF as text then read into SAS is really not appropriate for this purpose.

Because separating the columns into variables would be a challenge!! What you want is column1

go to var1 and column2 to var2 etc. So if the file is in text it is not easy to identify where the columns

separation are.

Another alternative is to copy and paste into Excel then use DDE. This could be done

automatically (ie on the fly) I believe. I have not looked in depth into this. The reason is you need PC SAS

for this method and many bigger pharmaceutical companies uses UNIX SAS.  

As for the clipboard I have not tried this but it seem the same as copy and save as text method. If it is

then you would have the same issue.

Regards

Duong

Haikuo
Onyx | Level 15

Same question again: where to download %rtfparser?

Thanks,

Haikuo

data_null__
Jade | Level 19

How about taking the regex from the [1] Burke S. M. "RTF Pocket Guide" reference that appears to be the KEY to process

/\\[a-z]+(-?[0-9]+)? ?/

and point that at the _INFILE_ variable and see what you get.  IT will probably be everything you need, or pretty close.

You don't need no stinkin' macro Smiley Wink

Haikuo
Onyx | Level 15

Thanks, Null. Will try that.

Haikuo

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 13 replies
  • 6484 views
  • 7 likes
  • 10 in conversation