Word doc

I have 10 Word documents (for example) in a folder. Each document has multiple pages.
I want only page 1 of every Word document, combined into a new Word document (final.doc).

So, final.doc should have 10 pages:

page 1 of doc1 + page 1 of doc2 + page 1 of doc3 + ... + page 1 of doc10.

Is there an automated way to do this?

Re: Word doc

Word documents are stored in a proprietary format that depends on which version of Microsoft Office and Word you are using. SAS has an engine for reading the proprietary Excel format, but it has no method for reading a proprietary Word document. This paper does show an interesting way to read Microsoft Office 2003 Word XML, but its focus is on getting the information out of the XML and into a SAS dataset, which is not what you describe.

Even if you saved the 10 Word documents in RTF format, it would be hard to write a DATA step program that parses all the RTF control strings and picks out just the first page of each document.
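To give a feel for what that parsing involves, here is a rough sketch in Python (not SAS). It assumes each file was saved as RTF and that pages are separated by the explicit `\page` control word; page breaks that Word inserts automatically during layout are generally not recorded that way and would be missed. The function name `first_page_rtf` is made up for illustration. It only trims one document to its first page; merging the trimmed pieces into a single final.doc would still need the font and color tables reconciled, which is not attempted here.

```python
import re

# An RTF control word: backslash, letters, optional numeric parameter,
# and one optional trailing space that acts as a delimiter.
_CONTROL_WORD = re.compile(r"\\([a-zA-Z]+)(-?\d+)? ?")


def first_page_rtf(rtf: str) -> str:
    """Return the RTF text up to the first explicit \\page break.

    Any groups still open at the cut point are closed so the result
    remains balanced. Embedded binary data (\\bin) is not handled.
    """
    depth = 0  # current brace nesting level
    i = 0
    while i < len(rtf):
        ch = rtf[i]
        if ch == "\\":
            m = _CONTROL_WORD.match(rtf, i)
            if m is None:
                # Control symbol such as \{ \} \\ or \': skip both characters
                # so escaped braces are not counted as group delimiters.
                i += 2
                continue
            if m.group(1) == "page" and m.group(2) is None:
                # Hard page break found: cut here and close open groups.
                return rtf[:i] + "}" * depth
            i = m.end()
            continue
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
        i += 1
    return rtf  # no hard page break: the document is returned unchanged
```

Even this small case has to track brace nesting and escaped braces, which is why doing the same thing in a DATA step would be tedious.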

How were the Word documents originally created? If they were originally RTF files created with ODS, you might be able to rerun the SAS job that created them, but limit the number of observations to only what would fit on page 1 of each output file.

Is this something you might try to do with a Word macro or VBScript?
