08-12-2013 10:03 AM
I have a large number of Word files with no fixed length or format. The requirement is to read each file into a single variable in SAS, so that every file has a corresponding SAS dataset with exactly one observation and one variable.
08-12-2013 01:19 PM
Do you actually want all of the font and markup information contained in the file or just the text you would read? Are there any tables in the documents? And which format are the Word documents in, DOC, DOCX, RTF or something else?
08-12-2013 02:01 PM
I would like to import the visible/printable text. Some of the documents may contain tables; others may not. Let us assume that the documents are in 'docx' format.
01-19-2014 10:19 PM
It seems that SAS does not provide a procedure to import data directly from Microsoft Word, and I do not know whether any third-party Word processing SDKs can be used to achieve this. Hope you can find more related information at the following links: By the way, the data that you want to import from Word does not include information contained in image files, right?
01-19-2014 10:24 PM
Do you have SAS eMiner?
Can you provide more info on your process? Are the text files similar in nature? Are you trying to extract all the info or only parts?
I'd probably write a program that uses DDE (Dynamic Data Exchange) to open each file in Word, save its contents to a plain .txt file, and then read that file into SAS.
Given your limited info, that's my limited answer.
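For what it's worth, the "copy to a .txt file" step can also be done without DDE or Word itself, if the files really are all .docx. This is a rough sketch in Python rather than SAS, so it is a different technique from the DDE idea above: a .docx file is a zip archive whose main body lives in word/document.xml, and the visible text sits in `w:t` elements. The function names `docx_to_text` and `convert_folder` are made up for illustration. Assumptions and limits: headers, footers, footnotes, and images are ignored, table cells come out as one line per cell paragraph, and text boxes nested inside a paragraph may appear twice.

```python
import zipfile
import xml.etree.ElementTree as ET
from pathlib import Path

# WordprocessingML namespace used by word/document.xml inside a .docx
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
W_P = f"{{{W_NS}}}p"      # paragraph
W_T = f"{{{W_NS}}}t"      # run of visible text
W_TAB = f"{{{W_NS}}}tab"  # tab character
W_BREAKS = (f"{{{W_NS}}}br", f"{{{W_NS}}}cr")  # line breaks inside a paragraph


def docx_to_text(path):
    """Return the body text of a .docx file, one line per paragraph."""
    with zipfile.ZipFile(path) as zf:
        root = ET.fromstring(zf.read("word/document.xml"))
    lines = []
    # iter() walks in document order, so paragraphs inside table cells
    # are picked up along with ordinary body paragraphs.
    for para in root.iter(W_P):
        parts = []
        for node in para.iter():
            if node.tag == W_T:
                parts.append(node.text or "")
            elif node.tag == W_TAB:
                parts.append("\t")
            elif node.tag in W_BREAKS:
                parts.append("\n")
        lines.append("".join(parts))
    return "\n".join(lines)


def convert_folder(src_dir, dst_dir):
    """Write one UTF-8 .txt file per .docx in src_dir; return the paths written."""
    dst = Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)
    written = []
    for docx in sorted(Path(src_dir).glob("*.docx")):
        if docx.name.startswith("~$"):
            continue  # skip the lock files Word creates while a document is open
        out = dst / (docx.stem + ".txt")
        out.write_text(docx_to_text(docx), encoding="utf-8")
        written.append(out)
    return written
```

Calling something like `convert_folder("C:/docs", "C:/docs_txt")` (hypothetical paths) would leave one .txt per document, which a DATA step can then read. Keep in mind that a SAS character variable holds at most 32,767 bytes, so a long document will not fit into a single variable without truncation.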