import a word doc then output with same format

Super Contributor
Posts: 271
Hello everyone,

I want to import an MS Word .doc file into SAS, make a simple change, and then export it to a new .doc file, keeping the same formatting as the original: font size, color, paragraphs, everything.

Can I do this in SAS, or is there another method?

I attached a sample file (sample.doc). I want to import it into SAS, change June 12, 2006 to June 16, 2016, and export the result to a new file named newfile.doc with the same formatting.

Thank you!


Accepted Solutions
Solution
‎01-29-2018 12:18 PM
Super Contributor
Posts: 271

Re: import a word doc then output with same format

[ Edited ]

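First method: SAS

I opened the .doc file in Word, saved it as an HTML file (sample.htm), and then ran the following code:

filename in "C:\temp\sample.htm";

/* Read the HTML line by line. WORK is used because SASHELP is normally read-only. */
data work.testsample;
 infile in truncover lrecl=32767;
 input source $char2000.;
run;

/* On the line that holds the target date, replace 2006 with the current year. */
data need;
 set work.testsample;
 year=year(date());
 /* Kept from the original code; the macro variable is not used further below. */
 call symputx('applyYear',year-1);
 if index(source,"June 12, 2006") then
  /* PUT avoids the leading blanks of an automatic numeric-to-character conversion. */
  source=tranwrd(source,'2006',put(year,4.));
run;

/* Write the modified HTML under a .doc name so that Word opens it. */
filename out "C:\temp\samplenew.doc";
data _null_;
 file out lrecl=32767;
 set need;
 put source;
run;

The new .doc file is what I need. When opening it in MS Word, change the view from Web Layout to Print Layout.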
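
To get the exact date asked for in the question (June 16, 2016) instead of only a new year, the whole date string can be replaced. This is a minimal sketch that uses one inline sample line rather than the HTML file; the data set name demo and the sample line are made up for illustration.

```sas
/* Hypothetical one-line input standing in for a line of the saved HTML */
data demo;
 infile datalines truncover;
 input source $char200.;
 /* WORDDATE18. writes "June 16, 2016" right-aligned; STRIP removes the padding */
 source = tranwrd(source, "June 12, 2006",
                  strip(put('16JUN2016'd, worddate18.)));
 /* Show the result in the log */
 put source=;
 datalines;
<p class=MsoNormal>June 12, 2006</p>
;
run;
```

Passing the STRIP result directly to TRANWRD matters here: storing it in a character variable first would pad it with trailing blanks, and those blanks would end up in the replaced text.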


 

 

 

Second method: VBA

See the attachment sampleVBA.

The two arrows switch the year between 2006 and 2016.

The VBA is very simple and needs more features to be really useful, for example changing the year to the current year.

 

Thanks!



All Replies
Super User
Posts: 13,941

Re: import a word doc then output with same format

Posted in reply to GeorgeSAS


This basically ain't gonna happen with SAS. DOC and DOCX file formats are not intended for data interchange, and SAS is intended to read data. You would have to write an entire Word document parser to do what it sounds like you want to accomplish.

You might be better off with some sort of Microsoft Visual Basic or VBA code or macro.

Super User
Posts: 24,004

Re: import a word doc then output with same format

Posted in reply to GeorgeSAS

Doc or DocX?

If you're changing a single value in a docx file there are ways, but a doc file is pretty locked down.

I second the suggestion of using a VBS or VBA script instead. You can call it from SAS or even write it from SAS if so desired.
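
One way to follow this suggestion is to have SAS write a small VBScript file and then run it. The sketch below is an illustration under assumptions, not code from this thread: it assumes Windows, an installed copy of Word, and a SAS session that is allowed to run operating-system commands (the XCMD option). The script name replace_date.vbs is hypothetical; the document names and dates come from the original question.

```sas
/* Hypothetical script location; adjust as needed */
filename vbs "C:\temp\replace_date.vbs";

/* Write a VBScript that lets Word itself do the find-and-replace */
data _null_;
 file vbs;
 put 'Set word = CreateObject("Word.Application")';
 put 'word.Visible = False';
 put 'Set doc = word.Documents.Open("C:\temp\sample.doc")';
 /* Positional arguments of Find.Execute: FindText, MatchCase,     */
 /* MatchWholeWord, MatchWildcards, MatchSoundsLike,               */
 /* MatchAllWordForms, Forward, Wrap (1 = wdFindContinue), Format, */
 /* ReplaceWith, Replace (2 = wdReplaceAll)                        */
 put 'doc.Content.Find.Execute "June 12, 2006", False, False, False, False, False, True, 1, False, "June 16, 2016", 2';
 /* 0 = wdFormatDocument, the binary .doc format */
 put 'doc.SaveAs "C:\temp\newfile.doc", 0';
 put 'doc.Close False';
 put 'word.Quit';
run;

/* Run the script and wait for it to finish (requires XCMD) */
options noxwait xsync;
x 'cscript //nologo "C:\temp\replace_date.vbs"';
```

Because Word performs the edit and the save, the formatting stays under Word's control and SAS only orchestrates the step.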

 

Super Contributor
Posts: 271

Re: import a word doc then output with same format

[ Edited ]

I will save sample.doc as sample.xml, import that file into SAS, modify it, and then export it to newfile.doc.

The open question is how to import sample.xml into SAS; I am working on that now.

I tried to attach sample.xml here. Please help with the code for importing it into SAS.

(The file sample.xml does not have a valid extension for an attachment and has been removed. sas,txt,csv,zip,pdf,ics,sx,sxs,doc,docx,xls,xlsx,egp,sav,sas7bdat,ctm,ctk,rtf,py are the valid extensions.)

It can't be attached, so please simply rename .doc to .xml.

 

 

data need1;
infile "c:\temp\sample.xml"  truncover pad ;
input source $10000.;
run;

 

Thanks!

Super User
Posts: 24,004

Re: import a word doc then output with same format

Posted in reply to GeorgeSAS

That works for DOCX files, after you unzip it, but not for DOC files. Which do you have?

 

 

Super Contributor
Posts: 271

Re: import a word doc then output with same format

sample.doc file

Super User
Posts: 24,004

Re: import a word doc then output with same format

Posted in reply to GeorgeSAS

@GeorgeSAS Your attachments aren't working. Save it as XML and change the extension to txt to upload it.

 

Super Contributor
Posts: 271

Re: import a word doc then output with same format

[ Edited ]

Thanks!

 

 

data need1;
infile "c:\temp\sample.xml" truncover pad ;
input source $10000.;
run;

 

Super User
Posts: 24,004

Re: import a word doc then output with same format

Posted in reply to GeorgeSAS

So that text file has content, including the word Georgia in some places. If I change the txt to XML, there's no more word Georgia in the file, so as far as I can see, that approach will not work. 

 

You still haven't said if this is a doc or docx file. 
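
Before building on a converted file, it may help to confirm that the text to be replaced actually appears in it as plain text. This is a minimal sketch, assuming the file was saved from Word as XML at the path used in this thread:

```sas
/* Count the lines of the saved file that contain the target date */
data _null_;
 infile "c:\temp\sample.xml" truncover lrecl=32767 end=last;
 input source $char32767.;
 /* Sum statement: adds 1 for every line in which the text is found */
 hits + (find(source, "June 12, 2006") > 0);
 if last then put "Lines containing the search text: " hits;
run;
```

A count of zero does not prove the date is absent: Word can split visible text across several XML elements, and a renamed binary .doc does not store its text in a form this check can see.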

Super Contributor
Posts: 271

Re: import a word doc then output with same format

Don't just rename .doc to .xml.
Open the .doc file in Word, then save it as an .xml file.

Thanks!
Super User
Posts: 24,004

Re: import a word doc then output with same format

[ Edited ]
Posted in reply to GeorgeSAS

If I do that I'll get a different file. I'm saying YOU should do that, create YOUR XML file, change the extension to txt and upload it.

How are you planning to automate that step by the way? And if you have to use VBA why bother with SAS at all...?

 


Super User
Posts: 10,570

Re: import a word doc then output with same format

Posted in reply to GeorgeSAS

The proper tool for manipulating Word documents is MS Word. Use a VBA script with Word.

Super Contributor
Posts: 271

Re: import a word doc then output with same format

Posted in reply to KurtBremser

Thanks!

 

I will try both ways!

 

 

☑ This topic is solved.
