
How to get data from a Word document into a SAS dataset, using the headings in the Word document as SAS variables

Occasional Contributor
Posts: 9


Hi Mates,

 

I need your help. I want to read the text of a Word document into a SAS dataset, with the headings in the document becoming the variables of the dataset.

 

Example of the Word document:

Name:

John

Sex:

Male

Age:

25 years

Address:

#2-3-4-5, 2nd cross,

1st Main, NY

 

Output I need:

obs  Name  Sex   Age       Address
01   John  Male  25 years  #2-3-4-5, 2nd cross,
                           1st Main, NY

 

Can anyone help me find a solution for this?

Thank you

Super User
Posts: 7,955


Posted in reply to satish78652

In Word, use File -> Save As to save the document as a .txt file. Then write a DATA step that reads the text file and builds the output you need:

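data want;
  length buff name sex address $2000;
  infile "thetextfile.txt" truncover;   /* truncover: a short line is not completed from the next line */
  input buff $char2000.;                /* read the whole line, embedded blanks included */
  if strip(buff)="Name:" then input name $char2000.;   /* the value sits on the line after the heading */
...
run;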
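
A fuller version of the step above might look like the following. It is only a sketch under stated assumptions: the saved text file holds exactly the four headings from the example (Name:, Sex:, Age:, Address:), each alone on its own line; a value may run over several lines; blank lines carry no meaning; and a repeated "Name:" starts a new person. The file name is the placeholder used above, and `current` is a helper variable introduced here for illustration.

```sas
data want;
  length name sex age address $2000 current $32 buff $2000;
  retain name sex age address current;
  keep name sex age address;
  infile "thetextfile.txt" truncover end=last;
  input buff $char2000.;
  buff = strip(buff);

  if buff in ("Name:", "Sex:", "Age:", "Address:") then do;
    /* Assumption: a second "Name:" starts a new person, so write out the previous one */
    if buff = "Name:" and not missing(name) then do;
      output;
      call missing(name, sex, age, address);
    end;
    current = buff;                 /* remember which heading we are under */
  end;
  else if not missing(buff) then do;
    /* Value line: append it to the variable that belongs to the last heading */
    select (current);
      when ("Name:")    name    = catx(" ", name, buff);
      when ("Sex:")     sex     = catx(" ", sex, buff);
      when ("Age:")     age     = catx(" ", age, buff);
      when ("Address:") address = catx(" ", address, buff);
      otherwise;                    /* text before the first heading is ignored */
    end;
  end;

  if last then output;              /* write out the final person */
run;
```

Multi-line values such as the address are joined with a single blank here; if the line break itself matters, a different separator would be needed.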

The real question is why you are using Word, a file format meant for human review, as a data source. Go back to the source data and work from there; that's really the only "good" way.

Occasional Contributor
Posts: 9


Thank you for the reply. In the program you provided I need to specify the variables manually. My actual problem is that I am looking for a macro that can extract the headings or bookmarks and turn them into the variables of a SAS dataset.

 

Thank you.
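
One possible way to avoid naming the variables by hand, sketched under some assumptions rather than offered as a finished macro: read the text file into a "long" dataset with one row per heading, then let PROC TRANSPOSE turn the heading values into variable names. The sketch assumes a heading is a short line of letters, digits, underscores or blanks that ends with a colon, that every other non-blank line is part of a value, and that a new record starts whenever the first heading appears again. The file name and the dataset names `long` and `want` are placeholders.

```sas
/* Step 1: one row per record (id) and heading */
data long;
  length id 8 heading first_heading $32 value buff $2000;
  retain id 1 heading first_heading value;
  keep id heading value;
  infile "thetextfile.txt" truncover end=last;
  input buff $char2000.;
  buff = strip(buff);

  /* Assumption: a heading is a short line that ends with a colon */
  if prxmatch('/^[A-Za-z_][\w ]{0,30}:\s*$/', buff) then do;
    if not missing(heading) then output;           /* write the previous heading/value pair */
    heading = substr(buff, 1, length(buff) - 1);   /* drop the trailing colon */
    value = " ";
    if missing(first_heading) then first_heading = heading;
    else if heading = first_heading then id = id + 1;   /* first heading again: next record */
  end;
  else if not missing(buff) then value = catx(" ", value, buff);

  if last and not missing(heading) then output;    /* write the final pair */
run;

/* Step 2: headings become variable names, one row per record */
proc transpose data=long out=want(drop=_name_);
  by id;
  id heading;
  var value;
run;
```

Every variable created this way is character with length 2000, and PROC TRANSPOSE will complain if the same heading occurs twice within one record. A value line that happens to look like a heading (for example "Note:") would also be misread, so the heading rule needs checking against the real documents.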

Super User
Posts: 7,782


Posted in reply to satish78652

Since nothing in the Word document provides any clues about column attributes (type, length, format), you can't set them automatically, so you have to do a lot of work anyway. The names are the least of your problems.
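
To illustrate the kind of work meant here, a small sketch: text pulled out of a document arrives as character data, so types and labels have to be assigned by hand. The sample row, the new variable names `age_years` and `sex_code`, and the rule that the age is the first word of the text are all assumptions made for this example.

```sas
/* Sample all-character data, as it might arrive from the text file */
data want;
  length name sex age address $2000;
  infile datalines dlm="|" truncover;
  input name sex age address;
  datalines;
John|Male|25 years|#2-3-4-5, 2nd cross, 1st Main, NY
;
run;

/* Attributes have to be decided variable by variable */
data typed;
  set want;
  length age_years 8 sex_code $1;
  /* Assumption: the number is the first word; ?? suppresses notes for non-numeric text */
  age_years = input(scan(age, 1, " "), ?? 12.);
  sex_code  = upcase(substr(left(sex), 1, 1));   /* first letter, upper case */
  label age_years = "Age (years)"
        sex_code  = "Sex (first letter)";
run;
```

None of these rules can be derived from the Word document itself; they come from knowing what each heading means.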

Frequent Contributor
Posts: 129


Posted in reply to satish78652

Well, I sincerely hope you find what you are looking for. Please tell me if you do.

Just yesterday I had to do the same thing.

I copied the Word data into a suitable text editor, replaced the special characters with plain equivalents, made sure the columns were properly tab-delimited and missing values were filled in, imported the result as a formatted text file, and then ran an extensive quality check.

 

I don't know exactly how, but I think what you want is doable. A *.docx file is nothing more than a zip archive of XML files. I think it's feasible to dig into it and extract the formatted tables.
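
As a first step in that direction, the members of the archive can be listed from SAS itself. This is a minimal sketch that assumes SAS 9.4 or later, where the FILENAME ZIP access method is available; the path is a placeholder and the dataset name `contents` is made up for the example.

```sas
/* Point a fileref at the .docx, treating it as a zip archive */
filename inzip zip "C:\temp\thedocument.docx";

/* List the archive members using the directory functions */
data contents(keep=memname);
  length memname $200;
  fid = dopen("inzip");
  if fid = 0 then stop;            /* archive could not be opened */
  memcount = dnum(fid);
  do i = 1 to memcount;
    memname = dread(fid, i);
    output;
  end;
  rc = dclose(fid);
run;

proc print data=contents;
run;
```

The main body text of a .docx normally sits in the member word/document.xml. Pulling headings and paragraphs out of that XML is a separate parsing job and is not attempted here.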

 

I wish you every success. Here are some starting points: 1   2

 

Cheers

