BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Belle
Obsidian | Level 7

Dear Members,

My final goal is to merge two tables, one is from Excel, and the other one is from Word; however, I am having difficulty in importing a table from Word into SAS.  In Word, you can have hardenters in the cells. For example, in the Address column of the tables below (Table 1), there is a hardenter after San Francisco. I tried to copied this table into Excel, it turned out there were multiple rows for ID 1. The whole table was messed up so that I cannot merge into the Excel table.

Are there any ways to read Word table into SAS directly without messing up each row? Or if there are any ways to convert Word table into Excel so that I can read them into SAS, and then merge them?

Table 1: Word Table

ID

 

Name

 

Address

 

dependents

 

Relationship

 

1

 

Mary

 

San  Francisco

CA  - 94115

 

Bob

Jenny

 

Son 

Daughter

 

2

 

Robert

 

234  Stress

Oakland

CA  -94623

 

JJ

Christina 

 

Daughter

Daughter

 

5

 

Lucia

 

Los  Angeles

 

 

 

Table 2: Excel Table

     IDHoursPay rate
24040
34035
43532
54050

Thank you all, Any supplementary information is highly appreciated.

Belle

1 ACCEPTED SOLUTION

Accepted Solutions
art297
Opal | Level 21

You only missed one step.  On the replace screen, under find, click on 'more' at the bottom left of the screen.  Then click on 'special'.  The click on 'Paragraph Mark' which, on my version of Word, is the first option.

Then, do like you did, indicating ~ as the replacement character, click on replace all, then when it finishes indicate that, NO, you don't want it to continue with the rest of the document.

View solution in original post

16 REPLIES 16
dhana
Fluorite | Level 6

Hi Belle,

Try converting your word document table to an excel sheet. Please go through the following link. It has enough information to convert word table to excel.

http://www.ehow.com/how_2096574_convert-microsoft-word-table-excel.html

I have tried converting a word table to excel, it worked for me.

Hope this helps...!

Thanks

Dhanasekaran R

Belle
Obsidian | Level 7

Hi Dhanasekaran,

Thanks for the information; however it does not work for my case because it becomes mutiple rows as I mentioned. Table will be messed up after reading into SAS.

Here is what I got after I converted it into Excel.

test.png

Thanks again for the help.

art297
Opal | Level 21

Belle,

If the word file can be highlighted (i.e., select all) and copied to your clipboard (i.e., Ctrl-c), I may have a solution for you.  It is the topic of a presentation that I'll be giving at next year's SGF, but we have never tested it on the particular kind of file you describe.

The powerpoint, paper and code are available at http://torsas.ca/page18.php which is the website for my local user group, namely the Toronto Area SAS Society.

If you don't mind, I'd really like to test the code on your particular file.  If the word document can be shared, please send me a copy at art297 at rogers dot com.

Art

Belle
Obsidian | Level 7

Hi Art,

I cannot share the original tables with you because of company's policy. I wish I could share it. Smiley Sad

Actually the table I showed here were fake tables. They contain the information/problem that I am facing. I'll send you these two tables.

I really appreciate your help, hopefully this problem can be solved.

Belle

art297
Opal | Level 21

Belle,

First the good news: I was able to import your data.  You may not agree with how I eventually structured the file, but that part should be relatively easy for you to change.  Since SAS doesn't have merged cells, I split merged cells into multiple records.

Obs   ID  Name    Address                                       dependents   Relationship

1        1     Mary    San Francisco CA - 94115          Bob               Son

2        1     Mary    San Francisco CA - 94115          Jenny             Daughter

3        2     Robert   234 Stress Oakland CA -94623  JJ                  Daughter

4        2     Robert   234 Stress Oakland CA -94623  Christina        Daughter

5        5     Lucia  Los Angeles

It definitely wasn't as straight forward as I would have liked, but only took seconds to accomplish once I figured out what the problem was.  MS Word doesn't really have soft returns in its merged cells .. they copy and paste as non-merged cells with hard carriage returns.  Thus the first task that had to be accomplished was to highlight the table in Word and replace all Paragraphs with tildes (i.e., ~).  If you don't know how to do that let me know and I'll describe the steps  That entire process literally took about 10 seconds.

Then I highlighted and copied the now modified table and ran the code I gave you the link for this afternoon. The following are the settings I used at the top of the code:

options NOQUOTELENMAX;

options datestyle=mdy;

filename clippy clipbrd;

filename revised temp;

%let transpose=NO;

%let columns=;

%let rows=;

%let look_for=;

%let skip_lines=;

%let hrows=1;

%let spaces="      ";

%let first_data_row=2;

%let var_renames=;

%let var_labels=;

%let var_share=;

%let var_prefix=;

%let var_suffix=;

%let var_drop=;

%let var_upcase=;

%let var_missing=;

%let var_formats=;

%let var_informats=;

%let var_units=;

%let guessingrows=;

%let outfile=have;

I.e., I didn't have to change anything or set anything.

Then I ran the following datastep:

data want (drop=_:);

  length dependents $15;

  length Relationship $15;

  set have (rename=(dependents=_dep

                    Relationship=_rel));

  address=translate(address," ","~");

  do _i=1 to max(countw(_dep,"~"),countw(_Rel,"~"));;

    dependents=scan(_dep,_i,"~");

    Relationship=scan(_rel,_i,"~");

    output;

  end;

run;

I ran one extra datastep as I figured that you wanted the variables in the same order as you had them in the original file:

data want;

  retain ID Name Address dependents Relationship;

  set want;

run;

Art

Belle
Obsidian | Level 7

Hi Art,

Thanks a lot for trying my tables.

I tried to follow your steps, but I did encounter a problem on replacing all Paragraphs with tildes (i.e., ~). Here is what i did: Highlight whole table -> click replace -> type "~" in REPLACE option; My question is what should I put in FIND option? Did I select wrong option?

Replace.png

Thanks again.

art297
Opal | Level 21

You only missed one step.  On the replace screen, under find, click on 'more' at the bottom left of the screen.  Then click on 'special'.  The click on 'Paragraph Mark' which, on my version of Word, is the first option.

Then, do like you did, indicating ~ as the replacement character, click on replace all, then when it finishes indicate that, NO, you don't want it to continue with the rest of the document.

Belle
Obsidian | Level 7

Thanks Art, it works well.

art297
Opal | Level 21

Belle,

You didn't mark the question as having been answered, so others will probably still spend time trying to come up with alternative solutions.  If that is what you want, leave it as being unanswered, otherwise you really should click on the icon that indicates that the question has been answered.

Regardless, I'm still curious about what worked well.  I presume that you meant the find/replace in Word.  I am mostly interested in whether you were able to import the file and, if so, whether you used the code I proposed or proc import.

Belle
Obsidian | Level 7

Hi Art,

Sorry for not marking the questions since I am very new to this discussion board. I have marked it.

Yes, I meant the find/replace in word worked well. Now, I have followed your instruction without changing anything; however, I got error message which said '

File WORK.HAVE.DATA does not exist'. I think I did not run the Marco correctly. How did you set up the values for the Macros (%flipfile; %expandr (type,string); %filarray (type,string);)? Below are the step that I tried after i replaced paragraph:

Step 1: Copy the Macro in to SAS

(begining of the macro)

options NOQUOTELENMAX;

options datestyle=mdy;

filename clippy clipbrd;

.

.

.

%macro filarray (type,string);

if scan("&string.",i," ") ne "" then

&type(scan(scan("&string.",i," "),1,"~"))=

scan(scan("&string.",i," "),2,"~");

%mend filarray;

(end of the marco)

Step 2: Highlight the word table -> Ctrl + C

Step 3: Run the Marco in Step 1

Step 4: Run your codes below

(begining of the codes)

data want (drop=_:);

  length dependents $15;

  length Relationship $15;

  set have (rename=(dependents=_dep

                    Relationship=_rel));

  address=translate(address," ","~");

  do _i=1 to max(countw(_dep,"~"),countw(_Rel,"~"));;

    dependents=scan(_dep,_i,"~");

    Relationship=scan(_rel,_i,"~");

    output;

  end;

run;

(end of the codes)

Sorry for asking this Marco question, I am still learning the Marco.

Thanks

art297
Opal | Level 21

The code was designed so that one doesn't have to know or understand macros.  The only critical parts of the code are defining the macro variables up front.

Specifically, the code was designed so that one simply copies the entire set of code (as posted on the TASS (i.e., www.torsas.ca) website, ensue that all of the macro variables are set correctly, copy the file you want to import, then press run.

The most critical aspect that might have to be changed, as it is system dependent, is the macro variable

%let spaces="      "; *number of consecutive spaces that should be translated

                       to represent a horizontal tab;

In on computer that is 4 spaces that have to be included between the pair of double quotes, but it may be 6 on some systems.

I would need to see the code you actually submitted, and the log, in order to answer your question.

However, at this point you don't really need my code to do the job:  proc import should work quite nicely for you as well.

Belle
Obsidian | Level 7

Hi Art,

I am confused. I ran the three macros (%flipfile; %expandr (type,string); %filarray (type,string);). The questions is if I need to set up the values for each macro. for example,

(1) When I run %flipfile; do I need to assign vaules?

(2) For %expandr (type,string); what do type and string equal to?

In my first ran, I did not assign anything for them. Here are the messages from log:

......  (run first two macros)

......

1227  %macro filarray (type,string);

1228  if scan("&string.",i," ") ne "" then

1229  &type(scan(scan("&string.",i," "),1,"~"))=

1230  scan(scan("&string.",i," "),2,"~");

1231  %mend filarray;

1232

1233

1234

1235  data want (drop=_:);

1236

1237    length dependents $15;

1238

1239    length Relationship $15;

1240

1241    set have (rename=(dependents=_dep

1242

1243                      Relationship=_rel));

ERROR: File WORK.HAVE.DATA does not exist.

1244

1245    address=translate(address," ","~");

1246

1247    do _i=1 to max(countw(_dep,"~"),countw(_Rel,"~"));;

1248

1249      dependents=scan(_dep,_i,"~");

1250

1251      Relationship=scan(_rel,_i,"~");

1252

1253      output;

1254

1255    end;

1256

1257  run;

NOTE: Numeric values have been converted to character values at the places given by:

      (Line):(Column).

      1245:21   1247:25   1247:42   1249:21   1251:23

NOTE: Character values have been converted to numeric values at the places given by:

      (Line):(Column).

      1245:11

NOTE: The SAS System stopped processing this step because of errors.

WARNING: The data set WORK.WANT may be incomplete.  When this step was stopped there were 0

         observations and 3 variables.

WARNING: Data set WORK.WANT was not replaced because this step was stopped.

NOTE: DATA statement used (Total process time):

      real time           0.03 seconds

      cpu time            0.01 seconds

Thanks for the help.

Belle

Belle
Obsidian | Level 7

Hi Art,

Another question, when you mentioned that I did not need to use your codes. Did you mean marcos, or you mean your codes start from data want (drop=_:);? Also how should I use proc import? Do I still import table from Word to SAS by using proc import? Sorry for all the confusion.

Thanks again

Belle

art297
Opal | Level 21

The proc import method I used included:

1. Replacing the paragraph marks with ~s in word

2. Copying the table and pasting it into Excel, then saving it as a.xls in Excel97 format

3. Running the following code:

proc import datafile='c:\art\a.xls'

     out=have

     dbms=xls

     replace;

     getnames=YES;

     datarow=2;

run;

data want (drop=_:);

  length dependents $15;

  length Relationship $15;

  set have (rename=(dependents=_dep

                    Relationship=_rel));

  address=translate(address," ","~");

  do _i=1 to max(countw(_dep,"~"),countw(_Rel,"~"));;

    dependents=scan(_dep,_i,"~");

    Relationship=scan(_rel,_i,"~");

    output;

  end;

run;

data want;

  retain ID Name Address dependents Relationship;

  set want;

run;

But please send me, off line (art297 at rogers dot com) a copy of the log you got when you ran the code that I initially proposed.

Art

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 16 replies
  • 4143 views
  • 3 likes
  • 4 in conversation