Desktop productivity for business analysts and programmers

Import excel files with multiples tab that are linked

Reply
Frequent Contributor
Posts: 94

Import excel files with multiples tab that are linked

I get a excel workbook with 15 tabs from another department in our company.  The header rows for tabs 2-14 are linked to the header row in tab 1.  This way, if a column name is changed, the person only has to change the 1st tab, which will then change the other tabs. When I import the 2nd tab, the resultant sas dataset column names are all "A".  Is there a way for me to be able to import the tabs and still get the coluimn names? Thanks!

Grand Advisor
Posts: 9,584

Re: Import excel files with multiples tab that are linked

Not Sure if it could work, Can you save it as CSV and check ?

Grand Advisor
Posts: 17,383

Re: Import excel files with multiples tab that are linked

SAS should be automatically using the current value of the formula in a cell when it imports the file. Which means you should be getting names.

Does the excel file have macros or only compute on update on?

Can you post a sample workbook that doesn't work for you and the code you use to import it? If your info is confidential you can mock one up.

Esteemed Advisor
Esteemed Advisor
Posts: 7,232

Re: Import excel files with multiples tab that are linked

You are quite correct Reeza, Excel stores in the underlying XML both the formula and the current value returned by the formula:

-<row r="1" x14ac:dyDescent="0.25" spans="1:3">-<c r="A1" t="str"><f>Sheet1!A1</f><v>abc</v>

So it should read in fine (and does after a small test), so I assume something else is going on with the Spreadsheet (maybe older version?)

Frequent Contributor
Posts: 94

Re: Import excel files with multiples tab that are linked

I will need to work on a mockup of the file. I apologize and should have stated that the file is an .xlsx file (Excel 2007).  I finally learned yesterday how to use PROC IMPORT to import .xlsx file.  Additionally, the eventual plan is that there will be 40+ of these files (each with 15 tabs).  So to have to save each file as a .csv each time is something I prefer not to do. The request is for me to import every tab of each .xlsx workbook into one big dataset for analysis. Here is the code I use to import the file: Proc Import Datafile='/home/UKHeatmap 2015.xlsx' Out=agnostic DBMS=xlsx Replace; Sheet= 'Heat Map Channel Agnostic'; Run; I cannot figure out why the column headers from the other tabs come through.

Esteemed Advisor
Esteemed Advisor
Posts: 7,232

Re: Import excel files with multiples tab that are linked

Well, you don't need dbms=xlsx on that statement, it worked for me without it (from an xlsx).  I would strongly advise that you go back to source and ask them to provide a proper data transfer.  Excel is not a data transfer format, and you will just be building yourself trouble up for the future if you accept it.  Me, unless there is an absolute, positively cannot do anything about scenario and management accept that each time this data is handled there will be time/money costs, then I would refuse Excel data.  Even something simple like, down the line a number could be stored as text this can completely ruin your programming and give you hours of fun trying to figure out what has gone wrong, and with Excel, things will go wrong.  Go back to the source and get the data from there.

Frequent Contributor
Posts: 94

Re: Import excel files with multiples tab that are linked

I am operating SAS from UNIX and am told by SAS that I must use the XLSX engine.  Once I did, then I was able to import.  So yes, I do need the dbms=xlsx As for requesting the source for proper data transfer, I have done this.  However, I am not the only person using this source and others' work are dependent on the xlsx format.  It's a LARGE corporation so, yes it is difficult from a logistic standpoint, and not from cooperation perspective, I appreciate your feedback/suggestion and fully agree.  HOWEVER, it is unrealistic for me to say I absolutely refuse to do this until you give me the file in X-format.  I have to work with what I am given and not expend energy or time on things out of my control.

Community Manager
Posts: 2,697

Re: Import excel files with multiples tab that are linked

Another approach that I've had success with is the EXCEL or PCFILES libname engine.  Sometimes you can work a two-step approach, where you use:

libname myexcel PCFILES 'path-to-my-xlsx/excel.xlsx';

and then PROC DATASETS to get the sheet names and column names.

Then feed that information into your PROC IMPORT and any post-process DATA step that you might need to munge the data columns into the proper names/types.

Chris

Frequent Contributor
Posts: 94

Re: Import excel files with multiples tab that are linked

Chris - Thanks for the advice. Following your instructions, I ran this code:  Libname myexcel PCFILES'/home//G2C/Heatmap 2015.xlsx'; I got this error message: ERROR: Incorrect syntax for this LIBNAME statement. This engine does not accept a physical name. ERROR: Error in the LIBNAME statement. Also, is there an example of the PROC DATASETS to get the sheet and column names? Thanks!

Community Manager
Posts: 2,697

Re: Import excel files with multiples tab that are linked

I missed the fact that you're on Unix.  The PCFILES library engine requires the PC Files Server to be available on a Windows machine somewhere.  On SAS for Windows, this is automatically invoked with no additional setup.  On Unix, you have to specify the node name/port within the libname syntax in order to "delegate" the work of reading the Excel file to that PC Files Server machine.  If that's not possible for you to set up, I apologize for leading you down a primrose path.

Here's an example macro that I've used in the past to collect all of the sheet names from an Excel "data set" using the PCFILES engine.

/* Given a full path to an Excel file, append the filename and sheet names */
/* to a data set name WORK.FILESANDSHEETS */
%macro FindAllSheets(ExcelFilename= /* full path the Excel file */);

   
/* pull the Sheet information out of PROC DATASETS */
    ods select Datasets.Members;
    ods output Datasets.Members=work.members;

   
/* get the name of the sheets in this workbook */
   
/* uses the PCFILES engine and relies on the Autostart */
   
/* capability of the PC Files Server, 64-bit Windows */
    libname excelpi pcfiles path=&
ExcelFilename.;

    proc datasets lib=excelpi memtype=data nodetails ;
    run;

    ods select all;

    data filesAndSheets;
      set filesAndSheets
                work.Members(keep=Name in=new);
        if new then do;
          position+
1;
            Filename=&
ExcelFilename.;
        end;
        if Filename ne
' ';
    run;
    libname excelpi clear;
%mend;

Chris

Grand Advisor
Posts: 17,383

Re: Import excel files with multiples tab that are linked

It doesn't sound like have an issue importing your sheets but only with the column names?

jen123 wrote:

When I import the 2nd tab, the resultant sas dataset column names are all "A".

Run a proc contents on the data set imported and verify the column names. It's not possible to have variable names repeat so having all the variables named A isn't possible. Look at the column name vs the column labels.

proc contents data=imported; run;

Frequent Contributor
Posts: 94

Re: Import excel files with multiples tab that are linked

Hi reeza, My mistake - I incorrectly typed what I meant. When I import the file, the column with formula (vlookup/link), rather than giving me the text for that row/observation, it gives me only "A". So for example, in the excel file column "L5CodeName" is a vlookup to another tab in the workbook.  When I import the file into SAS, the data under column LSCodeName are all "A". Hope I made sense this time.

Ask a Question
Discussion stats
  • 11 replies
  • 666 views
  • 0 likes
  • 5 in conversation