SAS Data Integration Studio, DataFlux Data Management Studio, SAS/ACCESS, SAS Data Loader for Hadoop and others

Converting documents from XLSX to CSV so that they can be read by SAS DI Studio

Accepted Solution Solved
Reply
Frequent Contributor
Posts: 90
Accepted Solution

Converting documents from XLSX to CSV so that they can be read by SAS DI Studio

Hello,

Each week, I manually convert around 25 reports from XLSX-format til CSV-format, so that they can be read into a SAS DI Studio job. Is there a way to automate the conversion from XLSX to CSV?

Thanks for your time.


Accepted Solutions
Solution
‎06-28-2017 11:25 AM
Super User
Posts: 5,441

Re: Converting documents from XLSX to CSV so that they can be read by SAS DI Studio

Posted in reply to EinarRoed

It sounds that you don't have a license for SAS/ACCESS to PC File Formats.

Execute:

PROC SETINIT;

RUN;

And you'll see in the log which SAS modules are licensed.

If you have any of the modules suggested above (ACCESS to PC File Formats, ODBC or OLEDB), go through with using these using the documentation.

Do you have EG? If yes, have you tried to do this using projects?

If not, I think you need to look for a non-SAS solution to make this happen, such as VB, .net or a third party tool.

Data never sleeps

View solution in original post


All Replies
Super User
Posts: 5,441

Re: Converting documents from XLSX to CSV so that they can be read by SAS DI Studio

Posted in reply to EinarRoed

Sounds like an unnecessary way round to do csv files. Is that because you don't have SAS/ACCESS to PC File Formats (or ODBC) licensed?

If you have, you can import them directly into SAS using DI, either by libname or proc import.

Just speculating here: if you don't have the necessary license, but you have EG: you could maybe create a project that imports these files into SAS (or csv), and have that executed before the ordinary DIS job.

Data never sleeps
Respected Advisor
Posts: 3,799

Re: Converting documents from XLSX to CSV so that they can be read by SAS DI Studio

Posted in reply to EinarRoed

If there is no more direct way for you to import the XLSX data into SAS you can write a script program to run on windows to open each XLSX and SAVEAS CSV. Just push the button as they say.  I use VBSCRIPT this type of "Office Automation" because I have some limited experience using VBSCRIPT but it can be done with other windows scripting languages. 

Respected Advisor
Posts: 4,173

Re: Converting documents from XLSX to CSV so that they can be read by SAS DI Studio

Posted in reply to EinarRoed

If SAS/Access to PC File Formats is licensed at your site then you can even register Excel sheets in SAS Metadata as shown here

Read in an excel sheet in SAS Data Integration Studio 4.2 | Ken Pintelon's Sas Code

I personally "hate" Excel because users can just mess around too much with it. That's why I normally script a Proc Import and convert the Excel sheet to a csv also adding some post processing like getting rid of control characters and then define a proper External File metadata object for this  cleansed csv.

Occasional Contributor
Posts: 17

Re: Converting documents from XLSX to CSV so that they can be read by SAS DI Studio

Posted in reply to EinarRoed

Hi,

Another option if available to your installation is to connect to the Excel file via SAS/ACCESS to OLEDB.  A quick Google search will get you the required connection string (I'm out of office so can't supply it).  I did a job a while back which did this.  It is similar to the link supplied by Patrick but doesn't register a new library.  It registers the excel file as a fileref.

  1. Register a new file in DI and select the type of file to custom (I think it is called that?).  The file type will allow you to write custom code in DI.  You will need to specify the file's columns and attributes etc.
  2. In the code window, either create a libname to the excel file via oledb or use sql pass-through.  Specify the output file by the DI macro variable &_output (0 to N)
  3. Now use the file in any job and get the benefits of data lineage, packaging etc.


Some caveats;

  • Generally you will get locking issues if the file is opened by someone else.  That is just Excel being Excel and a main reason for not registering as a library.
  • If using pass-through the SQL is horrible. 

This way you can bypass the need to create csv files, just use the excel files as sources in the required jobs.

Frequent Contributor
Posts: 90

Re: Converting documents from XLSX to CSV so that they can be read by SAS DI Studio

Posted in reply to EinarRoed

Thanks for the feedback. Unfortunately I haven't been able to find a solution to this yet, though.

Currently I manually convert all the XLSX-files to CSV, then run a DI Studio job that reads them as CSV, edits the data, and writes out new CSV-files. The CSV-files are then transferred to another party that specifically needs them as CSV. So, bypassing the need to create CSV-files isn't an option in this case.

If i try to register a Microsoft Excel Library I can't get past the "Specify the server and connectivity information" part of the wizard, because there simply aren't any options for me to select there (even "New" is greyed out).

If I try using PROC IMPORT with DBMS=EXCEL, I get the message " ERROR: DBMS type EXCEL not valid for import. ".

Any further advice would be appreciated, thanks. Smiley Happy

P.S. We're running SAS 9.2 and DI Studio 4.21.

SAS Employee
Posts: 75

Re: Converting documents from XLSX to CSV so that they can be read by SAS DI Studio

Posted in reply to EinarRoed

The SAS Data Admin Guide includes this general topic about registering Excel files: SAS(R) 9.3 Intelligence Platform: Data Administration Guide, Second Edition

Does that help?

Solution
‎06-28-2017 11:25 AM
Super User
Posts: 5,441

Re: Converting documents from XLSX to CSV so that they can be read by SAS DI Studio

Posted in reply to EinarRoed

It sounds that you don't have a license for SAS/ACCESS to PC File Formats.

Execute:

PROC SETINIT;

RUN;

And you'll see in the log which SAS modules are licensed.

If you have any of the modules suggested above (ACCESS to PC File Formats, ODBC or OLEDB), go through with using these using the documentation.

Do you have EG? If yes, have you tried to do this using projects?

If not, I think you need to look for a non-SAS solution to make this happen, such as VB, .net or a third party tool.

Data never sleeps
Respected Advisor
Posts: 4,173

Re: Converting documents from XLSX to CSV so that they can be read by SAS DI Studio

Posted in reply to EinarRoed

Being on SAS 9.2 without SAS/Access to PC Files licensed Linus' suggestions sounds like a good approach to me. Google some vb script which converts xlsx to csv. You then can execute this script via a SAS "x" command, "systask" or the like (using a user written node).

Then simply create an external file which defines this .csv and the rest is normal DIS "stuff".

N/A
Posts: 1

Re: Converting documents from XLSX to CSV so that they can be read by SAS DI Studio

Posted in reply to EinarRoed

I prefer java language for converting MS Office files to other formats and to convert XSLX file to CSV i prefer using Aspose Products in Java which is a cloud API so its secure and easy to use.

🔒 This topic is solved and locked.

Need further help from the community? Please ask a new question.

Discussion stats
  • 9 replies
  • 4565 views
  • 0 likes
  • 7 in conversation