I want to transfer a table from Excel to SAS (my SAS version is 9.2 and the Excel file is a macro-enabled XLSM workbook). The column names are in row 3, starting at cell B3, and the data starts at cell B4, like below:
     A    B      C      D    E    F    G   ...
1
2
3         Col1   Col2
4         15     20
5         16     21
6         ...    ...
The problem is that the last row number is unknown, because the table can be 200 rows long today and 350 rows long tomorrow. So how can I import this table from Excel (XLSM) into a SAS table?
I read somewhere that we can use DATAROW in PROC IMPORT when DBMS=EXCEL, like below:
proc import datafile = "!datafile" out=Table1 DBMS = EXCEL REPLACE;
SHEET = "Sheet1"; GETNAMES=YES; MIXED=YES; USEDATE=YES; SCANTIME=YES; NAMEROW=3; DATAROW=4;
run;
However, SAS does not recognize the DATAROW option and gives the error "ERROR 180-322: Statement is not valid or it is used out of proper order." There is another way of importing a table from Excel, like:
PROC SQL;
CONNECT TO EXCEL (PATH='C:\\thepath\excelfile.xlsm');
Create Table Table1 as SELECT * FROM CONNECTION TO EXCEL
(SELECT * FROM [Sheet1$]);
DISCONNECT FROM EXCEL;
QUIT;
So, does anyone know how to import a table with an unknown number of rows from XLSM into SAS? Thanks in advance...
To be honest, going from Excel (an unstructured mess) to SAS (a structured format) is going to be difficult. You could try specifying the range as A:Z and dropping blank records. However, do the columns all have the same format in each observation? And why can you not just remove those two records from the start? If it were me, I would write a small VBA macro that runs over the range, i.e. A4 extended with xlToRight/xlDown, and loops over the data, writing out a CSV file.
http://stackoverflow.com/questions/22618513/vba-looping-through-range-writing-to-csv
Then you can simply import the CSV into SAS using a DATA step with INFILE.
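A minimal sketch of that last step, assuming the VBA macro wrote a comma-separated file with the column names on the first line and two numeric columns. The file name table1.csv and its location are hypothetical, not something from the thread:

```sas
/* Sketch only: read a CSV exported from the workbook.            */
/* The path and file name below are hypothetical placeholders.    */
data Table1;
    infile "C:\thepath\table1.csv"
           dsd            /* comma-delimited, honours quoted values  */
           dlm=','
           firstobs=2     /* skip the header line                    */
           truncover;     /* do not jump to the next line on short rows */
    input Col1 Col2;      /* assumes both columns are numeric        */
run;
```

Because the DATA step reads until end of file, the number of rows does not need to be known in advance.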
First of all, save the file in a suitable file transfer format (which XLSM is NOT), i.e. CSV.
The DATAROW statement is valid only for delimited files, so it cannot be used with crappy formats like the XLSX family.
(This is explicitly explained in the PROC IMPORT documentation.)
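For comparison, here is roughly what PROC IMPORT with DATAROW could look like once the sheet has been saved as CSV. This is a sketch under assumptions: the CSV file name is hypothetical, and the export is assumed to start with the header row, so that names are on line 1 and data on line 2:

```sas
/* Sketch only: DATAROW is accepted here because DBMS=CSV is a   */
/* delimited format. The path below is a hypothetical placeholder. */
proc import datafile="C:\thepath\table1.csv"
            out=Table1
            dbms=csv
            replace;
    getnames=yes;   /* column names are taken from the first line */
    datarow=2;      /* data begins on the second line             */
run;
```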
I found an "inefficient" alternative solution that reads all possible rows in Excel (up to row 50,000) and, at the same time, checks whether each row has a value in the column Col1.
It takes 7-8 seconds, and it works. But as I wrote, it feels inefficient to read the whole 50,000 rows.
PROC SQL;
CONNECT TO EXCEL (PATH='C:\\thepath\excelfile.xlsm');
Create Table Table1 as SELECT * FROM CONNECTION TO EXCEL
(SELECT * FROM [Sheet1$B3:C50000] WHERE Col1 IS NOT NULL);
DISCONNECT FROM EXCEL;
QUIT;
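The same idea (over-read a generous range, then drop the blank rows) might also be written with PROC IMPORT and its RANGE statement, followed by a filter on the SAS side. This is only a sketch: it assumes the EXCEL engine is available for the XLSM file as in the pass-through query above, it reuses the path and the 50000-row upper bound from that query, and Table1_raw is a hypothetical intermediate name:

```sas
/* Sketch only: import a fixed, generous range starting at the header row. */
proc import datafile="C:\\thepath\excelfile.xlsm"
            out=Table1_raw      /* hypothetical intermediate table */
            dbms=excel
            replace;
    range="Sheet1$B3:C50000";   /* header in row 3; 50000 is a guessed upper bound */
    getnames=yes;
run;

/* Keep only the rows where Col1 actually has a value. */
data Table1;
    set Table1_raw;
    if not missing(Col1);
run;
```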