For the purpose of having control over how my data is imported (specifying formats and data types), I have typically used the following form:
data DSN;
%let _EFIERR_=0;
infile "&inputpath.\&filename..csv" delimiter=',' MISSOVER DSD lrecl=32767 firstobs=2;
informat &infmtstatement.;
format &fmtstatement.;
input &inputstatement.;
label &labelstatement.;
if _ERROR_ then call symputx('_EFIERR_',1);
run;
Recently, I have started work at a new company that strictly uses PROC IMPORT (see below). This has worked great for me until recently. I have noticed that sometimes the data is imported in different variable types and formats than I want.
PROC IMPORT OUT=DSN DATAFILE= "&inputpath\&filename..xlsx" DBMS=EXCELCS REPLACE;
RANGE="Check 2 Visit Windows$";
SCANTEXT=YES;
USEDATE=YES;
SCANTIME=YES;
RUN;
My old method works great for importing CSV files when I want to specify data types and formats. My new method works great if I want to read in XLSX files with multiple worksheets. Is there a way I can combine the two methods such that I can read in XLSX files with multiple worksheets and still specify the variable types and formats?
Only if you use the EXCELCS database engine. With that database engine you can use the DBDSOPTS setting to pass in settings for the DBTYPE option.
DBDSOPTS= "DBTYPE=(BilledTo='CHAR(8)')";
BUT to use that database engie either your SAS is running on Windows and you have a compatible version of Excel that it can call or you are using a SAS PCFILES server that is running compatilble version of Excel.
Normally it is not a big issue with XLSX files since SAS can read the metadata on the cells an compute the data type. Unlike a CSV file where the only available metadata is the column heading. But if you are trying to read multiple fies that SHOULD have similar structure you might still get some characters variables that have different lengths defined based on the data that is in the sheet. In that case you might want to add a step after your PROC IMPORT to impose your standard structure. It could look similar to your data step code you posted. I would add step to REMOVE formats that PROC IMPORT might have attached to variables and then just add back on the formats for the variables, like dates and times, that need them.
PROC IMPORT OUT=DSN_temp DATAFILE= "&inputpath\&filename..xlsx" DBMS=XLSX REPLACE;
RANGE="Check 2 Visit Windows$";
RUN;
%let varlist=Id 8 Name $50 Age 8 DOB 8 ;
%let formatlist= dob date9.;
data dsn;
length &varlist ;
set dsn_temp;
format _all_;
format &formatlist;
run;
At this web page, scroll down to the DBDSOPTS option
An example is given where a column is forced to be char(8), or character variable with length 8.
Only if you use the EXCELCS database engine. With that database engine you can use the DBDSOPTS setting to pass in settings for the DBTYPE option.
DBDSOPTS= "DBTYPE=(BilledTo='CHAR(8)')";
BUT to use that database engie either your SAS is running on Windows and you have a compatible version of Excel that it can call or you are using a SAS PCFILES server that is running compatilble version of Excel.
Normally it is not a big issue with XLSX files since SAS can read the metadata on the cells an compute the data type. Unlike a CSV file where the only available metadata is the column heading. But if you are trying to read multiple fies that SHOULD have similar structure you might still get some characters variables that have different lengths defined based on the data that is in the sheet. In that case you might want to add a step after your PROC IMPORT to impose your standard structure. It could look similar to your data step code you posted. I would add step to REMOVE formats that PROC IMPORT might have attached to variables and then just add back on the formats for the variables, like dates and times, that need them.
PROC IMPORT OUT=DSN_temp DATAFILE= "&inputpath\&filename..xlsx" DBMS=XLSX REPLACE;
RANGE="Check 2 Visit Windows$";
RUN;
%let varlist=Id 8 Name $50 Age 8 DOB 8 ;
%let formatlist= dob date9.;
data dsn;
length &varlist ;
set dsn_temp;
format _all_;
format &formatlist;
run;
@Tom wrote:
Only if you use the EXCELCS database engine.
The page I linked to disagrees with you. It says it works with EXCEL database engine, and my own personal experience confirms that it works with EXCEL.
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.