Hello,
I'm using PROC IMPORT to read 100 xls files:
PROC IMPORT OUT = Proc_Import_Data
datafile = "&Path.Excel_1.xls"
dbms = XLS replace;
sheet = "Sheet1";
getnames = yes;
GUESSINGROWS=1000;
RUN;
In the Excel spreadsheet, however, there is a column with a time stamp that is not read correctly.
E.g.:
Excel | SAS |
11:00 | 1029083 |
15:25 | 1029063 |
13:26 | 1028989 |
Is there a solution to this problem?
If you have XLSX, and 9.4, then use libname xslx:
https://blogs.sas.com/content/sasdummy/2015/05/20/using-libname-xlsx-to-read-and-write-excel-files/
With that you can proc copy all sheets from it. Simpler. Its still not an ideal method however. XSLX is still a poor format, and you are still relying on guessing procedures to get the data. You need to ask some questions, to start with how similar is the data. If its all the same, then its good, you can create one output file and one import program. You can export the data pretty simply from Excel using a vba macro to loop over files and sheets, and write out data to csv - one big datafile, then a datastep to import.
https://trumpexcel.com/list-of-file-names-from-a-folder-in-excel/
https://www.extendoffice.com/documents/excel/2980-excel-save-export-each-sheet-as-csv.html
From a simple search.
That would make your life simple.
If the data is all different, what exactly are you aiming to do with this? It may be that you need processing, pre and post import to even work with it.
Yes, you probably wont like it though.
1) Don't use XLS - it is decades out of support, MS moved to Open Office format back in 2003, so a really bad binary file format shouldn't used in this day and age.
2) Don't use proc import, it is a guessing procedure, it looks at the data and tried to guess what it should do with it, considering Excel has no fixed typing and is unstructured, when guessing you are very likely to get garbage data out at the end.
The solution, save all the xls files to a nice, open, portable file format like CSV, then write one datastep import program which will read all the csv files (assuming they are the same or similar) using the formats and lengths that you - the person who knows the data - tells it to read them as. Avoids the whole guessing part. You could try fixing the cell you have issues with (maybe it has fractions of a second or something), but you will definately have further issues continuing in such a way.
Thank you for your answer. The problem is that there are multiple sheets in each Excel file. So I would have to generate 100 * 8 CSV files.
The data is actually in XLSX. I just saved it as XLS so I can use the GUESSINGROWS option.
If you continue to wade knee-deep in the excrements, get used to the stink.
As long as you use the Excel ****, you have to live with the consequences. There is a reason why all long-timers here view Excel files as unusable you-know-what.
If you have XLSX, and 9.4, then use libname xslx:
https://blogs.sas.com/content/sasdummy/2015/05/20/using-libname-xlsx-to-read-and-write-excel-files/
With that you can proc copy all sheets from it. Simpler. Its still not an ideal method however. XSLX is still a poor format, and you are still relying on guessing procedures to get the data. You need to ask some questions, to start with how similar is the data. If its all the same, then its good, you can create one output file and one import program. You can export the data pretty simply from Excel using a vba macro to loop over files and sheets, and write out data to csv - one big datafile, then a datastep to import.
https://trumpexcel.com/list-of-file-names-from-a-folder-in-excel/
https://www.extendoffice.com/documents/excel/2980-excel-save-export-each-sheet-as-csv.html
From a simple search.
That would make your life simple.
If the data is all different, what exactly are you aiming to do with this? It may be that you need processing, pre and post import to even work with it.
Thanks.
I used libname before but then I got problems with missing values. Some columns only have a date after the 8 line and only missings before. These columns are then read in as a string and not a date.
Then I thought that I found the solution with dbsastype (= 'DATE'). But unfortunately it does not work on the date (dd.mm.yyyy).
So I'll probably have to take the path over a vba macro.
Its hard to say without seeing the file. It sounds like you have a typical mess in an Excel file and are trying to shoehorn it into a proper structured dataset. I.e. you mention it starts on row 8. A pre-process step to get it into a more structured format may well help you in other aspects as well. You are getting to see why I really don't like Excel in any way as a data storage, even as an output format its not high on my list.
Does the entire column in the Excel file use the same datatype? If not then SAS will have to guess which type to use. If you have mixed character and numeric values then SAS will create a character variable and date/time values will appear as digit strings that represent the underlying number that Excel uses to store that value. I have seen it do this for date values and assume it will do the same thing when the numbers represent time values instead.
Is your column in SAS numeric or character?
If it is character if you change the display format attached to the values in the Excel file to general do you see the same digit strings that you see in the SAS dataset?
Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.
Need to connect to databases in SAS Viya? SAS’ David Ghan shows you two methods – via SAS/ACCESS LIBNAME and SAS Data Connector SASLIBS – in this video.
Find more tutorials on the SAS Users YouTube channel.