One of the critical tasks in SAS is to take data from an outside source and bring it into the environment for analysis. One of the most ubiquitous sources for data is, of course, Excel, so I decided to use it as the example for this “How to Import” article.
First and foremost, the Excel file must be in XLS (97 – 2004 file format) – the new XLSX extension won’t import with the automatically generated SAS code. In the event that your spreadsheet is in XLSX format, you could change the DBMS= to XLXS and change the extension on the file to XLSX and it works fine.
Secondly, please read the documentation for the virtual machine / operating system – setting up the folder will literally “make or break” your ability to import data.
One of the features I’m extremely fond of in SAS University Edition are the “Snippets” (see https://communities.sas.com/docs/DOC-10596 for more information). Using the IMPORT XLS snippet, and putting in the path to my file, this is what I end up with.
Easy and straightforward, and for a regular Excel file without anything unusual, this will work.
Here are the results as outputted by SAS:
But we all know that a data file is rarely perfect. I’m going to show you a couple of tricks to get around some oddball stuff I’ve encountered.
Here’s a screenshot of the data (modified from the previous example’s data):
Note that I’ve added line of text on the first row. When I import this using the previous code, SAS has done what I asked – imported the complete dataset and printed it out. It used the text from the first cell as the name of the first variable, and it generated names for the other rows, since the corresponding cells were blank in row 1. Obviously this won’t work, and I need to figure out a way to import just the data.
Using the DATAROW statement, which allows me to specify what row the data starts on, seems promising.
OK, I’m getting really close – but still not quite. I need some way to have SAS Studio assign valid names to the variables, instead of using the column headings from the worksheet.
Here’s the code, using the GETNAMES=NO statement.
That does it – the data is imported, and we can now get down to the business of our analysis!