DATA Step, Macro, Functions and more

Importing Excel into SAS

Accepted Solution Solved
Reply
Frequent Contributor
Posts: 128
Accepted Solution

Importing Excel into SAS

Hi all,

 

I have 24 Excel worksheets i need to import into SAS and i did so using proc import.  The problem is most of them follow the same structure just different content.  And for some reason, what's numeric on one sheet is now text on another sheet and i need everything (in this section) numeric.  Or at the very least, the same across sheets.  See screenshot below.

 

I need more control on defining the incoming variables than proc import can give me (that i know about) so i tried to set it up as an infile, but didn't get very far when i realized i couldn't figure out how to specify sheet name.

 

Thanks

Megan

 

Untitled.png


Accepted Solutions
Solution
‎06-01-2017 11:21 AM
Super User
Posts: 10,459

Re: Importing Excel into SAS

I pretty much guessed the content and issue of your question before opening the thread. We get this exact question almost daily if not multiple times per day.

 

Proc IMPORT is a guessing routine and only examines a few rows of data to "guess" the intended content. If the first few rows of data have blanks for a variable it is character, if you have codes that look numeric it may assign numeric such that other codes that contain other characters are then missing as they aren't numeric.

 

If you can't go the VBA route then pick one file. SAVE it as CSV, this is important because there are more options when importing CSV one of which is guessing rows. Set the guessingrows value very high (using the import wizard may make this easier).

When Import finishes running there will be  data step code in the log. Copy that code to the editor and save it. Look at the informat statements and see if the variable types make sense. You will see $ for character, most of the numeric will be BEST something and perhaps date, datetime, time or such.

Examine the lengths of the character variables to see if they make sense. If you have a content description document that says the AccountId field should hold 15 characters then set the number to match. If you don't have that information then you may want to increase the size by a few characters to something that makes sense. Rerun the program to check any changes you made an look at the output. Repeat until happy with the first file.

Save all of the Excel files to CSV. Change the INFILE statement to point to each and change the DATA set name to match or make sense. Run for each of the files.

Now all of your data sets will have the same characteristics.

 

Other potential problems if your data is manually entered or modified before reading is that Excel date/time/datetime fields will export funny. You may need to 1) set the display in EXCEL on the XLSX file to a common display before exporting. If there are still problems look for things entered/treated as character and not a date in the body of the spread sheet. Correct the entry to be an Excel date/time/datetime with the correct appearance.

 

Anotherr issue is that Excel will often have "phantom" data rows at the bottom of actual data and when saved as CSV you get, sometimes many, rows of nothing but commas. You may either delete all of those with an editor or add code to delete records in the data set that are incomplete.

 

Good luck.

View solution in original post


All Replies
Super User
Super User
Posts: 7,392

Re: Importing Excel into SAS

This is typical Excel issues.  You are using an unstructured datasource - Excel, and compounding the issue by using a guessing procedure - proc import. 

 

First fix your datasource - save to CSV.  You can write a very simple VBA macro to do this for all sheets, if all the data is exactly the same structure, then just loop over the sheets and output all the data to one CSV file.

 

Then write a datastep import program where you specify length, format, information for each variable (you can use the log output from proc import to help start).

Frequent Contributor
Posts: 128

Re: Importing Excel into SAS

Thanks, but i don't know how to write a vba macro.  I dont' know what vba is.  And the data is not the same, the column headers are.

 

But i have figured out how to just import everything from excel and change datatypes in sas instead.

Super User
Super User
Posts: 7,392

Re: Importing Excel into SAS

But you do know how to use Google?  VBA is a programming language within Office applications.  Is very simple and you can find loads of pre-done code.

 

You can of course carry on with the way your doing it, however the next time you get the file you may well have to do it all again - bad datasource + guessing procedure + patch it up code = unreliable.

Frequent Contributor
Posts: 128

Re: Importing Excel into SAS

I know how to google something, that's about it.

 

And even if i did take the time to find whatever that is and learn how to use it, i'd still have run that every time.  Doesnt' matter if it's before or after sas, either way it has to be manipulated.

Solution
‎06-01-2017 11:21 AM
Super User
Posts: 10,459

Re: Importing Excel into SAS

I pretty much guessed the content and issue of your question before opening the thread. We get this exact question almost daily if not multiple times per day.

 

Proc IMPORT is a guessing routine and only examines a few rows of data to "guess" the intended content. If the first few rows of data have blanks for a variable it is character, if you have codes that look numeric it may assign numeric such that other codes that contain other characters are then missing as they aren't numeric.

 

If you can't go the VBA route then pick one file. SAVE it as CSV, this is important because there are more options when importing CSV one of which is guessing rows. Set the guessingrows value very high (using the import wizard may make this easier).

When Import finishes running there will be  data step code in the log. Copy that code to the editor and save it. Look at the informat statements and see if the variable types make sense. You will see $ for character, most of the numeric will be BEST something and perhaps date, datetime, time or such.

Examine the lengths of the character variables to see if they make sense. If you have a content description document that says the AccountId field should hold 15 characters then set the number to match. If you don't have that information then you may want to increase the size by a few characters to something that makes sense. Rerun the program to check any changes you made an look at the output. Repeat until happy with the first file.

Save all of the Excel files to CSV. Change the INFILE statement to point to each and change the DATA set name to match or make sense. Run for each of the files.

Now all of your data sets will have the same characteristics.

 

Other potential problems if your data is manually entered or modified before reading is that Excel date/time/datetime fields will export funny. You may need to 1) set the display in EXCEL on the XLSX file to a common display before exporting. If there are still problems look for things entered/treated as character and not a date in the body of the spread sheet. Correct the entry to be an Excel date/time/datetime with the correct appearance.

 

Anotherr issue is that Excel will often have "phantom" data rows at the bottom of actual data and when saved as CSV you get, sometimes many, rows of nothing but commas. You may either delete all of those with an editor or add code to delete records in the data set that are incomplete.

 

Good luck.

☑ This topic is SOLVED.

Need further help from the community? Please ask a new question.

Discussion stats
  • 5 replies
  • 147 views
  • 1 like
  • 3 in conversation