BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
MeganE
Pyrite | Level 9

Hi all,

 

I have 24 Excel worksheets i need to import into SAS and i did so using proc import.  The problem is most of them follow the same structure just different content.  And for some reason, what's numeric on one sheet is now text on another sheet and i need everything (in this section) numeric.  Or at the very least, the same across sheets.  See screenshot below.

 

I need more control on defining the incoming variables than proc import can give me (that i know about) so i tried to set it up as an infile, but didn't get very far when i realized i couldn't figure out how to specify sheet name.

 

Thanks

Megan

 

Untitled.png

1 ACCEPTED SOLUTION

Accepted Solutions
ballardw
Super User

I pretty much guessed the content and issue of your question before opening the thread. We get this exact question almost daily if not multiple times per day.

 

Proc IMPORT is a guessing routine and only examines a few rows of data to "guess" the intended content. If the first few rows of data have blanks for a variable it is character, if you have codes that look numeric it may assign numeric such that other codes that contain other characters are then missing as they aren't numeric.

 

If you can't go the VBA route then pick one file. SAVE it as CSV, this is important because there are more options when importing CSV one of which is guessing rows. Set the guessingrows value very high (using the import wizard may make this easier).

When Import finishes running there will be  data step code in the log. Copy that code to the editor and save it. Look at the informat statements and see if the variable types make sense. You will see $ for character, most of the numeric will be BEST something and perhaps date, datetime, time or such.

Examine the lengths of the character variables to see if they make sense. If you have a content description document that says the AccountId field should hold 15 characters then set the number to match. If you don't have that information then you may want to increase the size by a few characters to something that makes sense. Rerun the program to check any changes you made an look at the output. Repeat until happy with the first file.

Save all of the Excel files to CSV. Change the INFILE statement to point to each and change the DATA set name to match or make sense. Run for each of the files.

Now all of your data sets will have the same characteristics.

 

Other potential problems if your data is manually entered or modified before reading is that Excel date/time/datetime fields will export funny. You may need to 1) set the display in EXCEL on the XLSX file to a common display before exporting. If there are still problems look for things entered/treated as character and not a date in the body of the spread sheet. Correct the entry to be an Excel date/time/datetime with the correct appearance.

 

Anotherr issue is that Excel will often have "phantom" data rows at the bottom of actual data and when saved as CSV you get, sometimes many, rows of nothing but commas. You may either delete all of those with an editor or add code to delete records in the data set that are incomplete.

 

Good luck.

View solution in original post

5 REPLIES 5
RW9
Diamond | Level 26 RW9
Diamond | Level 26

This is typical Excel issues.  You are using an unstructured datasource - Excel, and compounding the issue by using a guessing procedure - proc import. 

 

First fix your datasource - save to CSV.  You can write a very simple VBA macro to do this for all sheets, if all the data is exactly the same structure, then just loop over the sheets and output all the data to one CSV file.

 

Then write a datastep import program where you specify length, format, information for each variable (you can use the log output from proc import to help start).

MeganE
Pyrite | Level 9

Thanks, but i don't know how to write a vba macro.  I dont' know what vba is.  And the data is not the same, the column headers are.

 

But i have figured out how to just import everything from excel and change datatypes in sas instead.

RW9
Diamond | Level 26 RW9
Diamond | Level 26

But you do know how to use Google?  VBA is a programming language within Office applications.  Is very simple and you can find loads of pre-done code.

 

You can of course carry on with the way your doing it, however the next time you get the file you may well have to do it all again - bad datasource + guessing procedure + patch it up code = unreliable.

MeganE
Pyrite | Level 9

I know how to google something, that's about it.

 

And even if i did take the time to find whatever that is and learn how to use it, i'd still have run that every time.  Doesnt' matter if it's before or after sas, either way it has to be manipulated.

ballardw
Super User

I pretty much guessed the content and issue of your question before opening the thread. We get this exact question almost daily if not multiple times per day.

 

Proc IMPORT is a guessing routine and only examines a few rows of data to "guess" the intended content. If the first few rows of data have blanks for a variable it is character, if you have codes that look numeric it may assign numeric such that other codes that contain other characters are then missing as they aren't numeric.

 

If you can't go the VBA route then pick one file. SAVE it as CSV, this is important because there are more options when importing CSV one of which is guessing rows. Set the guessingrows value very high (using the import wizard may make this easier).

When Import finishes running there will be  data step code in the log. Copy that code to the editor and save it. Look at the informat statements and see if the variable types make sense. You will see $ for character, most of the numeric will be BEST something and perhaps date, datetime, time or such.

Examine the lengths of the character variables to see if they make sense. If you have a content description document that says the AccountId field should hold 15 characters then set the number to match. If you don't have that information then you may want to increase the size by a few characters to something that makes sense. Rerun the program to check any changes you made an look at the output. Repeat until happy with the first file.

Save all of the Excel files to CSV. Change the INFILE statement to point to each and change the DATA set name to match or make sense. Run for each of the files.

Now all of your data sets will have the same characteristics.

 

Other potential problems if your data is manually entered or modified before reading is that Excel date/time/datetime fields will export funny. You may need to 1) set the display in EXCEL on the XLSX file to a common display before exporting. If there are still problems look for things entered/treated as character and not a date in the body of the spread sheet. Correct the entry to be an Excel date/time/datetime with the correct appearance.

 

Anotherr issue is that Excel will often have "phantom" data rows at the bottom of actual data and when saved as CSV you get, sometimes many, rows of nothing but commas. You may either delete all of those with an editor or add code to delete records in the data set that are incomplete.

 

Good luck.

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 5 replies
  • 1268 views
  • 1 like
  • 3 in conversation