BookmarkSubscribeRSS Feed
Amber_Nicole94
Obsidian | Level 7

Using SAS Enterprise Guide

 

I have 22 excel data sets that all include the same variables in the same order. These datasets are broken down 2016 weeks 1-5, 2016 weeks 6-10, 2016 weeks 11-15, etc..... I am using "Import Data" in SAS Enterprise Guide to import all 22 files, and then would like to append all 22 so I have one long dataset. However, when importing the files some of the variables by default have different formats. In dataset 1 , "Volume" has [TYPE=NUMBER, SOURCE INFORMAT=BEST12, LENGTH=8] however in dataset 2, "Volume" has [TYPE=NUMBER, SOURCE INFORMAT=BEST12, LENGTH=10]. Since the variable volume has different formats, I am unable to append the datasets.

 

This is a simplified example as I have approx. 90 variables in each of the 22 datasets. Around 20 or so of the variables are having the issue of different formats.

 

Is there any way to have the datasets all upload with the same formats? The only solution I know would be to individually change the formats in the "Determine Field Attributes" during the upload process which would be very time consuming.

 

 

8 REPLIES 8
Kurt_Bremser
Super User

Save the data from Excel to a format that gives you more control. With Excel files, SAS import has to make guesses about the structure (keep in mind that Excel has no column-oriented structure at all), and that leads to different results for different files.

If you export to CSV, you can read the data with a data step that will be the same for all files (therefore the datasets will all be similar), and appending will be a breeze.

 

Bottom line: do not use the native Excel file format for data interchange.

Amber_Nicole94
Obsidian | Level 7

My excel files are coming from my company data base website. They are downloaded each Monday morning with the new weeks data at 3am. The scheduled exports only option is to export as an excel file and does not have a CSV option.

 

How would I import all files with one step?

Kurt_Bremser
Super User

@Amber_Nicole94 wrote:

My excel files are coming from my company data base website. They are downloaded each Monday morning with the new weeks data at 3am. The scheduled exports only option is to export as an excel file and does not have a CSV option.

 

How would I import all files with one step?


You can't. Each Excel file needs to be handled individually.

 

I repeat: the Excel format is utterly unusable for data interchange. Use something sensible that is text-based.

 

If some brain-dead moron insists on being stupid, you'll have to work through correcting data after every import to align the datasets. Or you could try scripting a "open/save as csv" process with Excel via a VBA script. Or use some better software (OpenOffice eg) that can do such conversions via commandline.

In all cases, charge your extra time. Maybe dollars will make them realize their stupidity.

ballardw
Super User

@Kurt_Bremser wrote:

@Amber_Nicole94 wrote:

My excel files are coming from my company data base website. They are downloaded each Monday morning with the new weeks data at 3am. The scheduled exports only option is to export as an excel file and does not have a CSV option.

 

How would I import all files with one step?


You can't. Each Excel file needs to be handled individually.

 

I repeat: the Excel format is utterly unusable for data interchange. Use something sensible that is text-based.

 

If some brain-dead moron insists on being stupid, you'll have to work through correcting data after every import to align the datasets. Or you could try scripting a "open/save as csv" process with Excel via a VBA script. Or use some better software (OpenOffice eg) that can do such conversions via commandline.

In all cases, charge your extra time. Maybe dollars will make them realize their stupidity.


One thing I would add to @Kurt_Bremser's approach: Try opening the file with a text editor such as the SAS program editor. There is a chance, which may be small, that the database is actually exporting CSV but the program names the file with an xls or xlsx extension. If that works then treat them as CSV files with poor names.

RW9
Diamond | Level 26 RW9
Diamond | Level 26

If you have a company database, why not directly access that database and get your data in a usable format?  What you are doing now is taking nice structured data, dumping it out of the nice structured format into Excel which does its own thing, then trying to import that back to a nice structured format.  Skip the whole step and go direct to the database using ODBC or something like that and SQL drag out the data you require.

Reeza
Super User

You have two options here that I see. 

 

1. Import the Excel file as is and develop (manual) program that will check each variable type and if not then it converts it to the type and format desired. 

2. Convert the Excel files to CSV - using a VBS script, see link below, and then use a standard data import to import all at once. You could also automate this entire procedure if so inclined, to download the file, convert and then import. 

http://support.sas.com/kb/43/496.html

 

Personally, I'd go with Option 2 because I'm lazy...but you can do Option 1. It'll likely end up being a macro and you can use VTYPE to check a variables type. If you do decide to go with Option 1, I would recommend first creating a file with the master variable list and types and use that to drive your program so it's still automated in a fashion. 

Shmuel
Garnet | Level 18

Use proc import with one of the datasets.

Copy the generated code from the log to the program editor.

Change types as prffered.

 

You have not mentioned what are the excel filenames, but assuming tha all of them

have the same format, same prefix <path-prefix-sufix.xlsx> you can wrap the code

by a macro and rerun supplying only the suffix each time:

%macro import_xl(suffix);
    ... enter the code here using  ...
    "<path-prefix>&suffix..xlsx>"
    ...
;
proc append data=<imported&suffix) base=... force; run;
%mend;
%import_xl(<suffix1>);
%import_xl(<suffix2>);
...
%import_xl(suffix22>);

data want;
  set <imported1> .... <imported22>;
run;

 

Kurt_Bremser
Super User

AFAIK, proc import does not create a data step for Excel files, as they are not text files. So there is nothing to copy from the log.

One might find a data step when using the EG import wizard, as EG converts to text on its own and lets SAS read the resulting text file.

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 8 replies
  • 1360 views
  • 4 likes
  • 6 in conversation