DATA Step, Macro, Functions and more

Importing Data Sets with Same Format

Reply
Contributor
Posts: 28

Importing Data Sets with Same Format

Using SAS Enterprise Guide

 

I have 22 excel data sets that all include the same variables in the same order. These datasets are broken down 2016 weeks 1-5, 2016 weeks 6-10, 2016 weeks 11-15, etc..... I am using "Import Data" in SAS Enterprise Guide to import all 22 files, and then would like to append all 22 so I have one long dataset. However, when importing the files some of the variables by default have different formats. In dataset 1 , "Volume" has [TYPE=NUMBER, SOURCE INFORMAT=BEST12, LENGTH=8] however in dataset 2, "Volume" has [TYPE=NUMBER, SOURCE INFORMAT=BEST12, LENGTH=10]. Since the variable volume has different formats, I am unable to append the datasets.

 

This is a simplified example as I have approx. 90 variables in each of the 22 datasets. Around 20 or so of the variables are having the issue of different formats.

 

Is there any way to have the datasets all upload with the same formats? The only solution I know would be to individually change the formats in the "Determine Field Attributes" during the upload process which would be very time consuming.

 

 

Super User
Posts: 9,610

Re: Importing Data Sets with Same Format

Posted in reply to Amber_Nicole94

Save the data from Excel to a format that gives you more control. With Excel files, SAS import has to make guesses about the structure (keep in mind that Excel has no column-oriented structure at all), and that leads to different results for different files.

If you export to CSV, you can read the data with a data step that will be the same for all files (therefore the datasets will all be similar), and appending will be a breeze.

 

Bottom line: do not use the native Excel file format for data interchange.

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
How to post code
Contributor
Posts: 28

Re: Importing Data Sets with Same Format

Posted in reply to KurtBremser

My excel files are coming from my company data base website. They are downloaded each Monday morning with the new weeks data at 3am. The scheduled exports only option is to export as an excel file and does not have a CSV option.

 

How would I import all files with one step?

Super User
Posts: 9,610

Re: Importing Data Sets with Same Format

Posted in reply to Amber_Nicole94

Amber_Nicole94 wrote:

My excel files are coming from my company data base website. They are downloaded each Monday morning with the new weeks data at 3am. The scheduled exports only option is to export as an excel file and does not have a CSV option.

 

How would I import all files with one step?


You can't. Each Excel file needs to be handled individually.

 

I repeat: the Excel format is utterly unusable for data interchange. Use something sensible that is text-based.

 

If some brain-dead moron insists on being stupid, you'll have to work through correcting data after every import to align the datasets. Or you could try scripting a "open/save as csv" process with Excel via a VBA script. Or use some better software (OpenOffice eg) that can do such conversions via commandline.

In all cases, charge your extra time. Maybe dollars will make them realize their stupidity.

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
How to post code
Super User
Posts: 13,073

Re: Importing Data Sets with Same Format

Posted in reply to KurtBremser

KurtBremser wrote:

Amber_Nicole94 wrote:

My excel files are coming from my company data base website. They are downloaded each Monday morning with the new weeks data at 3am. The scheduled exports only option is to export as an excel file and does not have a CSV option.

 

How would I import all files with one step?


You can't. Each Excel file needs to be handled individually.

 

I repeat: the Excel format is utterly unusable for data interchange. Use something sensible that is text-based.

 

If some brain-dead moron insists on being stupid, you'll have to work through correcting data after every import to align the datasets. Or you could try scripting a "open/save as csv" process with Excel via a VBA script. Or use some better software (OpenOffice eg) that can do such conversions via commandline.

In all cases, charge your extra time. Maybe dollars will make them realize their stupidity.


One thing I would add to @KurtBremser's approach: Try opening the file with a text editor such as the SAS program editor. There is a chance, which may be small, that the database is actually exporting CSV but the program names the file with an xls or xlsx extension. If that works then treat them as CSV files with poor names.

Super User
Super User
Posts: 9,227

Re: Importing Data Sets with Same Format

Posted in reply to Amber_Nicole94

If you have a company database, why not directly access that database and get your data in a usable format?  What you are doing now is taking nice structured data, dumping it out of the nice structured format into Excel which does its own thing, then trying to import that back to a nice structured format.  Skip the whole step and go direct to the database using ODBC or something like that and SQL drag out the data you require.

Super User
Posts: 22,865

Re: Importing Data Sets with Same Format

Posted in reply to Amber_Nicole94

You have two options here that I see. 

 

1. Import the Excel file as is and develop (manual) program that will check each variable type and if not then it converts it to the type and format desired. 

2. Convert the Excel files to CSV - using a VBS script, see link below, and then use a standard data import to import all at once. You could also automate this entire procedure if so inclined, to download the file, convert and then import. 

http://support.sas.com/kb/43/496.html

 

Personally, I'd go with Option 2 because I'm lazy...but you can do Option 1. It'll likely end up being a macro and you can use VTYPE to check a variables type. If you do decide to go with Option 1, I would recommend first creating a file with the master variable list and types and use that to drive your program so it's still automated in a fashion. 

Trusted Advisor
Posts: 1,826

Re: Importing Data Sets with Same Format

Posted in reply to Amber_Nicole94

Use proc import with one of the datasets.

Copy the generated code from the log to the program editor.

Change types as prffered.

 

You have not mentioned what are the excel filenames, but assuming tha all of them

have the same format, same prefix <path-prefix-sufix.xlsx> you can wrap the code

by a macro and rerun supplying only the suffix each time:

%macro import_xl(suffix);
    ... enter the code here using  ...
    "<path-prefix>&suffix..xlsx>"
    ...
;
proc append data=<imported&suffix) base=... force; run;
%mend;
%import_xl(<suffix1>);
%import_xl(<suffix2>);
...
%import_xl(suffix22>);

data want;
  set <imported1> .... <imported22>;
run;

 

Super User
Posts: 9,610

Re: Importing Data Sets with Same Format

AFAIK, proc import does not create a data step for Excel files, as they are not text files. So there is nothing to copy from the log.

One might find a data step when using the EG import wizard, as EG converts to text on its own and lets SAS read the resulting text file.

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
How to post code
Ask a Question
Discussion stats
  • 8 replies
  • 102 views
  • 4 likes
  • 6 in conversation