BookmarkSubscribeRSS Feed
marksanter
Fluorite | Level 6

Good afternoon,

 

I am currently working with several hundred excel spreadsheets in a folder. They currently have the variables Height and Width. The IDs and Time variables are in the excel file names. but not in the spreadsheets themselves. I am looking to import all of the files into one dataset with the variables ID, Time, Height, and Width. How can I go about importing them into one dataset with the headings, and using the file names to create variables? The first four characters of the file name would create the ID variable and the fifth would create the time variable. I have attached example datafiles and the final dataset I am hoping to achieve. 

 

I am looking for guidance on how to achieve this.

 

Thank you for your help.

 

Mark 

3 REPLIES 3
Reeza
Super User

This can start you off:

https://documentation.sas.com/?cdcId=pgmsascdc&cdcVersion=9.4_3.5&docsetId=mcrolref&docsetTarget=n0c...

 

This part should change:

proc import datafile="&dir\%qsysfunc(dread(&did,&i))" out=dsn&cnt 
             dbms=csv replace;            
          run;      

To this, which add the file name into the file. 

proc import datafile="&dir\%qsysfunc(dread(&did,&i))" out=dsn&cnt 
             dbms=XLSX replace;            
          run;      

data dsn&cnt;
set dsn&cnt;

*creates file name;
sourceFile = "%qsysfunc(dread(&did,&i))";

*insert rest of functions from yesterdays post;

run;

 

EDIT: I think you also need to change it slightly to check for XLSX not CSV files but that should be fairly straightforward.

 

Then once the macro completes, this will append all the files together (any data set named with the prefix DSN)

data bigFile;
set dsn:;
run;

 

However, if your Excel files are not formatted consistently one common issue is that it will not read the types correctly. In that case you'll get errors about a mismatch of types (some are character, some are numeric). If that happens you'll need to either rejig your process to ensure your data types are correct for each file or convert them all to CSV and import those which do allow you to specify the types as you import the data.

 

This code converts all XLSX files in a folder to CSV

https://gist.github.com/statgeek/878e585102c14e01581f55dbe972d27e

 


@marksanter wrote:

Good afternoon,

 

I am currently working with several hundred excel spreadsheets in a folder. They currently have the variables Height and Width. The IDs and Time variables are in the excel file names. but not in the spreadsheets themselves. I am looking to import all of the files into one dataset with the variables ID, Time, Height, and Width. How can I go about importing them into one dataset with the headings, and using the file names to create variables? The first four characters of the file name would create the ID variable and the fifth would create the time variable. I have attached example datafiles and the final dataset I am hoping to achieve. 

 

I am looking for guidance on how to achieve this.

 

Thank you for your help.

 

Mark 


 

marksanter
Fluorite | Level 6

Hello, 

 

Thank you for your help with this. I have tried to run the code and it is telling me that Work.dsn does not exist. Everything appears to be okay though. Any thoughts?

 

Mark

Kurt_Bremser
Super User

If those files were stored in csv or another text format, you would need only one data step:

data final;
length
  fname f_name $200
  id $4
  time $1
;
infile "/path/*.csv" filename=fname dlm="," dsd truncover;
f_name = scan(fname,-1,"/"); /* extract filename from complete path */
id = substr(f_name,1,4);
time = substr(f_name,5,1);
input @;
if index(upcase(_infile_),'HEIGHT') = 0; /* discard header */
input height width;
drop f_name;
run;

Untested.

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 3 replies
  • 8415 views
  • 0 likes
  • 3 in conversation