BookmarkSubscribeRSS Feed
K_S
Obsidian | Level 7 K_S
Obsidian | Level 7

I have excel files with single observations (one row + the row with the column titles) come in. The excels all have the same number of columns and variable names - i.e. exact same structure.

Each excel's file name is an ID code.

After importing said excels,  I will stack them and build a database. The problem is that the ID code is not contained within the excel itself...it only appears in the file name.

I would like to extract the ID code from the file title and add it as a variable entry.

 

Let say I have an excel that is called 20210123.  I would like to grab this number and enter it into the database as the IDCode variable.

 

How would you go about coding this?

5 REPLIES 5
Reeza
Super User

1. Get list of excel files

Now for each file:

2. Import Excel file

3. Add column to Excel file with file name 

4. Append to master table

 

Or

 

1. Get list of Excel files

2. Import each Excel file and add name (2/3)

3. Append all data sets together. 

 

First two macros here can get you started.

https://communities.sas.com/t5/SAS-Communities-Library/SAS-9-4-Macro-Language-Reference-Has-a-New-Ap...

 

 

@K_S wrote:

I have excel files with single observations (one row + the row with the column titles) come in. The excels all have the same number of columns and variable names - i.e. exact same structure.

Each excel's file name is an ID code.

After importing said excels,  I will stack them and build a database. The problem is that the ID code is not contained within the excel itself...it only appears in the file name.

I would like to extract the ID code from the file title and add it as a variable entry.

 

Let say I have an excel that is called 20210123.  I would like to grab this number and enter it into the database as the IDCode variable.

 

How would you go about coding this?


 
K_S
Obsidian | Level 7 K_S
Obsidian | Level 7

is there a way to use the name of the file in the set statement as a variable entry

 new_var=name of datasetname in set statement..

Thanks for the documentation you sent. A bit too advanced for me so it is pretty intimidating.

Reeza
Super User

Then go through these ones first. 

Yes, it's not a particularly difficult task you're trying to do but it automating something which is a bit of a different skill set. 

You could also search on here, I know this question has been asked and answered many times.  It depends on if you're trying to solve a problem or if you care if you understand how you solve it to be able to reuse the code in the future.

 

UCLA introductory tutorial on macro variables and macros
https://stats.idre.ucla.edu/sas/seminars/sas-macros-introduction/

Tutorial on converting a working program to a macro
This method is pretty robust and helps prevent errors and makes it much easier to debug your code. Obviously biased, because I wrote it 🙂 https://github.com/statgeek/SAS-Tutorials/blob/master/Turning%20a%20program%20into%20a%20macro.md

Examples of common macro usage
https://communities.sas.com/t5/SAS-Communities-Library/SAS-9-4-Macro-Language-Reference-Has-a-New-Ap...

 

 

yabwon
Onyx | Level 15

Do you want to gel something like this:

/* test data */

%let folder = %sysfunc(pathname(WORK))/test;

options dlcreatedir;
libname test "&folder.";

%macro genData();
%do i = 1 %to 5;
proc export 
  data = sashelp.class(firstobs=&i. obs = &i.)
  outfile = "&folder./&i..xlsx"
  dbms=xlsx replace
;
run;
%end;
%mend;

%genData()



/* get Excels names */
data listOfFiles;
  base = "&folder.";
  length file $ 256;

  folderRef = "_%sysfunc(datetime(), hex6.)0";

  rc = filename(folderRef, base);
  folderid = dopen(folderRef);

  do i=1 to dnum(folderId); drop i;
    file = dread(folderId, i);
    if upcase(scan(file, -1, ".")) = "XLSX" then output;
  end;

  rc = filename(folderRef);
keep file;
run;
proc sort data = listOfFiles;
  by file;
run;
proc print data = listOfFiles;
run;

?

 

Bart

_______________
Polish SAS Users Group: www.polsug.com and communities.sas.com/polsug

"SAS Packages: the way to share" at SGF2020 Proceedings (the latest version), GitHub Repository, and YouTube Video.
Hands-on-Workshop: "Share your code with SAS Packages"
"My First SAS Package: A How-To" at SGF2021 Proceedings

SAS Ballot Ideas: one: SPF in SAS, two, and three
SAS Documentation



Tom
Super User Tom
Super User

What types of files are they? Modern XLSX files?  Older XLS files?

Or are they not Excel files at all.  Perhaps they are instead CSV files and it is just that your computer has been instructed to open them in Excel by default if not told to do something different.

 

If they are CSV files then it is MUCH easier.

 

So let's think about how to deal with a set of actual Excel files, either XLS or XLSX.  One way to make a consistent data structure might be to first convert the files into a single CSV file and then read that.

 

First get a list of the files.  Easiest way is to use your operating system command for listing files.

data files;
  infile 'ls /my-directory/*.xlsx' pipe truncover ;
  input filename $256.;
run;

Now that you have the list of files you can use it to generate code to import each file and write its line(s) of data to a CSV file.

So first generate the code.

filename code temp;
data _null_;
  set files ;
  file code;
  length idcode $30 ;
  idcode = scan(filename,-2,'./\');
  put 'proc import dbms=xlsx out=next replace datafile=' filename :$quote. ';'
    / 'run;'
    / 'data _null_;'
    / '  idcode=' idcode :$quote. ';'
    / '  file csv dsd mod;'
    / '  set next;'
    / '  put (_all_) (+0);'
    / 'run;'
  ;
run;

Then point CSV at the file you want to use to store the new text file with all of the data and run the generated code.

filename csv temp;
%include code / source2;

Now that you have all of the data in the a single CSV file you can read in the data with a data step.  That way you can control how each of the variables is defined.

data want;
  infile csv dsd truncover ;
* Define the length and order of all of your variables ;
* remember numeric variables use length 8 because they are 64 bit floating point values;
  length idcode $30 .....  all of your other variables .... ;
  input (_all_) (+0);
  * add any FORMAT or INFORMAT statements you might need ;
  * add any LABEL statements you might need ;
run;

 

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 5 replies
  • 1595 views
  • 0 likes
  • 4 in conversation