BookmarkSubscribeRSS Feed
flyingsohigh
Calcite | Level 5
I have 300 excel files with file names "Summary File - xxx.xls". xxx can be any of city names, like Cary, etc. However, I am not sure all the city names in the file exactly. I need to read in all the city names from the filenames first, then read all the records into a sas data set. Any one knows how to do this? Thanks.
10 REPLIES 10
DanielSantos
Barite | Level 11
Hello.

Using the appropriate SAS functions (it will work on any system):

data FILENAME_LIST;
length FILENAME $200;
drop _:; /* drop temporary variables */
_RC=filename('dirref',"D:\Temp"); /* assign dir ref */
_DID=dopen('dirref'); /* open dir */
_NUM_OF_FILES=dnum(_DID); /* get total number of members */
do _I=1 to _NUM_OF_FILES; /* cycle through members /*
FILENAME=dread(_DID,_I); /* read member name */
output;
end;
_RC=dclose(_DID);
_RC=filename('dirref');
run;

FILENAME_LIST will contain a row per file in the specified directory.

Another way could be piping the result of a list command (system dependent) into a dataset.

Something like that (Windows example based on dir command):

filename FILELIST pipe "dir /b";

data FILENAME_LIST;
length FILENAME $200;
infile FILELIST;
input;
FILENAME=_infile_;
run;

From here you just have to select the desired filenames (say with a approriate wildcard), assign those to macro variables and use the latest on your SAS script.

Greetings from Portugal.

Daniel Santos at www.cgd.pt

Message was edited by: Daniel Santos
flyingsohigh
Calcite | Level 5
The pipe function works beautifully. Thanks for your suggestion.
Norman21
Lapis Lazuli | Level 10
Hi, I have a similar problem, and this is my first post - so please bear with me!

I have an application that writes data to identically-named (and formatted) Excel files in various subdirectories. I'd like to read the data from each, and merge the whole lot into one large file for subsequent analysis.

The files are named as follows (simplified):

n:\data\data01\excelfile.xls
n:\data\data02\excelfile.xls
n:\data\data03\excelfile.xls
... (more than 200)

I've put all the filenames into a separate text file, then taken a tip from Cody's book as follows:

data readmany;
infile 'C:\My SAS Files\filelist.txt';
input externalnames $ 80.;
infile dummy filevar=externalnames end=last;
do until (last);
input...?;
output;
end;
run;
proc contents data=readmany;
run;

I'm not sure what to put after the input statement to make this work. Also, is there another way to tackle this problem? I'll get new Excel files from time to time, so will need to re-build the master file periodically.

I hope that's clear!

Oh, I should add I'm running SAS 9.2 in Windows XP Pro.
Norman.
SAS 9.4 (TS1M6) X64_10PRO WIN 10.0.17763 Workstation

DanielSantos
Barite | Level 11
Hello Norman.

From there its actually dependent on the way you are importing the data.

I suppose those are standard excel files, and that you have proc import licensed and that you are using it to import the desired files.

If so, you'll need to issue a proc import for each file.

I think a possible way to code this, would be to load the file list into a list of macro variables...
[pre]
%let FILELISTN=0; * init file list count;
data _null_;
infile 'C:\My SAS Files\filelist.txt';
input EXTERNALNAMES $ 80.;
* create macro variable FILELISTX with the corresponding filename;
call symput(cats('FILELIST',put(_N_,best.))),catx(' ',symget('FILELIST'),EXTERNALNAMES);
* save count of FILELISTX created;
call symput('FILELISTN', put(_N_,best.));
run;
[/pre]
After that you'll get,[pre]
FILELIST1=n:\data\data01\excelfile.xls
FILELIST2=n:\data\data02\excelfile.xls
...
FILELISTN=N
[/pre]
being N the total count of files to be imported.

Then you just need to issue a proc import for each FILELISTX using a macro loop...
[pre]
%macro load_data;

* erase data of previous run, if any;
proc datasets lib=WORK nolist;
delete INDATA:;
quit;

* loop to total count of files;
%do I=1 %to &FILELISTN;

* import file;
proc import out=WORK.INDATA&I
file="&&FILELIST&I"
dbms=xls replace;
getnames=yes;
run;

* append data;
proc append base=INDATA data=INDATA&I force;
run;

%end;
%mem load_data;

%load_data; * run macro;
[/pre]
Finally you should get a WORK.INDATA dataset with the concatenated data imported from the specified excel files.

Code above not tested!

Cheers from Portugal.

Daniel Santos @ www.cgd.pt
sbb
Lapis Lazuli | Level 10 sbb
Lapis Lazuli | Level 10
Suggestion to OP Norman21 - considering it's your first post, suggesting you want to create new posts on the forums when you have a question/problem/discussion item, instead of piggy-backing on another individual's prior post - if you want to reference a prior post, then past a link to it in your "new" post.

Scott Barry
SBBWorks, Inc.
Norman21
Lapis Lazuli | Level 10
Hi Daniel,

Thanks for this, but it gives an error (the first underline is at the comma before catx):

7 call symput(cats('FILELIST',put(_N_,best.))),catx(' ',symget('FILELIST'),EXTERNALNAMES);
- -
79 22
200 76
------
252
ERROR: Undeclared array referenced: catx.
ERROR 79-322: Expecting a ;.

ERROR 22-322: Syntax error, expecting one of the following: +, =.

ERROR 200-322: The symbol is not recognized and will be ignored.

ERROR 76-322: Syntax error, statement will be ignored.

ERROR 252-185: The SYMPUT subroutine call does not have enough arguments.

8 * save count of FILELISTX created;
9 call symput('FILELISTN', put(_N_,best.));
10 run;

NOTE: The SAS System stopped processing this step because of errors.

And in future I'll create a new post for a new question!
Norman.
SAS 9.4 (TS1M6) X64_10PRO WIN 10.0.17763 Workstation

sbb
Lapis Lazuli | Level 10 sbb
Lapis Lazuli | Level 10
Right from the start, the SAS DOC for CALL SYMPUT shows only two arguments -- from the executed code, I see at least three arguments - likely parenthese location.

Scott Barry
SBBWorks, Inc.
DanielSantos
Barite | Level 11
Yes. Misplaced parenthese. I did not got the time to test the code, the correct syntax would be:
[pre]
call symput( cats('FILELIST',put(_N_,best.)), catx(' ',symget('FILELIST'),EXTERNALNAMES) );
[/pre]
Cheers from Portugal.

Daniel Santos @ www.cgd.pt
Norman21
Lapis Lazuli | Level 10
An update on my problem....

It turns out the files with an XLS suffix are not Excel files, although they can be opened with Excel. I've had to do the task by hand - open each file in turn, and copy/paste the rows to a "master" Excel file. Crude but effective.
Norman.
SAS 9.4 (TS1M6) X64_10PRO WIN 10.0.17763 Workstation

Peter_C
Rhodochrosite | Level 12
sas is more effective.
Since these are not excel files but text files like CSV, an optimal solution might be to use pipe like[pre]filename myfiles pipe "dir /s/b ""mypathfile*pattern"" " lrecl=1000 ;[/pre]which searches for files named according to the name pattern you seek in the path defined, and returns infile buffers with the path\filenames.
This data step would load a text version of the first 10 columns from each of these files [pre] data loaded_as_text( compress=yes keep= col1-col10 file) ;
length col1-col10 $20 fil file $500;
infile myfiles truncover ;
input fil $500. ;
if fil ne ' ' then do ;
file_finished = 0;
infile dum filevar= fil dsd dlm= ',' missover end= file_finished ;
file = fil ;
do while( not file_finished );
input col1 - col10 ;
output ;
end ;
end ;
run ;[/pre]The syntax of the command returning the list of path\filenames is op.sys specific and the demo is for windows.

it is the absence of manual handling (cut-n-paste) that makes it more effective
imho


peterC

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 10 replies
  • 3288 views
  • 0 likes
  • 5 in conversation