Hi guys,
suppose to have a number of .csv files (50) in the same folder.
Each file is named: PRJ__20221201_13227_MF
PRJ__20221201_13227_TR
PRJ__20221201_13227_RECOV
.............
PRG__
20221201_
13227_
are present and the same in all file names.
Is there a way to read all the files in input without doing "proc import" for all files every time?
It is preferable that the read files be named with the suffix, e.g., MF, TR, RECOV.
Thank you in advance
Do all of the files use the same structure? That is they have the same variables in the same order?
If so then yes you can read multiple files as if they were one.
For example you can use a wild card in the INFILE statement. You can use the FILENAME= option to get the current file's name so you can skip the header records. Plus if you need to pull out the suffix on the filename as a variable you can do that.
So for example if each file has 4 variables, two of which are character strings and one is a date and the last is a plain old number then the code might look like this:
data want;
length filename $256 suffix $8 ;
infile 'PRJ__20221201_13227_*.csv' dsd truncover filename=filename;
input @;
if filename ne lag(filename) then delete;
suffix=scan(filename,-2,'._');
length c1 $10 d1 8 c2 $3 n1 8;
format d1 yymmdd10.;
informat d1 mmddyy.;
input c1--n1;
run;
The best way is to create based on the name of the file a SAS Data Step that will segregate each file type to a specific table and afterwards in each file you will perform a loop that will eventually read each file and append to the target destination.
An example would be something like the below:
/* example of how you should segregate the file based on type*/
data files;
length filename $50 path $100;
infile datalines dlm="|";
input filename $ path $;
datalines;
PRJ__20221201_13227_MF|C:\Users\Prophet\Files\
PRJ__20221201_13227_TR|C:\Users\Prophet\Files\
PRJ__20221201_13227_RECOV|C:\Users\Prophet\Files\
;
run;
data mf tr recov;
set files;
if find(filename, "_MF")>0 then output mf;
else if find(filename, "_TR")>0 then output tr;
else if find(filename, "_RECOV")>0 then output recov;
run;
%macro read_mf;
data mf;
set mf;
id = _N_;
run;
proc sql noprint;
select max(id) as :max_id from mf;
quit;
%do i = 1 %to &max_id;
/*for each file in the location*/
proc sql noprint;
select catx('\',path, filename) into :file where id = &i;
quit;
/*read the contents of the csv via proc import or datastep */
/*append the result in a target table */
%end;
%mend;
%macro read_tr;
/*same for the other files*/
%mend;
%macro read_recov;
/*same for recov*/
%mend;
/*macro invocation*/
%read_mf;
%read_recov;
%read_tr;
You should however, include the path for the file that you are trying to read and store in a sas dataset.
Searching the forums is always a good place to start. This has been discussed many times.
Here is one such discussion: https://communities.sas.com/t5/SAS-Programming/How-to-read-multiple-CSV-files-using-SAS/td-p/863674
This SAS Note shows how to do that:
Do all of the files use the same structure? That is they have the same variables in the same order?
If so then yes you can read multiple files as if they were one.
For example you can use a wild card in the INFILE statement. You can use the FILENAME= option to get the current file's name so you can skip the header records. Plus if you need to pull out the suffix on the filename as a variable you can do that.
So for example if each file has 4 variables, two of which are character strings and one is a date and the last is a plain old number then the code might look like this:
data want;
length filename $256 suffix $8 ;
infile 'PRJ__20221201_13227_*.csv' dsd truncover filename=filename;
input @;
if filename ne lag(filename) then delete;
suffix=scan(filename,-2,'._');
length c1 $10 d1 8 c2 $3 n1 8;
format d1 yymmdd10.;
informat d1 mmddyy.;
input c1--n1;
run;
If these files have the same data constructure, that would be easy by FILEVAR= option.
data path;
input path $80.;
cards;
c:\temp\a\PRJ__20221201_13227_MF.txt
c:\temp\a\PRJ__20221201_13228_MF.txt
;
data MF;
set path;
length from_file file $ 80;
infile dummy filevar=path end=last filename=file truncover;
from_file=file;
do until(last);
input a $ b $ c $;
output;
end;
run;
If you don't like to type file name by hand, try this one:
/*
data path;
input path $80.;
cards;
c:\temp\a\PRJ__20221201_13227_MF.txt
c:\temp\a\PRJ__20221201_13228_MF.txt
;
*/
%let path= c:\temp\a ;
data path;
length path $ 200;
rc=filename('x',"&path.");
did=dopen('x');
do i=1 to dnum(did);
file=dread(did,i);
if prxmatch('/.+MF\.txt$/i',strip(file)) then do;
path=catx('\',"&path.",file);
output;
end;
end;
keep path;
run;
data MF;
set path;
length from_file file $ 200;
infile dummy filevar=path end=last filename=file truncover;
from_file=file;
do until(last);
input a $ b $ c $;
output;
end;
run;
April 27 – 30 | Gaylord Texan | Grapevine, Texas
Walk in ready to learn. Walk out ready to deliver. This is the data and AI conference you can't afford to miss.
Register now and save with the early bird rate—just $795!
SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.
Find more tutorials on the SAS Users YouTube channel.