I want to count the number of files of a specific type in a specific folder. How can I achieve this efficiently? I ask because, depending on the count, I need to run another macro. The image shows my folder structure and the output I want ('account' is the main folder, which has two subfolders, 'Expense' and 'Profits', which contain the files.)
Thank you for your input.
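data filenames_;
length fref $8 fname $200;
/* fref is blank, so FILENAME() generates a fileref and stores it in fref */
rc = filename(fref,"c:\documents\accounts");
did = dopen(fref);
/* count= dnum(did);*/
/* read the name of every member of the open directory */
do i = 1 to dnum(did);
  fname = dread(did,i);
  output;
end;
/* close the directory, then clear the fileref */
rc = dclose(did);
rc = filename(fref);
keep fname;
run;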
If the XCMD option is set, I often find it easier and "shorter", coding-wise, to use an OS command for recursive file listings. Below is how this could look for Windows.
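data work.path_and_filename;
/* dir switches: /B bare names, /S recurse into subfolders, /A-D files only, /ON sort by name */
infile 'dir "c:\documents\accounts\*" /B/S/A-D/ON' pipe truncover;
input path_file $1000.;
length path $1000 file_name $60 suffix $10;
/* everything before the last backslash is the folder */
path=substr(path_file,1,findc(path_file,'\',-1000)-1);
file_name=scan(path_file,-1,'\');
/* text after the last period; blank it if the file has no extension */
suffix=scan(path_file,-1,'.');
if findc(suffix,'\') then call missing(suffix);
/* restrict to certain suffixes */
/*if lowcase(suffix) in ('sas','txt','csv');*/
run;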
proc sql;
create table counts as
select
path,
suffix,
count(*) as n_files
from work.path_and_filename
group by
path,
suffix
;
quit;
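Since the original question was about running another macro depending on the count, here is a minimal sketch of that last step. It assumes the work.path_and_filename table created above; the 'csv' filter, the macro variable n_csv, the wrapper macro %run_if_files, and the commented-out %next_step call are all hypothetical names used only for illustration.

```sas
/* Store the number of matching files in a macro variable.       */
/* COUNT(*) returns 0 when no rows match, so n_csv is always set. */
proc sql noprint;
  select count(*) into :n_csv trimmed
  from work.path_and_filename
  where lowcase(suffix) = 'csv';   /* assumed file type of interest */
quit;

/* Hypothetical wrapper macro: branch on the count. */
%macro run_if_files;
  %if &n_csv > 0 %then %do;
    %put NOTE: Found &n_csv csv file(s).;
    /* %next_step;   <- hypothetical: call your own macro here */
  %end;
  %else %do;
    %put NOTE: No csv files found, nothing to run.;
  %end;
%mend run_if_files;

%run_if_files;
```

To restrict the count to one subfolder, a condition on path could be added to the WHERE clause.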
PROC FREQ can do this simple counting.
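As a rough sketch of what that could look like, assuming a table such as work.path_and_filename with one row per file and the variables path and suffix (the output table name work.freq_counts is made up for this example):

```sas
/* Count files per folder and extension with PROC FREQ.            */
/* MISSING keeps files without an extension as their own category. */
proc freq data=work.path_and_filename noprint;
  tables path*suffix / missing
         out=work.freq_counts(drop=percent rename=(count=n_files));
run;
```

The OUT= data set holds one row per path and suffix combination, so it can be queried later in the same way as a PROC SQL summary.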
Thanks, I will try that. I have hundreds of subfolders (I thought PROC FREQ would not be efficient), so I am looking for an approach that automatically goes to the path provided and counts the files.
Do you mean you want to subset the list of files to just those for a particular directory?
If you have a dataset like the one generated by this macro: https://github.com/sasutils/macros/blob/master/dirtree.sas
you could count how many files are directly in each directory in the tree by aggregating by the PATH variable.
But what do you mean by TYPE? Are you talking about the extension on the file? You could create a variable that has just the extension part of the filename (the part after the last period).
data want;
set dirtree;
if index(filename,'.') then extension=scan(filename,-1,'.');
run;
You could then summarize by the extension in each directory.
proc freq data=want;
tables path*extension / list;
run;
FREQ is very efficient. And with "hundreds" of folders (as opposed to millions), it's hard to imagine you'll find a noticeably faster way. Certainly you could do lots of coding and achieve small increases in speed.
Thank you @Patrick, it worked for what I was looking for.
Special thanks to @PaigeMiller and @Tom for taking the time to provide valuable alternatives. I tried your approaches as well, and they do the same job as expected. I really appreciate all your time.