I thought that you were just using this technique to get the filenames. You could forgo the variables and just use INPUT ;
data pdf_files ;
  length file filename $ 256 ;
  infile "C:\Users\kviel\Documents\My SAS Files\9.4\Documentation\*.pdf" filename = file eov = eov ;
  input ;
  filename = file ;
  if _n_ = 1 or eov = 1 then do ;
    output ;
    eov = 0 ;
  end ;
run ;
This is not recursive, and it reads only files of the specified extension. Dropping the extension from the wildcard is safe only if one is CERTAIN that the folder contains no child folder. I toyed with the MEMVAR= option, but I ran out of time, and I am not even sure that it is the right direction. Again, this reads every record of every file, so it is not as efficient as Kurt's suggestion.
My suggestion was only for reading the contents of the files. I wouldn't use it for file names. In some scenarios (e.g. when expecting only small files) your proposal works, but it excludes empty files. I don't remember what happens for subfolders.
1) If the red text above is correct, then your solution, while it works, isn't on topic. Well, the code is on topic, just not your red text above. The subject was "List all the files in a folder".
2) Even if the proposed code works, I just can't see how reading every line in a file just to get the filename is a good approach. Sure, for small text files in a directory, it might even work faster than the SAS functions used in previous solutions, since the data step is very fast at reading text files.
But put 100,000 XML or binary files of 1 MB each in a directory and test the performance of all the various solutions posted in this thread. You don't know the contents of the directory a priori and can't make assumptions about them.
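For comparison, here is a minimal sketch of the directory-function approach, which lists names without opening any file contents. The folder path, the data set name file_list and the variable names are placeholders I made up for illustration, and I have not benchmarked it against the wildcard INFILE version.

```sas
data file_list ;
  length fref $ 8 name $ 256 ;
  keep name ;
  /* Placeholder path (an assumption); substitute your own folder. */
  /* FREF is blank here, so FILENAME generates a fileref and stores it in FREF. */
  rc = filename(fref, "C:\temp") ;
  did = dopen(fref) ;
  if did > 0 then do ;
    do i = 1 to dnum(did) ;
      name = dread(did, i) ;
      /* Keep entries whose last four characters are ".pdf", mirroring the *.pdf wildcard. */
      if lowcase(substr(name, max(1, length(name) - 3))) = ".pdf" then output ;
    end ;
    rc = dclose(did) ;
  end ;
  /* Deassign the generated fileref. */
  rc = filename(fref) ;
run ;
```

Note that this only looks at names, so a subfolder that happens to be called something.pdf would be listed too; a DOPEN check on each entry would be needed to rule that out.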
It fails when a folder is present:
ERROR: Invalid file, C:\Users\kviel\Documents\My SAS Files\9.4\Documentation\New folder.
NOTE: The SAS System stopped processing this step because of errors.
WARNING: The data set WORK.PDF_FILES may be incomplete. When this step was stopped there were 0 observations and 1 variables.
WARNING: Data set WORK.PDF_FILES was not replaced because this step was stopped.
NOTE: DATA statement used (Total process time):
      real time 0.01 seconds
      cpu time 0.01 seconds
You need to check if a "file" is actually a directory:
%macro size(directory);
  %local did i name subdir fref fref2 fid size;
  %let size = 0;
  %let did = %sysfunc(filename(fref,&directory));
  %let did = %sysfunc(dopen(&fref));
  %if &did ne 0 %then %do;
    %do i = 1 %to %sysfunc(dnum(&did));
      %let name = &directory/%sysfunc(dread(&did,&i));
      %let subdir = %sysfunc(filename(fref2,&name));
      %let subdir = %sysfunc(dopen(&fref2));
      /* DOPEN returns a nonzero id only for a directory, so recurse into it */
      %if &subdir ne 0 %then %do;
        %let subdir=%sysfunc(dclose(&subdir));
        %let size = %eval(&size + %size(&name));
      %end;
      %else %do;
        %let fid = %sysfunc(fopen(&fref2));
        /* German-locale item name; English sessions typically call it File Size (bytes) */
        %let size = %eval(&size + %sysfunc(finfo(&fid,Dateigröße (Byte))));
        %let fid = %sysfunc(fclose(&fid));
      %end;
      %let subdir=%sysfunc(filename(fref2));
    %end;
    %let did=%sysfunc(dclose(&did));
  %end;
  %let did=%sysfunc(filename(fref));
  &size
%mend;
%put size=%size(/folders/myfolders);