ilikesas
Barite | Level 11

Hi,

 

In a previous question I asked how to list all files (of a certain type) from a directory and all of its subdirectories. Now I would like to list all files of all types. I tried making a small modification to Patrick's code:

 

 

%macro list_files(dir);
  %local filrf rc did memcnt name i;
  %let rc=%sysfunc(filename(filrf,&dir));
  %let did=%sysfunc(dopen(&filrf));      

   %if &did eq 0 %then %do; 
    %put Directory &dir cannot be opened or does not exist;
    %return;
  %end;

   %do i = 1 %to %sysfunc(dnum(&did));   

   %let name=%qsysfunc(dread(&did,&i));

     
        %if %sysfunc(findw(%qscan(&name,-1,\),.)) ne 0   %then %do;

        data _tmp;
          length dir $512 name $100;
          dir=symget("dir");
          name=symget("name");
        run;
        proc append base=want data=_tmp;
        run;quit;

      %end;
      %else %if %qscan(&name,2,.) = %then %do;        
        %list_files(&dir\&name)
      %end;

   %end;
   %let rc=%sysfunc(dclose(&did));
   %let rc=%sysfunc(filename(filrf));     

%mend list_files;

%list_files(C:\Documents and Settings\HP_Administrator\Desktop\files)

Here I omitted the extension parameter because I want all files of all extensions, and then I made a slight modification to the second %if:

 

%if %sysfunc(findw(%qscan(&name,-1,\),.)) ne 0   %then %do;

So I want to scan the name starting from the right, extracting the substring up to the first '\', and within this substring find a '.' (dot), which signals that there is an extension. But when I ran the macro I didn't get any table at all. I tried different approaches but still got nothing. Please correct me.

 

Thanks! 

Reeza
Super User

The %IF condition is incorrect. To debug the macro, add some %PUT statements as checkpoints in your code and see how the logic is evaluating.

 

If you check the SAS 9.4 macro appendix it has an example of what you want. 

 

https://communities.sas.com/t5/SAS-Communities-Library/SAS-9-4-Macro-Language-Reference-Has-a-New-Ap...
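The %PUT checkpoint idea could look something like the sketch below. The macro name `demo` and the variable `test` are made up for illustration; the condition is the one from the question. Nothing here states what the log will show, the point is only that the checkpoint prints the value the %IF actually compares.

```sas
options mlogic symbolgen;   /* trace %IF evaluation and macro variable resolution in the log */

%macro demo(name);          /* hypothetical helper, not part of the original macro */
  %local test;
  %let test=%sysfunc(findw(&name,.));
  /* checkpoint: print the value that the %IF below will compare with 0 */
  %put CHECKPOINT: name=&name test=&test;
  %if &test ne 0 %then %put CHECKPOINT: file branch taken;
  %else %put CHECKPOINT: file branch skipped;
%mend demo;

%demo(report.csv)

options nomlogic nosymbolgen;   /* switch the tracing off again */
```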

ilikesas
Barite | Level 11

Hi Reeza,

 

thanks for the links. In fact, the code that Patrick wrote to answer my previous question is a direct modification of Example1 - it is from this example that I was inspired to ask that question!

 

Now I am trying to find a modification to Patrick's code that will extract the names and paths of all files, of all file types, from all subdirectories.

art297
Opal | Level 21

Only one change has to be made to the original macro. The following has the change made (with the old code commented out):

 

%macro list_files(dir,ext);
  %local filrf rc did memcnt name i;
  %let rc=%sysfunc(filename(filrf,&dir));
  %let did=%sysfunc(dopen(&filrf));      

   %if &did eq 0 %then %do; 
    %put Directory &dir cannot be opened or does not exist;
    %return;
  %end;

   %do i = 1 %to %sysfunc(dnum(&did));   

   %let name=%qsysfunc(dread(&did,&i));

      %if %length/*%qupcase*/(%qscan(&name,-1,.)) gt 0 /*= %upcase(&ext)*/ %then %do;
        %put &dir\&name;

        data _tmp;
          length dir $512 name $100;
          dir=symget("dir");
          name=symget("name");
        run;
        proc append base=want data=_tmp;
        run;quit;

      %end;
      %else %if %qscan(&name,2,.) = %then %do;        
        %list_files(&dir\&name,&ext)
      %end;

   %end;
   %let rc=%sysfunc(dclose(&did));
   %let rc=%sysfunc(filename(filrf));     

%mend list_files;
%list_files(C:\SASUniversityEdition\myfolders,*)

HTH,

Art, CEO, AnalystFinder.com

ilikesas
Barite | Level 11

Hi art297,

 

thanks for the code. I ran it, but it didn't go into the subdirectories.

 

From what I understand, the part of the code:

%if %length(%qscan(&name,-1,.)) gt 0 %then %do;

looks at the name of the file and takes the substring from the right up to the first "." it encounters. If such a substring exists, then there is a ".", therefore there is an extension, and therefore it is a file whose name and path can be extracted.

 

But what is strange is that I obtain the names of the second level subfolders (although they do not contain a ".") - but not the names of the files within these subfolders
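A small check, with made-up names, may explain this behaviour. When the string contains no period, scanning for word -1 returns the whole string, so the %LENGTH test is true for folders as well and the recursive branch is never reached. Asking for word 2 does distinguish the two cases:

```sas
/* Illustrative names only; brackets make empty results visible in the log */
%put [%qscan(report.csv,-1,.)];   /* [csv]       last dot-delimited word            */
%put [%qscan(subfolder,-1,.)];    /* [subfolder] no period: whole string comes back */
%put [%qscan(report.csv,2,.)];    /* [csv]       a second word exists               */
%put [%qscan(subfolder,2,.)];     /* []          no second word                     */
```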

ilikesas
Barite | Level 11

Hi again art297,

 

I think that I got what I wanted. In my original code I used  %sysfunc(findw()), but what I should have used instead is the %index function.

 

With the following code:

 

%macro list_files(dir);
  %local filrf rc did memcnt name i;
  %let rc=%sysfunc(filename(filrf,&dir));
  %let did=%sysfunc(dopen(&filrf));      

   %if &did eq 0 %then %do; 
    %put Directory &dir cannot be opened or does not exist;
    %return;
  %end;

   %do i = 1 %to %sysfunc(dnum(&did));   

   %let name=%qsysfunc(dread(&did,&i));

   %if %index(%qscan(&name,-1,'\'),.) gt 0   %then %do;


        data _tmp;
          length dir $512 name $100;
          dir=symget("dir");
          name=symget("name");
        run;
        proc append base=want data=_tmp;
        run;quit;

      %end;
      %else %if %qscan(&name,2,.) = %then %do;        
        %list_files(&dir\&name)
      %end;

   %end;
   %let rc=%sysfunc(dclose(&did));
   %let rc=%sysfunc(filename(filrf));     

%mend list_files;
%list_files(C:\Documents and Settings\HP_Administrator\Desktop\files)

I got the names and paths of all files within all subdirectories. 
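For reference, a minimal sketch of what %INDEX returns for the two kinds of names (the names are invented). FINDW, by contrast, looks for whole words and treats the period as one of its default delimiters, which is probably why the original test never matched. Since DREAD returns only the member name without its path, the %QSCAN on the backslash presumably leaves the name unchanged here.

```sas
/* %INDEX returns the position of the first match, or 0 when there is none */
%let name=report.csv;
%put index=%index(&name,.);   /* index=7 : the period is the 7th character */

%let name=subfolder;
%put index=%index(&name,.);   /* index=0 : no period in the name */
```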

ilikesas
Barite | Level 11

Hi Tom,

 

actually, that is my previous question!

 

I noticed 

art297
Opal | Level 21

I was too quick in my earlier response. I think the following will do what you want:

 

%macro list_files(dir,ext);
  %local filrf rc did memcnt name i;
  %let rc=%sysfunc(filename(filrf,&dir));
  %let did=%sysfunc(dopen(&filrf));      

   %if &did eq 0 %then %do; 
    %put Directory &dir cannot be opened or does not exist;
    %return;
  %end;

   %do i = 1 %to %sysfunc(dnum(&did));   

   %let name=%qsysfunc(dread(&did,&i));

/*      %if %qupcase(%qscan(&name,-1,.)) = %upcase(&ext) %then %do;*/
      %if %qscan(&name,2,.) ne %then %do;
        %put &dir\&name;

        data _tmp;
          length dir $512 name $100;
          dir=symget("dir");
          name=symget("name");
        run;
        proc append base=want data=_tmp;
        run;quit;

      %end;
      %else %if %qscan(&name,2,.) = %then %do;        
        %list_files(&dir\&name,&ext)
      %end;

   %end;
   %let rc=%sysfunc(dclose(&did));
   %let rc=%sysfunc(filename(filrf));     

%mend list_files;
%list_files(C:\SASUniversityEdition\myfolders,*)

HTH,

Art, CEO, AnalystFinder.com

ilikesas
Barite | Level 11

That is even simpler!

 

So you take the name of the file, if there is anything after the "." then it is  a file name and therefore extract it. If there is nothing after the "." (which here is the same thing as there being no "." at all), then it must be a folder and therefore repeat the process for this folder.

 

Thanks!
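One caveat worth keeping in mind: the test looks only at the name, not at the file system. The sketch below uses a hypothetical helper, `classify`, with invented names to show two cases where the name-based rule guesses wrong.

```sas
/* Hypothetical helper that applies the same test as the macro above */
%macro classify(name);
  %if %qscan(&name,2,.) ne %then %put &name: treated as a file;
  %else %put &name: treated as a folder;
%mend classify;

%classify(report.csv)   /* treated as a file */
%classify(subfolder)    /* treated as a folder */
%classify(README)       /* file without extension: treated as a folder, so DOPEN would fail on it */
%classify(my.folder)    /* folder with a period in its name: treated as a file, never descended into */
```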

Tom
Super User

In general, data step code will be much easier to understand than macro code, and certainly much easier to debug. The only complication with the data step that Roger posted is that it uses the MODIFY statement, which is not widely used and so may take a little effort to understand. Basically, MODIFY allows the data step to append records for all of the file names in any directories it finds, and those records are then processed later as the step works its way through the data set.

 

Let's take that data step and add some logic to allow you to capture the filename and extension. You could then subset the result later based on the extension, or even modify the data step to remove the records for files with extensions you don't care about.

 

The first part is to set up a dataset with the top level directory that you want to search from. Note that if you want to search multiple top level directories you could make this dataset have more than one observation. Let's create a variable to indicate whether the record is for a directory or not (DIR), the full name (FULLPATH), the directory part (DIRNAME), the file within the directory (FILENAME) and the extension, if any (EXT). For this example we will just set a macro variable with the starting point. You can see how easy it would be to wrap this into a macro definition and make DIR the parameter to the macro.

%let dir=%sysfunc(pathname(work));
data want ;
  length dir 8 ext filename dirname $256 fullpath $512 ;
  call missing(of _all_);
  fullpath = "&dir";
run;

Now let's look at the guts of the logic that takes the top level node and scans the directory tree.

data want ;
  modify want ;
  sep='/';
  if "&sysscp"="WIN" then sep='\' ;
  rc=filename('tmp',fullpath);
  dir_id=dopen('tmp');
  dir = (dir_id ne 0) ;
  if dir then dirname=cats(fullpath,sep);
  else do;
    filename=scan(fullpath,-1,sep) ;
    dirname =substrn(fullpath,1,length(fullpath)-length(filename));
    if index(filename,'.')>1 then ext=scan(filename,-1,'.');
  end;
  replace;
  if dir then do;
    do i=1 to dnum(dir_id);
      fullpath=cats(dirname,dread(dir_id,i));
      output;
    end;
    rc=dclose(dir_id);
  end;
  rc=filename('tmp');
run;

A few things to note about using the MODIFY statement. (1) Any new variables introduced in the data step (like SEP, DIR_ID, RC) are not output. There is no need to add a DROP statement for them. (2) You can both replace an existing record (REPLACE statement) and add new records (OUTPUT statement).
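To see both points in isolation, here is a toy sketch. The data set QUEUE and the variables ITEM, DONE and SCRATCH are invented for the illustration; the REPLACE-then-OUTPUT order mirrors the step above.

```sas
/* Hypothetical one-row data set to experiment with */
data queue;
  length item $8 done 8;
  item = 'a';
  done = 0;
run;

data queue;
  modify queue;
  scratch = 1;            /* new variable: usable inside the step, not stored in QUEUE */
  done = 1;
  replace;                /* rewrite the current observation in place */
  if item = 'a' then do;
    item = 'b';
    done = 0;
    output;               /* append a new observation at the end of QUEUE */
  end;
run;

proc print data=queue;
run;
```

The PROC PRINT output is the thing to inspect: if MODIFY behaves as described, the appended row is itself picked up and marked done by the same step, and SCRATCH does not appear as a column.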

 

I added some logic to split the path into DIRNAME, FILENAME and EXT components. Note that when calculating EXT it is important to first check whether the filename actually has a period in it. On Linux systems in particular you can create files without any extension, and you can also create files with a period as the first character of the filename. Note that these are considered hidden files by Linux, and it looks like the DNUM() and DREAD() functions ignore them.
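The extension check can be tried on its own with a few invented file names. This is only a sketch of the same INDEX/SCAN test, run outside the directory scan:

```sas
data _null_;
  length filename ext $256;
  do filename = 'report.csv', 'README', '.profile', 'archive.tar.gz';
    ext = ' ';
    /* a period in position 2 or later means there is an extension */
    if index(filename, '.') > 1 then ext = scan(filename, -1, '.');
    put filename= ext=;
  end;
run;
/* ext is csv for report.csv, blank for README and .profile, gz for archive.tar.gz */
```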

 

Also I added some conditional logic to figure out if you are running on Windows or Linux and use the appropriate slash in the file names.

 

So the logic flow of this is to first create a fileref for the current observation and then try to open that file as a directory. We then set the variables (other than FULLPATH) based on whether it is a directory or not and replace the record in the data set with these updated values. Then, if it IS a directory, we add new records for all of its members (files and subdirectories). These will be appended to the end of the dataset so that they will later be processed by the top of the data step when the MODIFY statement reaches them.
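Once the step has run, subsetting by extension as mentioned earlier might look like the sketch below. The output name CSV_FILES and the CSV extension are just examples.

```sas
/* Keep only file records (DIR=0) with a given extension, ignoring case */
data csv_files;
  set want;
  where dir = 0 and upcase(ext) = 'CSV';
run;
```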

 

RZ1
Calcite | Level 5

@Tom This is by far the best solution ever. Never realized that the MODIFY statement can be so handy...

 

With function finfo() it can also display the files' attributes. Great stuff!!!

Nmah
Calcite | Level 5

Hi,

 

Could you help me understand how I can get the file attributes for all files in the main and sub directories.

I am using the code above shared by Tom, getting the file attributes for the ones under main directory but not for the ones under sub directories.

 

 

TIA

Nmah

 

art297
Opal | Level 21

I think you'll get more responses if you post your question as a new thread AND include the code from @Tom that you're referring to.

 

Art, CEO, AnalystFinder.com

 

Tom
Super User

@Nmah wrote:

Hi,

 

Could you help me understand how I can get the file attributes for all files in the main and sub directories.

I am using the code above shared by Tom, getting the file attributes for the ones under main directory but not for the ones under sub directories.

 

 

TIA

Nmah

 


Sounds like you put the new logic in the wrong place.  It should be before the REPLACE statement.
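A sketch of what that placement might look like, based on the data step posted earlier in the thread. The variables SIZE and FID are additions made for this illustration, and the information item name 'File Size (bytes)' is an assumption: item names differ between operating systems, and FOPTNUM/FOPTNAME can list the ones available. SIZE is declared in the seed step because MODIFY does not add new variables to the data set.

```sas
%let dir=%sysfunc(pathname(work));

/* Seed step: SIZE (hypothetical) must already exist before MODIFY runs */
data want;
  length dir 8 ext filename dirname $256 fullpath $512 size 8;
  call missing(of _all_);
  fullpath = "&dir";
run;

data want;
  modify want;
  sep = '/';
  if "&sysscp" = "WIN" then sep = '\';
  rc = filename('tmp', fullpath);
  dir_id = dopen('tmp');
  dir = (dir_id ne 0);
  if dir then dirname = cats(fullpath, sep);
  else do;
    filename = scan(fullpath, -1, sep);
    dirname = substrn(fullpath, 1, length(fullpath) - length(filename));
    if index(filename, '.') > 1 then ext = scan(filename, -1, '.');
    /* attribute lookup sits before REPLACE, so it runs for every file record */
    fid = fopen('tmp');
    if fid then do;
      size = input(finfo(fid, 'File Size (bytes)'), ?? 32.);  /* assumed item name */
      rc = fclose(fid);
    end;
  end;
  replace;
  if dir then do;
    do i = 1 to dnum(dir_id);
      fullpath = cats(dirname, dread(dir_id, i));
      output;
    end;
    rc = dclose(dir_id);
  end;
  rc = filename('tmp');
run;
```

Placed after the REPLACE statement instead, the values would be computed too late to be written back to the record being processed.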
