Peter_C
Rhodochrosite | Level 12
Hi Kevin
Any extension of this simple routine should be cautious about handling binary files such as the .xlsx format. The simple precision of the DSD option in handling "xxxxx separated values" won't work for binary data. However, it only needs DLM= '09'x added to the INFILE statement to switch to tab-separated values. If there is an accepted standard that .csv files need DLM= ',' and .tsv files need DLM= '09'x, then both could be supported dynamically in folders with a mix of file types, because the DLM= value can name a variable which you change depending on the file type/extension. The timing takes care and, probably, practice. My guess is the added complexity is not well rewarded.
By the way, the DLM= value can hold more than one character. These are then alternative delimiters. So, if tabs appear only as delimiters in .csv files, and likewise commas are protected (by quotes) when they are not to be treated as delimiters, then the DLM could be that combination: '2c09'x
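A minimal sketch of the variable-delimiter idea described above, not a tested routine. The folder path, the wildcard, the data set name and the variables fname, dlm, source and col1-col3 are all hypothetical, and the folder is assumed to hold only .csv and .tsv text files with three character fields per record. The "timing" point is handled by holding the record with a trailing @ so that the FILENAME= variable is populated before the delimiter is chosen.

```sas
/* Sketch only: path, names and the three-column layout are assumptions. */
data mixed ;
  length fname source $ 256 dlm $ 1 col1 col2 col3 $ 50 ;

  /* DLM= names a variable; its current value is used when fields are scanned */
  infile "C:\temp\mixed\*sv" dsd dlm = dlm filename = fname truncover ;

  input @ ;                        /* load the record and set FNAME, parse nothing yet */

  if lowcase(scan(fname, -1, '.')) = 'tsv' then dlm = '09'x ;  /* tab   */
  else dlm = ',' ;                                             /* comma */

  input col1 $ col2 $ col3 $ ;     /* now parse the held record with the chosen delimiter */

  source = fname ;                 /* the FILENAME= variable itself is not written out */
  drop dlm ;
run ;
```

If both delimiters can safely be treated as alternatives in every file, the simpler fixed form dlm = '2c09'x avoids the extra INPUT statement.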
Kevin_Viel
Obsidian | Level 7

Peter,

 

  I thought that you were just using this technique to get the filenames.  You could forgo the variables and just use INPUT ;

 

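data pdf_files ;
  length file 
         filename $ 256
          ;

  /* FILENAME= puts the name of the file being read into FILE ;           */
  /* EOV= is set to 1 when the first record of a subsequent file is read. */
  infile "C:\Users\kviel\Documents\My SAS Files\9.4\Documentation\*.pdf" 
         filename = file
         eov = eov
         ;

  input ;             /* read a record without assigning any variables */

  filename = file ;   /* copy, because the FILENAME= variable is not written out */

  /* Keep one observation per file: the first record of the first file, */
  /* then the first record after each file boundary.                    */
  if    _n_ = 1
     or eov = 1
  then 
    do ; 
       output ;
       eov = 0 ;
    end ;
run ;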

This is not recursive and, unless one is CERTAIN that the folder contains no child folder, it should read only files of the specified extension.  I toyed with the MEMVAR= option, but I ran out of time.  I am not even sure if that is the right direction.  Again, this reads every record of every file, so it is not as efficient as Kurt's suggestion.

 

Kind regards,

 

Kevin

Peter_C
Rhodochrosite | Level 12
Hi Kevin
My suggestion was only for reading the contents of the files. I wouldn't use it for file names. However, in some scenarios (e.g. expecting only small files) your proposal works, but it excludes empty files. I don't remember what happens for subfolders.

Peter 
ScottBass
Rhodochrosite | Level 12

@Peter_C wrote:
Hi Kevin
My suggestion was only for reading the contents of the files. I wouldn't use it for file names. However, in some scenarios (e.g. expecting only small files) your proposal works, but it excludes empty files. I don't remember what happens for subfolders.

Peter 

 

1) If the red text above is correct, then your solution, while it works, isn't on topic.  Well, the code is on topic, just not your red text above. The subject was "List all the files in a folder".

2) Even if the proposed code works, I just can't see how reading every line in a file just to get the filename is a good approach?  Sure, for small text files in a directory, it might even work faster than the SAS functions used in previous solutions - the data step is very fast at reading text files.  

 

But put 100,000 XML or binary files of 1 MB each in a directory and test the performance of all the various solutions posted in this thread.  You don't know the contents of the directory a priori and can't make assumptions about them.
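For comparison, here is a minimal sketch of the function-based style of listing, which reads only the directory entries and never opens the files themselves. The folder path and the data set and variable names are hypothetical; DOPEN, DNUM, DREAD and DCLOSE are standard SAS functions. Note that DREAD returns subfolder names as well as file names, so a directory check is still needed if only files are wanted.

```sas
/* Sketch only: the path and names below are assumptions. */
data file_list ;
  length fref $ 8 name $ 256 ;

  rc  = filename(fref, "C:\temp\docs") ;  /* blank FREF: SAS generates a fileref */
  did = dopen(fref) ;                     /* 0 means the directory could not be opened */

  if did > 0 then do ;
    do i = 1 to dnum(did) ;               /* number of members in the directory */
      name = dread(did, i) ;              /* member name only, no file content is read */
      output ;
    end ;
    rc = dclose(did) ;
  end ;

  rc = filename(fref) ;                   /* release the fileref */
  keep name ;
run ;
```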


Please post your question as a self-contained data step in the form of "have" (source) and "want" (desired results).
I won't contribute to your post if I can't cut-and-paste your syntactically correct code into SAS.
Peter_C
Rhodochrosite | Level 12
thanks Scott
Probably put it down to my ARADD
(age related attention deficit disorder)😎
Kevin_Viel
Obsidian | Level 7

Peter,

 

  It fails when a folder is present:

 

ERROR: Invalid file, C:\Users\kviel\Documents\My SAS Files\9.4\Documentation\New folder.
NOTE: The SAS System stopped processing this step because of errors.
WARNING: The data set WORK.PDF_FILES may be incomplete.  When this step was stopped there were 0 observations and 1 variables.
WARNING: Data set WORK.PDF_FILES was not replaced because this step was stopped.
NOTE: DATA statement used (Total process time):
      real time           0.01 seconds
      cpu time            0.01 seconds

Kind regards,

 

Kevin

Kurt_Bremser
Super User

You need to check if a "file" is actually a directory:

%macro size(directory);
%local did i name subdir fref fref2 fid size;
%let size = 0;
%let did = %sysfunc(filename(fref,&directory));
%let did = %sysfunc(dopen(&fref));
%if &did ne 0
%then %do;
  %do i = 1 %to %sysfunc(dnum(&did));
    %let name = &directory/%sysfunc(dread(&did,&i));
    %let subdir = %sysfunc(filename(fref2,&name));
    %let subdir = %sysfunc(dopen(&fref2));
    %if &subdir ne 0
    %then %do;
      %let subdir=%sysfunc(dclose(&subdir));
      %let size = %eval(&size + %size(&name));
    %end;
    %else %do;
      %let fid = %sysfunc(fopen(&fref2));
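      %* FINFO item names are locale-specific: this is the German label, and an English session uses File Size (bytes) ;
      %let size = %eval(&size + %sysfunc(finfo(&fid,Dateigröße (Byte))));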
      %let fid = %sysfunc(fclose(&fid));
    %end;
    %let subdir=%sysfunc(filename(fref2));
  %end;
  %let did=%sysfunc(dclose(&did));
%end;
%let did=%sysfunc(filename(fref));
&size
%mend;
%put size=%size(/folders/myfolders);
contactnishan
Obsidian | Level 7
proc contents data=libname._all_;
run;

Create a library for the folder, then run the above code. I guess it should help.

Kevin_Viel
Obsidian | Level 7

Contactnishan,

 

  Will that list all files or all SAS data sets (and views) only?

 

Kind regards,

 

Kevin

contactnishan
Obsidian | Level 7
Sorry, it lists only SAS files in the folder/library. The OUT= option can be used to produce a list of those SAS files, though.
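A minimal sketch of the OUT= idea, assuming a hypothetical libref mylib and folder path. PROC CONTENTS writes one row per variable to the OUT= data set, so the rows are reduced to one per member afterwards. As noted, this covers only SAS data sets and views, not every file in the folder.

```sas
/* Sketch only: libref, path and output data set name are assumptions. */
libname mylib "C:\temp\data" ;

proc contents data = mylib._all_
              out  = work.members (keep = libname memname)
              noprint ;
run ;

/* OUT= holds one row per variable; keep one row per member */
proc sort data = work.members nodupkey ;
  by libname memname ;
run ;
```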
neil011
Fluorite | Level 6

Create a library and use dictionary.tables:

libname name "user/folder/location";

proc sql;
create table x as select libname, memname
from dictionary.tables
where upcase(libname)="NAME";
quit;

