atul_desh
Quartz | Level 8

I want a dataset with the two variables below, built from a directory (Unix) location that contains multiple text files. Is there any way to do it?

 

Name of File,   Number of records

 

 

Thanks

9 REPLIES
thomp7050
Pyrite | Level 9

I suspect you want the FILENAME PIPE command. I have only done this with the Windows command prompt, but something similar may work on Linux too.

 

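/* /S = include subfolders, /B = bare names, /A:-D = files only (no directories) */
Filename filelist pipe " DIR S:\MyFiles /S /B  /A:-D ";

Data ALLFOLDERS;
	Infile filelist truncover;
	/* one full path per line of DIR output */
	Input filename $100.;
Run;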

Then filter out any files that are not .txt.

 

Edit: revised to eliminate unnecessary code.

atul_desh
Quartz | Level 8

Thanks for the code, but I don't have permission to use PIPE. Is there any other way to do it?

thomp7050
Pyrite | Level 9

Not that I know of in SAS, no. If it were me, I would consider creating a text file with another language (e.g. Python) and importing that text file into SAS. Sorry man!
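
A minimal sketch of that idea in Python, assuming the files end in .txt and sit directly in one directory. The function names `count_records` and `write_listing` are made up for illustration, and the column names in the CSV are an arbitrary choice. A final line without a trailing newline is counted as a record here.

```python
import csv
from pathlib import Path


def count_records(directory, pattern="*.txt"):
    """Return (file name, number of lines) for each matching file, sorted by name."""
    results = []
    for path in sorted(Path(directory).glob(pattern)):
        if not path.is_file():
            continue
        # Binary mode avoids decode errors on files with unexpected encodings.
        with path.open("rb") as handle:
            n_lines = sum(1 for _ in handle)
        results.append((path.name, n_lines))
    return results


def write_listing(directory, out_file):
    """Write the counts as a two-column CSV that can be imported into SAS."""
    rows = count_records(directory)
    with open(out_file, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(["filename", "records"])
        writer.writerows(rows)
    return rows
```

Calling `write_listing("/some/dir", "file_counts.csv")` (both paths hypothetical) would produce a file that an ordinary INFILE step or PROC IMPORT could read.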

RW9
Diamond | Level 26

If you can't access the OS from SAS, then how are you going to get the directory listing? If you have the listing, you could do a basic read of each file and keep a count as it reads to get the output, but you need to feed it the list of files.

I would however also question why you have loads of files but don't know what they contain?

 

Kurt_Bremser
Super User
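%let path=path_to_your_files;

/* wc -l prints the line count first, then the file name */
filename oscmd pipe "cd &path.;wc -l *.txt";

data want;
length filename $30 lines 8;
infile oscmd;
input lines filename;
/* drop the summary line that wc adds when there are several files */
if filename ne 'total';
run;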

Of course, this needs XCMD enabled; that is one of the many reasons I consider disabling XCMD stupid.

Without XCMD, you can use a wildcard in the infile statement of a data step:

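data want (keep=filename lines);
length filnam filename $200;
/* filnam holds the name of the file currently being read; it is not written to the output */
infile "&path./*.txt" filename=filnam end=done;
retain filename;
input;
if filnam ne filename
then do;
  /* a new file has started: write the finished count for the previous one */
  if filename ne " " then output;
  lines = 0;
  filename = filnam;
end;
lines + 1;
/* the last file has no successor, so write it at end of input */
if done then output;
run;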

 

See the first of these two examples as an application of Maxims 14 and 15.

Oligolas
Barite | Level 11

Hi,

 

my solution isn't that elegant 😉

filename location "C:\TEMP";
data files;
   length name $250 nbRec 8;
   drop rc did i;
   did=dopen("location");
   if did > 0 then do;
      do i=1 to dnum(did);
         name=pathname('location')||'\'||dread(did,i);
         if scan(name,-1,'.') eq 'txt' then output;
      end;
      rc=dclose(did);
   end;
   else put 'Could not open directory';
run;

data _NULL_;
   set files;
   call execute('
      data _null_;
      infile "'||strip(name)||'" end=eof;
      input;
      if eof then call execute("
         proc sql;
            update files set nbRec="||put(_N_,best32.)||"
            where name eq ""'||strip(name)||'"";
         quit;
      ");
      run;
   ');
run;

Cheers

 


Tom
Super User

If you are running on Unix then use the wc command.

But you can use just a simple data step if you want.

Play with this code. Figuring out when SAS sets the EOV flag can be tricky, so test it with some one-record files to make sure it works.

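data want;
  length fname prev $255 filename $100;
  retain prev;
  keep filename n;
  infile '*.txt' filename=fname end=eof eov=eov;
  input ;
  /* EOV is set on the first record of the NEXT file, when fname already names that file */
  if eov then do;
   filename=scan(prev,-1,'/');
   output;
   n=0;
   eov=0;
  end;
  n+1;
  prev=fname;
  /* the last file is never followed by an EOV, so write it at end of input */
  if eof then do;
   filename=scan(fname,-1,'/');
   output;
  end;
run;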
ballardw
Super User

There are a slew of SAS functions for interacting with external files, starting with DOPEN to open an identified directory; DINFO, DOPTNUM and DOPTNAME to return information about a directory; and then the file functions. If the operating system returns the number of records, then one of the items associated with FOPTNAME and FOPTNUM will have it.

I don't work on unix and would not presume to attempt to guess with different flavors of unix which option names you would need to access. The online help starting with DOPEN should lead you to a solution. Note that the examples will tend to show a macro and a data step approach. I recommend staying with the data step unless you are very comfortable with macro language.

Don't forget to FCLOSE and DCLOSE each file or directory you open.

Kurt_Bremser
Super User

@ballardw wrote:

 

I don't work on unix and would not presume to attempt to guess with different flavors of unix which option names you would need to access.


While this is true for quite a lot of utilities used especially in the "commercial" UNIXen (AIX, HP-UX, Solaris), Linux stays with the GNU utilities, so the syntax of commandline programs is the same across different distributions. Notably IBM has made it a point to make AIX more Linux-compatible with every release from 4.3 on.

And utilities like the wordcount (wc) are so old (and have not changed their options for decades) that my example will work on all UNIX platforms. One can even get those utilities for Windows, increasing its usability.
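
For reference, `wc -l` prints the line count first and the file name second, and adds a final `total` line when more than one file is given. A small Python sketch of parsing that output follows; the function name `parse_wc_output` and the sample text are illustrative only and not taken from a real run. As with the SAS filter on 'total', a file that is really named `total` would be dropped.

```python
def parse_wc_output(text):
    """Parse `wc -l` output into (filename, lines) pairs, dropping the total line."""
    pairs = []
    for line in text.splitlines():
        # The count comes first; everything after it is the file name.
        parts = line.split(None, 1)
        if len(parts) != 2:
            continue
        count, name = parts
        if name == "total":
            continue
        pairs.append((name, int(count)))
    return pairs


sample = "  3 a.txt\n 12 b.txt\n 15 total\n"
print(parse_wc_output(sample))  # [('a.txt', 3), ('b.txt', 12)]
```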
