files to extract

Super Contributor
Posts: 371

files to extract

I have two kinds of files.

- first type: it contains the word toto

- second type: it contains both toto and nono.

I want to extract only:

- the files that contain only toto

- only the part that contains toto from the mixed files (those that contain both toto and nono).

Is it possible to do that quickly in SAS?

What about combining SAS and Unix for that?

Thank you

Super User
Posts: 6,372

Re: files to extract

Can you clarify what you mean?

What do you mean by "FILE"?  Are you talking about a text file on a disk somewhere? A SAS dataset?

What do you mean by "contains the word: toto"?  Is it that the file's name contains those four characters?  Or does it mean that the text file contains that word somewhere in it? Or, if you are talking about a data set, which variables in the data set do you want to search for these characters?

What do you mean by "EXTRACT"?  Normally that means to copy out a subset of the data into a new location.  Do you want to copy some of the files? Or some lines from the text files? Or some observations from data sets?

Super Contributor
Posts: 371

Re: files to extract

Thank you for your message. Sorry, I will try to be clearer.

Are you talking about a text file on a disk somewhere?

Yes, it is a text file (.txt or .log), not a SAS file.

What do you mean by "contains the word: toto"?  Is it that the file's name contains those four characters?  Or does it mean that the text files contains that word somewhere in it?

The text file contains the word toto.

What do you mean by "EXTRACT"?

I want to copy the files that contain the word toto to a new directory.

Example:

In my directory, I have many text files:

D/f1.txt

/f2_xx-yy-tt.log

//fk_xx-yy-tt.log

.....

/fn.txt

I have two types of files:

> a text file (fi.txt) that contains the word toto but not the word nono.

file: fi.txt

yyyyyyyyyy toto: uiiii  jjj mmm ooo

uuuuuuuuu   toto: hhh oo hhhhh vvv

......................................

rrrrrrrrrrrrrrr  toto: ppp iiii    llllll mm

> a text file (fj.txt) that contains both toto and nono.

file: fj.txt

yyyyyyyyyy toto : uiiii  jjj mmm ooo

uuuuuuuuu  nono: hhh oo hhhhh vvv

......................................

llllllllllllllllllll  toto: ppp iiii    llllll mm

What do I need?

1) Keep only the files whose name contains fk_xx-yy-tt; this gives me a group of files: gr_files

2) From this group, check whether the text of each file contains the word toto.

   a) If the file contains only the word toto, like this:

yyyyyyyyyy toto: uiiii  jjj mmm ooo

I want to keep only the text: uiiii  jjj mmm ooo

   b) If the file contains both toto and nono, like this:

yyyyyyyyyy toto : uiiii  jjj mmm ooo

uuuuuuuuu  nono: hhh oo hhhhh vvv

First, I want to keep only the toto line: yyyyyyyyyy toto : uiiii  jjj mmm ooo. Then, from that line, I will keep: uiiii  jjj mmm ooo

In the end, I will have a file like this:

uiiii  jjj mmm ooo

3) I will copy the new files to a new directory, new_dir:

new_dir/ fk_xx-yy-tt.txt

fu_xx-yy-tt.txt

fs_xx-yy-tt.txt

fw_xx-yy-tt.txt

......

I am thinking of using the SAS INFILE statement, but could it take a long time for so many files? Why not combine SAS and Unix: grep -l?
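For what it's worth, steps 1) to 3) above could be sketched in plain Unix shell roughly as follows. This is only a sketch under assumptions: the function name extract_toto and its three arguments are made up for illustration, the file-name filter is passed as a shell glob, and the marker is taken to be "toto" followed by optional spaces and a colon, as in the sample lines.

```shell
# Hypothetical helper (not from the thread): extract_toto SRC_DIR NAME_GLOB OUT_DIR
# For every file in SRC_DIR whose name matches NAME_GLOB, write the text that
# follows "toto:" (or "toto :") on each such line to OUT_DIR/<name>.txt.
# Lines without that marker (for example the nono lines) are dropped.
extract_toto() {
    src_dir=$1
    name_glob=$2
    out_dir=$3
    mkdir -p "$out_dir"
    # $name_glob is left unquoted on purpose so the shell expands it.
    for f in "$src_dir"/$name_glob; do
        [ -f "$f" ] || continue                              # glob matched nothing
        grep -q 'toto[[:space:]]*:' "$f" || continue         # no toto marker in this file
        base=$(basename "$f")
        # sed -n with the p flag prints only the lines where the substitution happened.
        sed -n 's/^.*toto[[:space:]]*:[[:space:]]*//p' "$f" > "$out_dir/${base%.*}.txt"
    done
}
```

A call such as extract_toto D '*_xx-yy-tt.log' new_dir would then leave one .txt file per matching log in new_dir. Files that never mention toto produce no output file.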

Super User
Posts: 6,372

Re: files to extract

It is probably much easier to code it using GREP.  If you are running on Windows, there are versions of grep that you can get.

1) Get a list of the files that contain 'toto' (case-insensitive).

2) Get a list of the files that contain 'nono'.

3) Merge the lists.

4) Generate commands to copy the files.

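%let sdir=/tmp/source ;
%let tdir=/tmp/target ;

* List the files that contain the word toto (-i ignore case, -w whole word, -l names only) ;
data toto ;
  infile "cd &sdir; grep -i -w -l 'toto' *.txt" pipe truncover;
  input fname $256. ;
  toto=1;
run;

* List the files that contain the word nono ;
data nono ;
  infile "cd &sdir; grep -i -w -l 'nono' *.txt" pipe truncover;
  input fname $256. ;
  nono=1;
run;

* Combine the two lists into one row per file with the flags toto and nono ;
data both ;
  merge toto nono;
  by fname ;
run;

* Copy the files that contain both words ;
* Use WHERE TOTO AND NOT NONO instead to select the files with toto only ;
data _null_;
  set both ;
  where toto and nono;
  length cmd $500 ;
  cmd = catx(' ','cp -p',"&sdir/"||fname,"&tdir/"||fname);
  * Run the copy command through a pipe and echo any output to the log ;
  infile cp pipe filevar=cmd end=eof;
  do while(not eof);
    input;
    put _infile_;
  end;
run;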

Super Contributor
Posts: 371

Re: files to extract

Thank you for your answer.

Super Contributor
Posts: 371

Re: files to extract

Hello Tom,

Suppose we want to extract the files that satisfy two conditions:

the file name contains xxx_vv_bb and the file text contains the word sheet.

Is it possible to use the Unix commands find and grep together in SAS, in the same line of code? If yes, how?

Is that the quickest way?

Thank you

Grand Advisor
Posts: 9,593

Re: files to extract

Use the OS command LS or DIR to get the list of file names,

or use FILENAME + PIPE.
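On the Unix side, find and grep can be combined in a single command, and a command of this kind could presumably be used as the quoted string of an INFILE ... PIPE statement like the one shown earlier in the thread. A minimal sketch, assuming a hypothetical wrapper named find_named_with_word whose arguments are the directory, the file-name fragment and the word to look for:

```shell
# Hypothetical helper: find_named_with_word DIR NAME_FRAGMENT WORD
# Prints the path of every regular file under DIR whose name contains
# NAME_FRAGMENT and whose text contains WORD as a whole word.
find_named_with_word() {
    # -type f        regular files only
    # -name "*..*"   filter on the file name (shell glob)
    # -exec ... {} + run grep on batches of the files found
    # grep -l -w     print only the names of files that contain the whole word
    find "$1" -type f -name "*$2*" -exec grep -l -w -- "$3" {} +
}
```

For the question above, the call would look like find_named_with_word /tmp/source xxx_vv_bb sheet, where /tmp/source stands in for the real directory. Because find filters on the name first, grep only opens the files that pass the name test.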

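data toto toto_nono;
input fvar $80.;                                   /* path of the next file, read from CARDS */
length filename fname $ 80;
infile dummy filevar=fvar filename=fname end=last; /* open the file named in FVAR */
filename=fname;
do while(not last);                                /* scan every line of that file */
 input;
 /* blank and colon are treated as word delimiters so that "toto:" is found */
 if indexw(_infile_,'toto',' :') then toto=1;
 if indexw(_infile_,'nono',' :') then nono=1;
end;
if toto and nono then output toto_nono;            /* file contains both words */
 else if toto then output toto;                    /* file contains toto only */
keep filename;
cards;
c:\temp\x.txt
c:\temp\y.log
;
run;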

Xia Keshan

Super Contributor
Posts: 371

Re: files to extract

Thank you for your answer.
