ilikesas
Barite | Level 11

Hi,

In a previous question I asked how to concatenate files, and from the answers I realized that it can be done directly at the command prompt with: type "file1" "file2" ... "filen" > "allfiles"

 

But suppose that I want to concatenate files based on their names. For example, if I have the files f1, f2, fh1 and fh2, is it possible to dynamically concatenate all the f's together, all the fh's together, and so on?

 

Thank you!

 

ACCEPTED SOLUTION

Tom
Super User

Do you care about the order in which the files are concatenated?  Are you just interested in concatenating the raw source files, or are you actually interested in generating a concatenated DATASET?

 

If you just want to use SAS to copy the raw files together, and you can use a simple (single wildcard) pattern to match the file names, then a simple data step will do.

data _null_;
   infile 'fh*.txt' ;
   file 'all_fh.txt' ;
   input;
   put _infile_;
run;

If it is more complex then make a dataset with the list of filenames and then use that to drive the step that reads and copies the files.

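data filelist ;
   /* Read the bare directory listing (one file name per line) */
   infile 'dir /b f*.txt' pipe truncover ;
   input filename $256. ;
   /* Test the longer prefix first: names starting with fh also start with f */
   if filename =: 'fh' then do; 
     target = 'fh_files.txt';
     /* Numeric suffix after the prefix, used to order the files */
     number = input(substr(scan(filename,1,'.'),3),32.);
     output;
   end;
   else if filename =: 'f' then do; 
     target = 'f_files.txt';
     number = input(substr(scan(filename,1,'.'),2),32.);
     output;
   end;
run;
proc sort data=filelist ; by target number ; run;
data _null_;
   set filelist ;
   /* SOURCE and TARGET are placeholder filerefs; FILEVAR= supplies the real paths */
   infile source filevar=filename end=eof ;
   file target filevar=target ;
   /* Copy every line of the current input file to its target file */
   do while (not eof);
     input;
     put _infile_;
   end;
run;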

   

 


12 REPLIES
jklaverstijn
Rhodochrosite | Level 12

How about "type fh*.* > allfhfiles". But don't you have a SAS question? Those are more fun.

 

- Jan.

ilikesas
Barite | Level 11

Hi Jan,

 

Actually it is a very SASy question, because I realize that I will need some sort of macro here, but I can't figure out how to write it and would greatly appreciate a first push!

 

 

thanks!

jklaverstijn
Rhodochrosite | Level 12

Hi @ilikesas, no problem. I have done many a file and directory manipulation in my life. What proved the most efficient, reliable and auditable (a big thing in the branches I work in) is the following:

 

  • Design a dataset that keeps track of files and their properties, plus their status in the process (new, done, error, ...) with corresponding timestamps etc.
  • Run a macro that generates a directory listing. Update the above dataset and determine which files need work (status new) based on whatever criteria you desire.
  • In a data step you can, e.g., read files with names like fh* by using
    • a SET statement on the above dataset with WHERE status='new'
    • the INFILE/FILE statements with the FILEVAR= option, together with INPUT/PUT, to read and write the selected files. This way you get a data-driven process of reading and writing files in a single data step. You can also use CALL EXECUTE on each individual file to run macros using the name as a parameter.
    • The data step creates a table of files processed.
  • Use the generated dataset to update the dataset that keeps track of work done. Update status to whatever is next.
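A minimal sketch of the workflow in the list above, under stated assumptions: the tracking table FILE_STATUS, its columns, the PROCESSED table and the file paths are all hypothetical names invented for illustration, and the listed input files are assumed to exist. It is one possible reading of the steps, not Jan's actual code.

```sas
/* Hypothetical tracking table: one row per file with a status column */
data file_status;
   length filename target $256 status $8;
   infile datalines dsd;
   input filename target status;
   datalines;
C:\files\fh1.txt,C:\files\out\fh_all.txt,new
C:\files\fh2.txt,C:\files\out\fh_all.txt,new
C:\files\f1.txt,C:\files\out\f_all.txt,done
;

/* Copy every file still marked new and record what was processed */
data processed(keep=source_file processed_dt);
   set file_status;
   where status = 'new';
   length source_file $256;
   source_file = filename;          /* FILEVAR= variables are not written out */
   infile in_dummy filevar=filename end=eof;
   file out_dummy filevar=target mod;   /* MOD appends instead of overwriting */
   do while (not eof);
      input;
      put _infile_;
   end;
   processed_dt = datetime();
   format processed_dt datetime20.;
run;

/* Mark the processed files as done in the tracking table */
proc sql;
   update file_status
      set status = 'done'
      where filename in (select source_file from processed);
quit;
```

MOD is used here on the assumption that later runs should append newly arrived files to an existing combined file; drop it if each run should rebuild the output from scratch.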

Many macros exist on the web that create directory listings. The gist of my suggestion is using the powerful FILEVAR option to cycle through a list of files. TS-DOC 581 gives a good idea of the possibilities.

 

Hope this helps,

- Jan.

Reeza
Super User

@ilikesas How is it a SAS question? You can wildcard the command and execute it from SAS, I suppose? It seems very OS related, with the exception of calling the command from SAS.

 

If you're concatenating so you can read the files via SAS that's not required since you can use the FILEVAR option in an infile statement.
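As a rough sketch of that point, the step below reads every listed file straight into one dataset with FILEVAR=, without building a combined text file first. It assumes a FILELIST dataset with a character variable FILENAME holding one readable path per row; COMBINED, SOURCE_FILE, LINE and the 1000-character line width are illustrative choices, not anything specified in this thread.

```sas
/* Assumes FILELIST already exists with a character variable FILENAME
   holding one readable file path per row (names are illustrative). */
data combined;
   set filelist;
   length source_file $256 line $1000;
   source_file = filename;   /* keep a copy: the FILEVAR= variable is not written out */
   infile in_dummy filevar=filename end=eof truncover;
   do while (not eof);
      input line $char1000.; /* lines longer than 1000 characters are truncated */
      output;                /* one row per line read */
   end;
run;
```

In real use the single LINE variable would be replaced by an INPUT statement that matches the layout of the files.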

ilikesas
Barite | Level 11

Hi Reeza,

 

It's true that just concatenating the files is purely OS (you actually showed me how to do it in my previous question!)

 

But here I need to concatenate dynamically based on file names, and I guess that SAS is the program that manages which files get concatenated together. (Personally, it is the only way I can think of, but I am a relative beginner and my knowledge is somewhat limited...)

 

Thanks! 

Reeza
Super User
data filelist ;
   infile 'dir /b *.txt' pipe truncover ;
   input filename $256. ;
run;

The snippet above (from @Tom) creates the file list dataset.

His code is correct, but you may want to start from the above to understand what's going on.

FreelanceReinh
Jade | Level 19

Hi @ilikesas,

 

If this is a recurring task, you could write a SAS macro which takes the common prefix (e.g. fh), the input and output folder names, the name of the output file and possibly instructions regarding the numbering as parameters. Then you could call the macro once for each set of files to be concatenated and it would build and execute the appropriate X statement.
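A bare-bones sketch of such a macro, assuming Windows, a session where the X statement is permitted (XCMD), and an output folder that already exists. The macro name, its parameters and the paths are hypothetical, and numbering instructions are left out.

```sas
/* Hypothetical macro; the name and parameters are illustrative only. */
%macro concat_files(prefix=, indir=, outfile=);
   /* Builds a command such as:
      type "C:\files\fh*.txt" > "C:\files\out\fh_all.txt"  */
   x "type ""&indir\&prefix*.txt"" > ""&outfile""";
%mend concat_files;

%concat_files(prefix=fh, indir=C:\files, outfile=C:\files\out\fh_all.txt)
```

Two caveats with this simple pattern: a prefix of f would also match the fh files, since f*.txt covers both, and the output file should live outside the input folder (or not match the pattern) so that it is not picked up as input.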

ilikesas
Barite | Level 11

Hi,

 

Do I first have to import the names of the files? I know how to import files into SAS, but here I don't need to import the actual files; I use SAS as an intermediary sorter. It's just that I have difficulty getting started and would greatly appreciate it if you could give me some simple code as a hint.

 

 

Thanks!

Reeza
Super User
Wildcards are valid in both SAS and the OS command, as someone indicated.

Use the SCAN function within Tom's code to extract just the filename.
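A small illustration of SCAN for this purpose, using a made-up path. A negative count makes SCAN work from the right, which is a handy way to pull the last piece off a full path.

```sas
data _null_;
   length path $260 filename base $256;
   path     = 'C:\files\fh12.txt';     /* hypothetical full path */
   filename = scan(path, -1, '\');     /* last \-delimited piece: fh12.txt */
   base     = scan(filename, 1, '.');  /* piece before the first dot: fh12 */
   put filename= base=;
run;
```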
ilikesas
Barite | Level 11

Hi Reeza,

 

here is the code that I did by modifying Tom's code and I ALMOST got what I wanted:

data filelist ;
   infile 'dir C:\files\ /b *.txt' pipe truncover ;
   input filename $256. ;
directory = 'C:\files\';
file_path=directory || filename; /*create the pathname */
name=substr(filename, 1, length(filename)-4); /*delete the .txt*/
file_number = compress(name,'','A'); /*extract the file number*/
unique_name = compress(name, file_number); /*extract the unique name*/
target_extension= '_all.txt';
target = directory || unique_name || target_extension;
run;


data _null_;
   set filelist ;
   infile source filevar=file_path end=eof ;
   file target filevar=target ;
   do while (not eof);
     input;
     put _infile_;
  end;
run;

The only minor problem is that when I execute the code I get my 2 files (one concatenating f1 and f2, and one concatenating fh1 and fh2), but they are not in txt format: to open them I actually need to choose the program to open them with.

 

It's probably related to:

target_extension= '_all.txt';
target = directory || unique_name || target_extension;

because when I looked at my SAS table "filelist", the target_extension is not joined directly to the other part, and I get something like:

C:\files\f

 

             _all.txt

 

for the variable "target"

Thanks!

Reeza
Super User

Don't use the double pipe (||) for concatenation; use the CATT or CATS function instead. For one, they remove the extra spaces (CATT trims trailing blanks, CATS strips both leading and trailing blanks), and for another, they convert numeric values to character for you, so you can avoid explicit type conversions.
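To make that concrete, here is a small self-contained sketch that uses in-stream data instead of the directory listing. The dataset name FILELIST_DEMO and the variable PADDED are made up for the comparison; the other names follow the code posted above.

```sas
data filelist_demo;
   length filename $256 target padded $300;
   input filename;
   directory   = 'C:\files\';
   name        = scan(filename, 1, '.');   /* drop the .txt extension */
   unique_name = compress(name, , 'd');    /* remove the digits, keep the prefix */
   /* || keeps the trailing blanks that pad UNIQUE_NAME to its full length */
   padded = directory || unique_name || '_all.txt';
   /* CATS strips leading and trailing blanks from each piece before joining */
   target = cats(directory, unique_name, '_all.txt');
   put target=;
   datalines;
f1.txt
f2.txt
fh1.txt
fh2.txt
;
```

The log should show target=C:\files\f_all.txt for the first two rows and target=C:\files\fh_all.txt for the last two, while PADDED still has the gap before _all.txt.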
