Hi,
In a previous question I asked how to concatenate files, and from the answers I realized that it can be done directly from the command prompt with: type "file1" "file2" ... "filen" > "allfiles"
But suppose that I want to concatenate files based on their names. For example, if I have the files f1, f2, fh1 and fh2, is it possible to dynamically concatenate all the f's together, all the fh's together, etc.?
Thank you!
Do you care about the order in which the files are concatenated? Are you just interested in concatenating the raw source files, or are you actually interested in generating a concatenated DATASET?
If you just want to use SAS to copy the raw files together and you can use a simple (single wildcard) pattern to match the file names, then a simple data step will do.
data _null_;
infile 'fh*.txt' ;
file 'all_fh.txt' ;
input;
put _infile_;
run;
If it is more complex then make a dataset with the list of filenames and then use that to drive the step that reads and copies the files.
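data filelist ;
  /* Read the matching file names, one per line, from the Windows dir command */
  infile 'dir /b f*.txt' pipe truncover ;
  input filename $256. ;
  /* Test 'fh' first, because every fh file name also starts with 'f' */
  if filename =: 'fh' then do;
    target = 'fh_files.txt';
    /* Numeric suffix after the prefix, used to order the files */
    number = input(substr(scan(filename,1,'.'),3),32.);
    output;
  end;
  else if filename =: 'f' then do;
    target = 'f_files.txt';
    number = input(substr(scan(filename,1,'.'),2),32.);
    output;
  end;
run;
/* Group the files by target and put them in numeric order */
proc sort ; by target number ; run;
data _null_;
  set filelist ;
  /* 'source' and 'target' are placeholder filerefs; FILEVAR= supplies the actual file names */
  infile source filevar=filename end=eof ;
  file target filevar=target ;
  /* Copy every line of the current input file to its target file */
  do while (not eof);
    input;
    put _infile_;
  end;
run;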
How about "type fh*.* > allfhfiles". But don't you have a SAS question? Those are more fun.
- Jan.
Hi Jan,
Actually it is a very SASy question, because I realize that I will need some sort of macro, but I can't figure out how to write it and would greatly appreciate a first push!!
thanks!
Hi @ilikesas, no problem. I have done many file and directory manipulations in my life. What proved the most efficient, reliable and auditable (a big thing in the branches I work in) is the following:
Many macros exist on the web that create directory listings. The gist of my suggestion is using the powerful FILEVAR option to cycle through a list of files. TS-DOC 581 gives a good idea of the possibilities.
Hope this helps,
- Jan.
@ilikesas How is it a SAS question? You can wildcard the command and execute it from SAS, I suppose? It seems very OS related, with the exception of calling the command from SAS.
If you're concatenating so you can read the files via SAS that's not required since you can use the FILEVAR option in an infile statement.
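A minimal sketch of that idea, assuming Windows, that the XCMD option is enabled so the pipe works, and that the files sit in the current directory. The dataset names all_lines and filelist and the variables line and source_file are only illustrative, and each record is kept as one raw text line because the real file layout is not known here:

```sas
/* Build the list of files to read (same technique as the dir /b listing) */
data filelist;
  infile 'dir /b fh*.txt' pipe truncover;
  input filename $256.;
run;

/* Read every listed file into one dataset without concatenating on disk */
data all_lines;
  set filelist;
  /* Keep a copy of the name: the FILEVAR= variable itself is not written out */
  source_file = filename;
  /* 'dummy' is only a placeholder fileref; FILEVAR= names the file to open */
  infile dummy filevar=filename end=eof truncover;
  do while (not eof);
    input line $char256.;   /* assumed: lines of at most 256 characters */
    output;
  end;
run;
```

Replacing the single line variable with a real INPUT statement for the file layout would give a proper concatenated dataset.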
Hi Reeza,
It's true that just concatenating the files is purely OS (you actually showed me how to do it in my previous question!)
But here I need to concatenate dynamically based on file names, and I guess that SAS is the program that manages which files get concatenated together. Personally, it is the only way I can think of, but again I am a relative beginner and my knowledge is somewhat limited...
Thanks!
data filelist ;
infile 'dir /b *.txt' pipe truncover ;
input filename $256. ;
run;
The snippet above (from @Tom) creates the file list dataset.
His code is correct, but you may want to start from the above to understand what's going on.
Hi @ilikesas,
If this is a recurring task, you could write a SAS macro which takes the common prefix (e.g. fh), the input and output folder names, the name of the output file and possibly instructions regarding the numbering as parameters. Then you could call the macro once for each set of files to be concatenated and it would build and execute the appropriate X statement.
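A rough sketch of such a macro, assuming Windows, the XCMD option enabled, and .txt files. The macro name concat_files and its parameters prefix, indir and outfile are hypothetical, and the numbering instructions are left out:

```sas
%macro concat_files(prefix=, indir=, outfile=);
  /* Build and run: type "<indir>\<prefix>*.txt" > "<outfile>"          */
  /* Doubled double quotes inside the string become single double quotes */
  x "type ""&indir\&prefix.*.txt"" > ""&outfile""";
%mend concat_files;

/* One call per set of files; the paths here are illustrative only */
%concat_files(prefix=fh, indir=C:\files, outfile=C:\files\all_fh.txt);
```

Two caveats with this simple wildcard version: prefix=f would also pick up the fh files, and the output name must not match the input pattern itself. When the split has to be exact or the order matters, the file list dataset approach is the safer route.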
Hi,
Do I first have to import the names of the files? I know how to import files into SAS, but here I don't need to import the actual files; I am using SAS as an intermediary sorter. It's just that I have difficulty starting and would greatly appreciate it if you could give me some simple code as a hint.
Thanks!
Hi Reeza,
here is the code that I did by modifying Tom's code and I ALMOST got what I wanted:
data filelist ;
infile 'dir C:\files\ /b *.txt' pipe truncover ;
input filename $256. ;
directory = 'C:\files\';
file_path=directory || filename; /*create the pathname */
name=substr(filename, 1, length(filename)-4); /*delete the .txt*/
file_number = compress(name,'','A'); /*extract the file number*/
unique_name = compress(name, file_number); /*extract the unique name*/
target_extension= '_all.txt';
target = directory || unique_name || target_extension;
run;
data _null_;
set filelist ;
infile source filevar=file_path end=eof ;
file target filevar=target ;
do while (not eof);
input;
put _infile_;
end;
run;
The only minor problem is that when I execute the code I get my 2 files (one concatenating f1 and f2, and one concatenating fh1 and fh2), but they are not in txt format: to open them I actually need to choose the program to open them with.
It's probably related to:
target_extension= '_all.txt';
target = directory || unique_name || target_extension;
because when I looked at my SAS table "filelist" the target_extension is not fully concatenated to the other part, and I get something like:
C:\files\f
_all.txt
for the variable "target"
Thanks!
Don't use the double pipe for concatenation; use the CATT or CATS function instead. For one, they remove the extra spaces (CATT strips trailing blanks, CATS strips leading and trailing blanks), and for another, they handle conversion from numeric to character, so you can avoid explicitly converting types.
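A small sketch of the difference, using the values from the question. The names target_pipe and target_cats are only for illustration, and unique_name is given a length of 256 to mimic a variable derived from the $256 filename:

```sas
data _null_;
  length unique_name $256 target_pipe target_cats $300;
  directory = 'C:\files\';
  unique_name = 'f';                 /* stored padded with blanks to 256 characters */
  target_extension = '_all.txt';
  /* || keeps the padding, so the blanks end up in the middle of the path */
  target_pipe = directory || unique_name || target_extension;
  /* CATS strips leading and trailing blanks from each argument first */
  target_cats = cats(directory, unique_name, target_extension);
  put target_pipe= / target_cats=;
run;
```

In the log, target_pipe should show a long run of blanks between the f and _all.txt, which matches the broken value seen in the filelist table, while target_cats has none.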