Folks,
I have 13 individual csv files which in total come to nearly 40GB in size. I'm looking for tips in terms of getting efficiencies in reading them to SAS.
Originally I had code which would list the 13 csv files and their location and a macro which would cycle through each one and use proc import to read them in and output them to as a SAS file.
However, I'm wondering is it best to this or to use an infile statement as I know the position of all the columns and their lengths.
My macro is handy as it uses guessing rows = max but I imagine with such large files now it will take ages to run.
Perhaps an infile method would be better?
INFILE is definitely more efficient, especially since you know the schema for these files. And if all files share the same layout, you can do this in a single DATA step. See this article for how to read multiple text files at once.
Furthermore, you can zip that CSV files to save some space and than use the "INFILE ZIP" to read directly from the zipped file.
Bart
Is the list of files static? Perhaps just create a fileref that points to all of the files and read from that.
If you need to skip header rows test for when you are starting a new file.
filename all
('/dir/file1.csv'
'/dir/file2.csv'
...
'/dir/file13.csv'
);
data want;
length fname $256 ;
infile all dsd truncover filename=fname;
input @;
if fname ne lag(fname) then input;
* code to define variables and input a record ;
run;
Or it you just want to read all of the files whose names follow a simple pattern then use a * wildcard in the physical filename in the INFILE statement.
data want;
length fname $256 ;
infile '/dir/file*.csv' dsd truncover filename=fname;
input @;
if fname ne lag(fname) then input;
* code to define variables and input a record ;
run;
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.