Calcite | Level 5
I have a large CSV file. I only want to load 10,000 rows at a time. I don't know exactly how large it is, but it is over 4 million rows. I played around with PROC IMPORT, but I can't get it to work for a range of rows.
Super User Tom

You don't need to use PROC IMPORT to read a CSV file. CSV files are just plain text files, so you can write your own data step to read them.


You do know what is in the file, right? Or is someone expecting you to GUESS what it contains? That seems like a strange request for a file with that many rows.


If you are forced to guess how to read it you might want to use this macro instead.


In addition to working for some files that PROC IMPORT cannot handle, it also lets you use a random sample of the data rows, which makes guessing how to read the file faster.


It also writes cleaner data step code to read the file, and you can ask it to save the generated code to a file for you. That makes it easier to use the code as your starting point for reading only a few of the rows, if you really still need to do that.


To read only some of the observations from a text file use the FIRSTOBS= and OBS= options on the INFILE statement.

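data part1;
  /* firstobs=2 skips line 1 (assumed header); obs= is the last line number read */
  infile 'myfile.csv' dsd truncover firstobs=2 obs=10001;
  /* your INPUT statement goes here */
run;

data part2;
  infile 'myfile.csv' dsd truncover firstobs=10002 obs=20001;
  /* your INPUT statement goes here */
run;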


Meteorite | Level 14

If you want to use only SAS, then the approach by @Tom is the one to follow.
However, if you have access to a bash shell, the large file can be split into a number of smaller files of 10,000 lines each.
The command is as follows. Please test it first.

 split -l 10000 --numeric-suffixes input_filename output_prefix

You will get output files named output_prefix00, output_prefix01, and so on.
The next step is to write code that reads these files one at a time.
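
As a rough sketch of the split-then-read idea, assuming GNU coreutils and a CSV whose first line is a header: the function name split_csv, the file name myfile.csv and the prefix chunk_ are made up for illustration. Two things differ from the one-line command above. The header is dropped before splitting, so every chunk holds data rows only (otherwise only the first chunk would carry the header). A 3-digit suffix is requested, since over 4 million rows at 10,000 per file gives more than 400 chunks.

```shell
#!/usr/bin/env bash
# Sketch: split a CSV into fixed-size chunks of data rows.
# Assumes GNU coreutils; split_csv and the file names are hypothetical.
set -euo pipefail

split_csv() {
  local input=$1 rows=$2 prefix=$3
  # tail -n +2 drops line 1 (assumed to be the header).
  # -d uses numeric suffixes; -a 3 allows up to 1000 chunks (000..999).
  tail -n +2 "$input" | split -l "$rows" -d -a 3 - "$prefix"
}

# Example call (hypothetical file names):
#   split_csv myfile.csv 10000 chunk_
# Then handle the chunks one at a time, in order:
#   for f in chunk_*; do echo "reading $f"; done
```

Because the suffixes are zero-padded, the shell glob chunk_* lists the files in the same order as the rows of the original file. Each chunk can then be read with a data step that does not need FIRSTOBS=2, since no chunk has a header.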

Super User

PROC IMPORT is slow, and reading the file with a data step is quite easy. 4 million rows isn't much for SAS to process at all.


