Obsidian | Level 7


I know how to check the number of records on a CSV file once it has been read in to SAS, but am wondering if there is a shortcut where I could get at the # of records on the file without having to read it in first.



Super User

Do you know how many rows of header information may be involved?

Does the end of the CSV file have "empty" rows consisting of nothing but commas (a frequent result of converting spreadsheets to CSV)?

Answers to these go to the question of "valid" observations versus lines of text in the file.
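To illustrate the difference between lines of text and "valid" observations, here is a rough shell sketch for a Unix-like system. The demo file, its contents, and the single header row are assumptions made up for the illustration; the pattern treats any line containing only commas and whitespace as empty.

```shell
# Hypothetical demo file: 1 header row, 2 data rows, 2 comma-only rows.
csv_file=$(mktemp)
printf 'id,name,score\n1,a,10\n2,b,20\n,,\n,,\n' > "$csv_file"

header_rows=1                                   # assumption: one header line

# Every line in the file, whether or not it carries data.
total_lines=$(wc -l < "$csv_file")
total_lines=$((total_lines + 0))                # strip padding some wc builds add

# Lines that are NOT made up solely of commas/whitespace
# (the whitespace class also covers a stray CR from CRLF files).
non_empty_lines=$(grep -cv '^[,[:space:]]*$' "$csv_file")

valid_rows=$((non_empty_lines - header_rows))

echo "total_lines=$total_lines non_empty_lines=$non_empty_lines valid_rows=$valid_rows"
rm -f "$csv_file"
```

The two counts only agree when the file has no header and no trailing filler rows, which is why the answers to the questions above matter.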


Can you tell us what you will do differently when you have that information?

Jade | Level 19

SAS treats CSV files essentially as a stream of data, with each obs separated by a line terminator (LF or CRLF), and typically terminated by an EOF (end of file indicator). But unlike a SAS dataset, a CSV file does not require inclusion of such metadata as the number of obs (rows). So there is no way for you to know with certainty how many records are in the CSV file without processing it.


There can be exceptions of course.  If you know that all records have the same length - and you know (or can determine) what that length is, you can tell SAS to ask the operating system for the size of the csv file in bytes, then divide by the fixed length, in bytes, of each record to get the record count.  This is an increasingly unlikely scenario. 
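Outside of SAS, the fixed-length arithmetic might look like the following shell sketch. The demo file and the record length of 10 bytes (9 characters plus a newline) are assumptions chosen only to make the division visible.

```shell
# Hypothetical demo file: 5 records, each exactly 10 bytes including the newline.
csv_file=$(mktemp)
printf 'aaa,bbb,1\naaa,bbb,2\naaa,bbb,3\naaa,bbb,4\naaa,bbb,5\n' > "$csv_file"

record_length=10                        # assumed bytes per record, terminator included

# Byte count of the file as reported by wc.
file_size=$(wc -c < "$csv_file")
file_size=$((file_size + 0))            # strip padding some wc builds add

# Record count = file size divided by the fixed record length.
record_count=$((file_size / record_length))

echo "file_size=$file_size record_count=$record_count"
rm -f "$csv_file"
```

On a CRLF file the record length would have to include two terminator bytes per record instead of one.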


Of course, if you can even make a good guess at the average record length, you can generate an equally good guess at the number of records using the CSV file size.  As an example:


%let csv_filename=c:\temp\export.csv;
%let expected_record_length=40;

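data info;
   drop rc fid close;
   /* Assign the fileref ABC to the CSV file and open it */
   rc=filename('abc', "&csv_filename");
   fid=fopen('abc');

   /* Ask the operating system for the file size, then close the file */
   filesize=input(finfo(fid,"File Size (bytes)"),best32.);
   close=fclose(fid);

   /* Estimated record count = file size / expected record length */
   expected_nrecs=floor(filesize/&expected_record_length);
   put filesize= expected_nrecs=;
run;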


The argument "File Size (bytes)" is apparently valid in both Windows and Unix - I successfully tested it in Windows.


And, as @ballardw notes, you can improve the estimate of record counts if you know how many header records are in the CSV file.


Tom | Super User


A CSV file is a text file with variable-length lines.  The only way to know how many lines are in the file is to read the whole file.


But you don't have to actually do anything with the lines, other than count them.

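data _null_;
  /* EOF is set once the last line has been read, so on the next */
  /* iteration _N_-1 is the number of lines in the file.         */
  if eof then call symputx('num_lines', _n_-1);
  infile 'myfile.csv' end=eof;
  input;
run;
%put Number of lines in myfile.csv is &num_lines..;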

And you don't have to use SAS.  For example, if you are running on Unix, just use the wc command with the -l option to count the number of lines in the file.  If the file has a header row, then the number of lines of actual data is one less than the number of lines.
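A small sketch of the wc approach, assuming a Unix-like shell. The demo file and the single header row are assumptions for illustration only.

```shell
# Hypothetical demo file: 1 header row followed by 3 data rows.
csv_file=$(mktemp)
printf 'id,name\n1,a\n2,b\n3,c\n' > "$csv_file"

# wc -l counts newline characters, i.e. terminated lines.
total_lines=$(wc -l < "$csv_file")
total_lines=$((total_lines + 0))        # strip padding some wc builds add

# Subtract the header row to get the number of data lines.
data_lines=$((total_lines - 1))

echo "total_lines=$total_lines data_lines=$data_lines"
rm -f "$csv_file"
```

Because wc -l counts newline characters, a final line without a trailing newline is not included in the count.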

Rhodochrosite | Level 12

Hi @Walternate 

If your SAS setup allows access to OS commands, then you can use this macro on either Windows or Linux:

%macro util_getFileLineCount(p_textFile=, p_rtrnMacVarName=, p_funkyString=\n);
	%LOCAL l_os l_firstObs;

	%let l_os = %substr(&sysscp,1,3);
	%let l_firstObs=1;

	/* Pipe the output of the OS line-count command into SAS */
	%if (&l_os EQ LIN) %then
	%do;
		FILENAME filesize pipe "wc -l &p_textFile";
	%end;
	%else /*(&l_os EQ WIN)*/
	%do;
		FILENAME filesize pipe "find /V /C ""&p_funkyString"" &p_textFile";
		%let l_firstObs=2;
	%end;

	DATA _NULL_;
		INFILE filesize FIRSTOBS=&l_firstObs END=eof;
		INPUT;
	/* wc prints the count first; find prints it after the last colon */
	%if (&l_os EQ LIN) %then
	%do;
		CALL SYMPUTX("&p_rtrnMacVarName",SCAN(_INFILE_,1,' '));
	%end;
	%else
	%do;
		CALL SYMPUTX("&p_rtrnMacVarName",SCAN(_INFILE_,-1,':'));
	%end;
	RUN;
%mend util_getFileLineCount;

%global g_lineCount;
%util_getFileLineCount(p_textFile=%str(<path/filename.type>), p_rtrnMacVarName=g_lineCount, p_funkyString=\n); /* Change \n to a different value that shouldn't be in the file */
%put &=g_lineCount;

Note: You can save the util_getFileLineCount macro into its own file for re-use.


Hope this helps


