BookmarkSubscribeRSS Feed
JVargas
Fluorite | Level 6

How can I import a raw csv file?

5 REPLIES 5
CodingDiSASter
Fluorite | Level 6

To import a raw CSV file to SAS, you can use a proc import function:

 

Proc Import

Datafile = "File Directory/File.csv"

Out = library.dataset

DBMS = csv

Replace;

run;

 

The datafile section needs to include the file path all the way to the file name while the out section lets you name a library for your dataset or create a temporary dataset. the 'replace' option is not necessary but if you'll need it if you want to overwrite a dataset with the same name as the dataset in the 'out' statement. 

 

Also, if you change the proc import, and re-run it or re-run it to show someone; the replace statement will prevent errors when you decide to re-run the code.

ballardw
Super User

Additional suggestion: Add the statement

Guessingrows=max;

Or if you file has more than 32000 rows maybe just 32000 instead of "max".

This will have the procedure examine more rows before setting variable properties like length of character values or type of variable.

 

By default Proc Import will only look at 20 rows to guess the type and may yield some undesirable results. If you have a sparse value, one that does not appear frequently, it is quite likely to get set as a one character value which may not be what you want.

 

Additionally, if you are going to read multiple files that should be of the same type you can be better off in the long run by writing/ using data step code to read them. Proc Import will generate data step code that you can copy from the log and paste into your editor and then modify after looking at your values. If you have values such as account numbers that might look like 00312455 then Import will assume that is a numeric value of 312455. If you want the leading zeroes you now have a minor headache. However you could modify the data step code by changing the Informat to $10. for example instead of the BEST32. likely used to get a character value with the leading 00.

 

The advantage of the data step is if you have multiple files you only need to change the Infile to point to the new file to read and change the output data set name. If you later need to combine the data then all the variables of the same names will have the same properties. Proc Import won't always allow that.

andreas_lds
Jade | Level 19

@JVargas wrote:

How can I import a raw csv file?


The best way: write a data step using the statements infile, length and input, maybe you need informat and format, too, this depends an the data you have. Types and length of the variables are defined by the description of the data you got, so that should not be a problem. The documentation of the mentioned statements contains examples, showing how to use them. You could, of course, start with proc import, as suggested by @CodingDiSASter, but i can't recommend using that procedure in production-code: i guesses types and lengths, almost always resulting in problems that could be avoided by using a data step.

morganmetzger
Calcite | Level 5

You can import CSV files by using an INFILE statement.

Ex: 

DATA Work.library;

INFILE "your-path-to-file.csv" DELIMITER = ',';

INPUT variables

RUN;

 

Since CSV means comma separated variables you use ',' as your delimiter. But technically the delimiter can be any character(s) or space(s).

Tom
Super User Tom
Super User

@morganmetzger wrote:

You can import CSV files by using an INFILE statement.

Ex: 

DATA Work.library;

INFILE "your-path-to-file.csv" DELIMITER = ',';

INPUT variables

RUN;

 

Since CSV means comma separated variables you use ',' as your delimiter. But technically the delimiter can be any character(s) or space(s).


The default behavior when you just change the DLM= option is that multiple delimiters are treated the same as one.  That is useful when the default delimiter is a space as usually people use extra spaces to make the fields line up vertically.  But in a CSV file that is not what you want.  In a CSV file two adjacent commas means the value for that field in the middle is empty.

 

Best to use the DSD option (which will imply that comma is the delimiter) because then both adjacent delimiters and quoted values are handled properly. 

SAS Innovate 2025: Register Now

Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!

Mastering the WHERE Clause in PROC SQL

SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 5 replies
  • 939 views
  • 1 like
  • 6 in conversation