Help using Base SAS procedures

Newlines in CSV file see to fail with PROC IMPORT

Reply
Occasional Contributor
Posts: 8

Newlines in CSV file see to fail with PROC IMPORT

I am running PROC IMPORT on a CSV file. My CSV file opens fine in MS Excel. However, the newline characters inside the quoted cells seem to cause PROC IMPORT to do the wrong thing. PROC IMPORT is seeing an end of line and assuming "next record".

Any suggestions?

Here is the code I am running
PROC IMPORT OUT= WORK.PAPER3
DATAFILE= "C:\Documents and Settings\jleveille\Desktop\paperauthor.csv"
DBMS=CSV REPLACE;
GETNAMES=YES;
DATAROW=2;
RUN;
Valued Guide
Posts: 2,111

Re: Newlines in CSV file see to fail with PROC IMPORT

Another reason to hate Excel for data! I've usually ended up going through the excel file and manually deleting the offending linefeeds. It would be nice if there were an easier way.
Occasional Contributor
Posts: 8

Re: Newlines in CSV file see to fail with PROC IMPORT

Uh, I think you have misunderstood me. The newlines are part of the data. They should be in there. No reason to hate Microsoft here. PROC IMPORT is doing the wrong thing. The quoted string does not end, but the record does because of the newline. The quoted field carrys over to the next record in the file. When I open the CSV in Microsoft it opens properly.

Here is an example of the CSV file

Name, Address
"Bob", "123 Sycamore Ln"
"Fred", "200 Main St
Apt 200"

Notice that the second address contains a line break. PROC IMPORT chokes on this.

As for the solution from Mike, I realize that I can do what you describe to open and save, however, this presents another problem. MS Excel assumes that the CSV is ASCII data which it isn't. It is a UTF-8 file. So opening CSV and saving to XLS causes transcoding of characters such as the registered trademark symbol. Ouch.
N/A
Posts: 0

Re: Newlines in CSV file see to fail with PROC IMPORT

it would be nice if the TERMSTR infile option were to take affect within the CSV file import.
Just by it's name the option value TERMSTR=CRLF implies that within a cell, the embedded newline character (just the LF i.e. 0Ax on its own) should not be treated as the TERMSTR. I understand that a defect has been recognised for SAS9.2.

Perhaps it will become easier to deal with this problem once the defect is fixed.

PeterC
Highlighted
Super Contributor
Posts: 273

Re: Newlines in CSV file see to fail with PROC IMPORT

I tried upon this contents (a french one saved in utf8) but showed here in windows
[PRE]
Nom,Adressé
"Bob","cité Bergère"
"Ann","impasse
Dôme"
[/PRE]
with this code under sas windows
[PRE]
options validvarname=any;
filename a "z:\utf8.csv" encoding=utf8;
PROC IMPORT OUT= WORK.PAPER3
DATAFILE=a
DBMS=CSV REPLACE;
GETNAMES=YES;
DATAROW=2;
RUN;
[/PRE]
the results are
[PRE]
Nom Adressé
Bob cité Bergère
Ann "impasse
Dôme"
[/PRE]

But as soon as i have a third variable the solution is going down!

I think the only sas solution is to
-read each line of the external file
-suppress the line break
and rewrite it

Andre
Contributor
Posts: 23

Sometimes Microsoft can be your friend

For once, don't blame Microsoft -- the CSV file may not have been created by a Microsoft product! ;-)

And, since Excel can read the CSV file correctly, it actually makes it easy to get it into SAS. Just open the CSV file in Excel and save it as an Excel file (XLS). Then run PROC IMPORT on the converted file.
Ask a Question
Discussion stats
  • 5 replies
  • 1362 views
  • 0 likes
  • 5 in conversation