I'm on SAS Enterprise Guide 7.11, SAS 9.4.
When trying to import a csv file, there's a line that disappears. It is the one mentioned in the error below.
(See block quote of this at the bottom of this question).
I've opened the file in Notepad++, but can't find anything wrong with the missing line.
Here's a screen shot placed at the missing line at column 283 (where SAS claims the error begins):
The code I'm using is
Data Want;
Length
'Nº de Operación de Venta C&C'n $ 29
"Estado Operación"n $ 46
"Código de Pedido"n 8
"Estado del Pedido"n $ 46
"Fecha de Pedido"n 8
"Hora de Pedido"n 8
"Referencia ECI"n $ 18
"Descripción Referencia ECI"n $ 40
"División"n $ 5
"Descripción División"n $ 20
"Subdivisión"n $ 7
"Descripción Subdivisión"n $ 20
"Método de Envío"n $ 16
"Fecha de Compromiso Original"n 8
"Fecha de Compromiso"n 8
"Importe Original"n $ 10
Importe $ 10
Unidades 8
"Unidades Pedidas"n 8;
Format
'Nº de Operación de Venta C&C'n $CHAR29.
"Estado Operación"n $CHAR46.
"Código de Pedido"n BEST13.
"Estado del Pedido"n $CHAR46.
"Fecha de Pedido"n Date9.
"Hora de Pedido"n TIME8.
"Referencia ECI"n $CHAR18.
"Descripción Referencia ECI"n $CHAR40.
"División"n $CHAR5.
"Descripción División"n $CHAR20.
"Subdivisión"n $CHAR7.
"Descripción Subdivisión"n $CHAR20.
"Método de Envío"n $CHAR16.
"Fecha de Compromiso Original"n DATE9.
"Fecha de Compromiso"n DATE9.
"Importe Original"n $CHAR10.
Importe $CHAR10.
Unidades Comma12.
"Unidades Pedidas"n Comma12.;
Informat
'Nº de Operación de Venta C&C'n $CHAR29.
"Estado Operación"n $CHAR46.
"Código de Pedido"n BEST13.
"Estado del Pedido"n $CHAR46.
"Fecha de Pedido"n Date9.
"Hora de Pedido"n TIME8.
"Referencia ECI"n $CHAR18.
"Descripción Referencia ECI"n $CHAR40.
"División"n $CHAR5.
"Descripción División"n $CHAR20.
"Subdivisión"n $CHAR7.
"Descripción Subdivisión"n $CHAR20.
"Método de Envío"n $CHAR16.
"Fecha de Compromiso Original"n DATE9.
"Fecha de Compromiso"n DATE9.
"Importe Original"n $CHAR10.
Importe $CHAR10.
Unidades Comma12.
"Unidades Pedidas"n Comma12.;
Infile "file_path_name"
Firstobs=2
LRECL=600
ENCODING="UTF-16"
TERMSTR=CRLF
DLM=","
MISSOVER
DSD;
Input
'Nº de Operación de Venta C&C'n : $CHAR29.
"Estado Operación"n : $CHAR46.
"Código de Pedido"n : BEST13.
"Estado del Pedido"n : $CHAR46.
"Fecha de Pedido"n : MMDDYY10.
"Hora de Pedido"n : TIME5.
"Referencia ECI"n : $CHAR18.
"Descripción Referencia ECI"n : $CHAR40.
"División"n : $CHAR5.
"Descripción División"n : $CHAR20.
"Subdivisión"n : $CHAR7.
"Descripción Subdivisión"n : $CHAR20.
"Método de Envío"n : $CHAR16.
"Fecha de Compromiso Original"n : YYMMDD10.
"Fecha de Compromiso"n : MMDDYY10.
"Importe Original"n : $CHAR10.
Importe : $CHAR10.
Unidades : Comma12.
"Unidades Pedidas"n : Comma12.;
Run;
NOTE: Invalid data for 'Unidades Pedidas'n in line 10524 283-318. REGLA: ----+----1----+----2----+----3----+----4----+----5----+----6----+----7----+----8----+----9----+----0 10524 "00401430761020520170823225656","Cancelled","1723549001754","Cancelled","8/23/2017","22:57 ","001002 101 535973537 ","SET PULSERAS 4 ","00102","COMPLEMENTOS ","0010258","B 201 ISUTERIA ","HomeDelivery","2017-09-07","9/7/2017","4.45 €","0.00 €","1","0".."004014307610 ZONE 4555454422222222222222466646667677222333323323322232323333222323328222323328222322232002333333333333 NUMR 93545291000000000002C28FD545C965292C22017D09D072C29F7F20172C24E45002C20E00002C212C202DA2004014307610 301 20520170823225656","Cancelled","1723549001754","Cancelled","8/23/2017","22:57 ","001016687233862042" 401 ,"CMST E/RE N/C COMB. LUREX ","00109","JOVEN ELLA ","0010919","MARCAS PRO.JOV 501 EN E ","HomeDelivery","2017-09-07","9/7/2017","2.99 €","0.00 €","1","0" 572 'Nº de Operación de Venta C&C'n=00401430761020520170823225656 'Estado Operación'n=Cancelled 'Código de Pedido'n=1723549001754 'Estado del Pedido'n=Cancelled 'Fecha de Pedido'n=23AUG2017 'Hora de Pedido'n=22:57:00 'Referencia ECI'n=001002535973537 'Descripción Referencia ECI'n=SET PULSERAS 4 'División'n=00102 'Descripción División'n=COMPLEMENTOS 'Subdivisión'n=0010258 'Descripción Subdivisión'n=BISUTERIA 'Método de Envío'n=HomeDelivery 'Fecha de Compromiso Original'n=07SEP2017 'Fecha de Compromiso'n=07SEP2017 'Importe Original'n=4.45 € Importe=0.00 € Unidades=1 'Unidades Pedidas'n=. _ERROR_=1 _N_=10523
PLEASE!!!!
post logs in the {i} subwindow, so horizontal spacing is preserved!
The main posting window mangles such content into uselessness by removing blanks etc.
If you are using PROC IMPORT , try option GUESSINGROW=max .
And post the code used to attempt that read.
The error you are getting is usually related to the informat associated with a variable. HOW that informat is specified can cause issues.
Also your picture shows a small vertical purple line. Did you add that? If not, it is likely whatever that represents is the issue.
I added the code. The vertical purple line isn't a character, it's the spot where I placed the cursor.
Thanks!
Looking at the line that it is failing on it is show way more that 19 items on one line.
It is also show end of line mark ('0D0A'x) in the middle of line.
Are you sure that data file is using UTF-16 encoding? If so it should NOT have '0D0A'x as that is an invalid 16 bit code.
At the circled point in the attached, instead of the "0" being followed by a comma, it is being followed by the hex characters "2080". Because of the lack of a comma, SAS is interpreting everything to the right as part of the field.
You need to get those two hex characters out of there.
Tom
How many lines does notepad+ see in the file?
Now count how many lines SAS sees if you don't add anything to the INFILE statement.
data _null_;
infile "file_path_name";
input;
run;
If they are the same then leave most of those other things off the INFILE statement. If you are running Unix and the carriage returns mess up the last field there are other ways to deal with it.
data test;
infile "file_path_name" dsd truncover firstobs=2;
length var1-var19 $50 ;
input var1-var19 ;
run;
If you want to see what is in file read it as binary and look.
data _null_;
infile "file_path_name" recfm=f lrecl=100 obs=20 encoding='any';
input;
list;
run;
I'm not sure, but I don't think so, I checked the "extra" lines specifically and they all end with LF and continue in the next line. I also checked and whenever it ends with LF, the data for that line isn't over yet, it carries onto to the next line and when it reaches the end of the data, I find a CRLF.
Everything seems to be working fine with these lines. What I'm asking about, as far as I can tell, is unrelated to this since every line the vicinity of the problematic line actually ends with CRLF.
Something else is happening. If your file is really in UTF-16 then perhaps it is somehow corrupted and it is confusing SAS?
If you look at the top of the file does it look like a UTF-16 file?
For example here is a simple program to create two lines each into a "normal" and a UTF-16 file.
filename temp temp ;
filename utf16 temp ;
data _null_;
set sashelp.class (obs=2) ;
file temp dsd termstr=crlf ;
put (_all_) (+0);
file utf16 dsd termstr=crlf encoding='utf-16' ;
put (_all_) (+0);
run;
If I then look at the hex codes in those files you can see that UTF-16 files take two bytes per character.
227 data _null_; 228 infile temp recfm=f lrecl=100 encoding='any'; 229 input; 230 list; 231 run; NOTE: The infile TEMP is: File Size (bytes)=42 RULE: ----+----1----+----2----+----3----+----4----+----5----+----6----+----7----+----8----+----9 1 CHAR Alfred,M,14,69,112.5..Alice,F,13,56.5,84.. 42 ZONE 466766242332332333230046666242332332323300 NUMR 1C6254CDC14C69C112E5DA1C935C6C13C56E5C84DA NOTE: 1 record was read from the infile TEMP. NOTE: DATA statement used (Total process time): real time 0.17 seconds cpu time 0.00 seconds 232 233 data _null_; 234 infile utf16 recfm=f lrecl=100 encoding='any'; 235 input; 236 list; 237 run; NOTE: The infile UTF16 is: File Size (bytes)=86 RULE: ----+----1----+----2----+----3----+----4----+----5----+----6----+----7----+----8----+----9 1 CHAR ÿþA.l.f.r.e.d.,.M.,.1.4.,.6.9.,.1.1.2...5.....A.l.i.c.e.,.F.,.1.3.,.5.6...5.,.8.4..... 86 ZONE FF406060706060204020303020303020303030203000004060606060204020303020303020302030300000 NUMR FE10C060205040C0D0C01040C06090C0101020E050D0A010C0903050C060C01030C05060E050C08040D0A0 NOTE: 1 record was read from the infile UTF16. NOTE: DATA statement used (Total process time): real time 0.39 seconds cpu time 0.01 seconds
Perhaps something has edited the file that did not understand it was UTF-16 and has inserted invalid 2-byte sequence that is confusing SAS?
No idea. I'm way out of my element here, trying to convey as much useful information as I can.
When I import the file manually, SAS is able to do it correctly, but it seems to preprocess the file first as the file in the infile statement is a .txt and it uses the enconding WLATIN1. If I try to import using this encoding I get an error and it says that 'A byte-order mark indicates that the data is encoded in "utf-16le"'.
How would I go about finding that 2-byte character that might be in the file? I can't find anything out of the ordinary when I open the file in Notepad++, nor on Notepad for that matter.
Plus, when I import the file in pandas (with the encoding UTF-16), it imports it fine as well.
I'm at a loss.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
What’s the difference between SAS Enterprise Guide and SAS Studio? How are they similar? Just ask SAS’ Danny Modlin.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.