BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
xinyao2019
Calcite | Level 5

Hi all,

 

i tried to read csv file to SAS.

codes are below:

proc import out=y.deathfull2019pre06242019csv
datafile= "Y:\Datasets\Restricted\Death\Preliminary\deathfull2019pre.csv"
dbms=csv replace;
guessingrows=MAX; 
getnames=yes;
run;

The error is :

NOTE: Invalid data for Age_Type in line 1387 28-47.
NOTE: Invalid data for Date_of_Birth___Month in line 1387 55-55.

.

.

.

NOTE: Invalid data for Age_Type in line 5027 28-47.
NOTE: Invalid data for Date_of_Birth___Month in line 5027 55-55.
WARNING: Limit set by ERRORS= option reached. Further errors of this type will not be
printed.

.

.

NOTE: 27840 records were read from the infile
'Y:\Datasets\Restricted\Death\Preliminary\deathfull2019pre.csv'.
The minimum record length was 61.
The maximum record length was 2643.
NOTE: The data set Y.DEATHFULL2019PRE06242019CSV has 27840 observations and 375 variables.
NOTE: DATA statement used (Total process time):
real time 4.34 seconds
cpu time 2.28 seconds


Errors detected in submitted DATA step. Examine log.
27840 rows created in Y.DEATHFULL2019PRE06242019CSV from
Y:\Datasets\Restricted\Death\Preliminary\deathfull2019pre.csv.

 

ERROR: Import unsuccessful. See SAS Log for details.
NOTE: The SAS System stopped processing this step because of errors.
NOTE: PROCEDURE IMPORT used (Total process time):
real time 2:08.43
cpu time 1:57.31

 

Any clue please

1 ACCEPTED SOLUTION

Accepted Solutions
Tom
Super User Tom
Super User

@Tom wrote:
Too bad. Looks like your lines do not end with CR+LF since it caused the file to have way too few lines.
Also it looks like there are SOME instances where CR+LF does occur since it didn't cause the file to have just one line.

Remove the embedded linefeeds in your original file so that it can be read as a CSV file by SAS.

See this discussion and solution. 

https://communities.sas.com/t5/SAS-Data-Management/Special-Carriages/m-p/214770/highlight/true#M5143

 

data _null_;
  infile 'sample_issues.csv' recfm=n ;
  file 'sample_issues_fixed.csv' recfm=n ;
  input x $char1. ;
  quotes+(x='"');
  if mod(quotes,2) and (x='0D'x or x='0A'x) then put '|';
  else put x $char1.;
run;

Now you can try using PROC IMPORT to read the new file.

Or write your own data step to read it.  For you TIME variable you could create your own informat or if you know that just the 99:99 values are the only invalid values you can use the ?? modifier in the INPUT statement to suppress the error messages they will call. SAS will set the values to missing.

View solution in original post

34 REPLIES 34
Reeza
Super User
for some reason PROC IMPORT isn't a good idea for your data. Write a data step instead. I'm surprised you got that error with the MAX setting though, can you post the full log.
xinyao2019
Calcite | Level 5

the log is too long exceed 200,000 characters.

i attached here.

thanks ! 

 

Reeza
Super User
Word will mess up the formatting and not everyone can download attachments. Paste the code into a code box.
xinyao2019
Calcite | Level 5

i am a beginner here. where is the code box? 

Tom
Super User Tom
Super User

@xinyao2019 wrote:

i am a beginner here. where is the code box? 


image.png

The first one looks like {i} and is the right one to use for logs.

The second one looks like the SAS run command icon.  It is good for SAS code as it will try to highlight the code using its best guess at what SAS syntax you have provided.

koyelghosh
Lapis Lazuli | Level 10

@Tom Thank you. I did not know this. Nice to know.

xinyao2019
Calcite | Level 5

i cannot paste the log into log box either: exceeding 200,000 characters .

 

 

koyelghosh
Lapis Lazuli | Level 10

Try to run the following and paste the output

 

DATA _NULL_;
	INFILE  "Y:\Datasets\Restricted\Death\Preliminary\deathfull2019pre.csv" DELIMITER=",";
	IF _N_ IN (1386,1387,1388) THEN PUTLOG _ALL_;
RUN;
Tom
Super User Tom
Super User

@koyelghosh wrote:

Try to run the following and paste the output

 

DATA _NULL_;
	INFILE  "Y:\Datasets\Restricted\Death\Preliminary\deathfull2019pre.csv" DELIMITER=",";
	IF _N_ IN (1386,1387,1388) THEN PUTLOG _ALL_;
RUN;

Simpler version:

DATA _NULL_;
  INFILE  "Y:\Datasets\Restricted\Death\Preliminary\deathfull2019pre.csv" 
    firstobs=1386 obs=1388
  ;
  input;
  list;
RUN;
koyelghosh
Lapis Lazuli | Level 10

@Tom Certainly. You don't need to read 1385 lines and waste time and memory. Your comment was elegant.

xinyao2019
Calcite | Level 5

Thank you so much to remind me the contents in the log. i did not realized it.

this is what i got. 

DATA _NULL_;
50716 INFILE "Y:\Datasets\Restricted\Death\Preliminary\deathfull2019pre.csv"
50717 firstobs=1386 obs=1388
50718 ;
50719 input;
50720 list;
50721 RUN;

NOTE: The infile "Y:\Datasets\Restricted\Death\Preliminary\deathfull2019pre.csv" is:
Filename=Y:\Datasets\Restricted\Death\Preliminary\deathfull2019pre.csv,
RECFM=V,LRECL=32767,File Size (bytes)=44058484,
Last Modified=24Jun2019:08:31:06,
Create Time=07Jan2019:09:11:01

RULE: ----+----1----+----2----+----3----+----4----+----5----+----6----+----7----+----8----
1386 2019001462,aaaaaaaa,bbbbbb,sssssss, ,F,1,00,00,01/01/1900,01,01,1900,555-55-5555,01/
00 01/2019,50,20,01/01/2019,1,1,2019,A,10:30,A,aaaaaaa pppppppppp,99999, ,, ,99,XX,PHIL
169 iiiiiii,RP,40, ,0,2915 S ffffffffff ST ,sssssss,63e3,Kkkk,1701,17,33,717,WASHINGTON,
253 48,UNITED STATES,99999,0,810,DECEDENT'S HOME,N,M,0,FLORENDO,Y,INOVEJAS, ,6,9,0,0,SUP
337 PLY ROOM,0,IIIIIII HOSPITAL,PPPPPPPP, ,SSSSSSS, ,TTTTTTT, ,CCCC,FFFFFFFF,Y,NNNNNNNN,
421 ,"2222 S FFFFFFFFFF ST, EEEEEEE, WA, 99999",HHHHHHH,N,N,N,N,N,Y,N,N,N,N,N,N,N,N,N,
505 , , ,, ,, , ,420, , ,,,,,, , ,,,,, ,,6,13,7,Y,N,N,N,N, ,100,,2915 S FFFFFFFFFF ST ,S
589 SSSSSS,63e3,Y,KKKK,1701,17,33,717,WASHINGTON,WA,48,UNITED STATES,US,99999,,,,,,,,,,0
673 ,0,0,0, ,,Y,50,KKKKKKKKK,KKKKKK,KKKKKKKKK KKKKKK,1111,EEEEEEEEE-WASHELLI NNNNNNN HO
757 AA,AAAAAAA,WASHINGTON,99999,E,AAAAAAAAAA,01/01/1900,1,01,2019,9999,AAAAAAAAA AAAAAAA
841 L AAAA,AAAAAAA,WASHINGTON,UNITED STATES,AAAAAAA,AAAAAAA,"AAAAAAA A. AAAAAAA, JR, MD"
925 ,6044 AAA JR AAA AAA. 111 ,AAAAAAA,WA,99999,MD11111111,1,01/10/1900, , ,9, ,N,1,2,3,
1009 4,,,,,,,,,,,,,,,,,1,1,1,1,,,,,,,,,,,,,,,,,A419,I48,I499,J969, , , , , , , , , , , ,,
1093 ,,,, , , , , , , , , , , , , , ,,,,,,,I48, ,2019,00400001,I48,A419,J969, , , , , , ,
1177 , , , , , ,,,,,,I48,AAAAAA AAAAAA,AAAAAA AAAAAAAAAAAAA,AAAAAAA ARRHYTHMIA,RESPIRATOR
1261 Y FAILURE,SEVERE AAAAAA,AAAAAA FIBRILLATION,AAAAAAA ARRHYTHMIA,RESPIRATORY AAAAAAA,
1345 , ,1 HR,1 HR,1 HR,1 HR,1 HR,1 HR,1 HR,1 HR,N,N,N,N,N,X,8,0,U,0,0,N,N, ,,,, ,99:99, ,
1429 M, , , , ,0,0, ,, ,, ,U, , , , ,AAAAA AAAAA, ,01/14/1900, ,01/14/1900,0,298366,"QR 0
1513 2/20/2019 1521
1387 DA19-00512",,,1,20190304,1,2019-01-16T02:06:20Z,0,1,N,N,N,N,Y 61
1388 2019001463,AAAA,AAAAAAAA,AAAA, ,F,1,90,90,08/12/1928,8,12,1928,269-24-9932,01/14/201
85 9,239,20,01/13/2019,1,01,1900,A,04:14,AAAAAAA,99999, ,,AAAA,36,AA,UNITED STATES,US
169 ,40, ,0, ,AAAAAAA,67e3,AAAAAAA,3201,32,63,732,WASHINGTON,00,UNITED STATES,00000,4,16
253 2,INPATIENT,N,W,0, , , , ,3,9,0,0,AAAAAAAAA,0,AAA AAAA,AAAAAAAA,G,AAAAAA, ,AAAAAAA,
337 ,AAAAAAAAA,AAAA, ,AAAAAAA, ,"20709 S. AAAAA RD. AAAAA, WA 00000",DAUGHTER,Y,N,N,N,
421 N,N,N,N,N,N,N,N,N,N,N, , , ,, ,, , ,100, , ,,,,,, , ,,,,, ,,1,10,1,Y,N,N,N,N, ,100,,
505 00000 S. AAAAA RD. ,AAAAAA,00000,N,AAAAAAA,3204,32,63,732,WASHINGTON,WA,48,UNITED ST
589 AAAA,US,00000,,,,,,,,,,0,0,0,0, ,,M,9,AAAAAAA,AAAAAA,AAAAAAA R AAAAAA,000 ,AAAAAA FU
673 AAAAA AAAAAA,AAAAAA,WASHINGTON,00000,R,AAAAAAA AAAAAAASTATE,01/22/2019,1,22,2019,9999,
757 AAAAAAA AAAAAAAA AAAA,AAAAAAA,AAAAAAAAAA,UNITED STATES,AAAAAA,AAAAAA,"AAAAAA L. AAAA
841 LY, DO",000 A. 8AA AAA AAAA: 9A AAAAAAAAAAAA ,AAAAAAA,WA,00000,AA00000000,3,01/14/2
925 019, , ,9, ,N,1, ,,,,,,,,,,,,,,,,,,,1,,,,,,,,,,,,,,,,,,,,I64, , , , , , , , , , , ,
1009 , , ,,,,,, , , , , , , , , , , , , , ,,,,,,,I64, ,0000,00400001,I64, , , , , , , , ,
1093 , , , , , ,,,,,,I64,AAAAA AAAAAA AAAAAAAA AAAAAA AAAAAA, , , ,AAAAA AAAAAA AAAAAAAA
1177 AAAAAA AAAAAA, , , , , ,AAAA, , , ,AAAA, , , , , , , ,N,X,8,0,U,0,,N, , ,,,, ,99:99
1261 , , , , , , ,0,0, ,, ,, ,U, , , , ,AAAAAAA AAAAAA, ,01/00/0000,0100,01/00/0000,0,000
1345 704, ,,,1,20190131,1,2019-01-16T02:06:20Z,1,1,N,N,N,N,N 0000
NOTE: 3 records were read from the infile
"Y:\Datasets\Restricted\Death\Preliminary\deathfull2019pre.csv".
The minimum record length was 61.
The maximum record length was 1521.
NOTE: DATA statement used (Total process time):
real time 0.20 seconds
cpu time 0.03 seconds

 

Reeza
Super User
We don't need the full log, just the first 10 error or so. BUT - look at the log first because it can contain your data so if your data is confidential you'll have issues.
ballardw
Super User

The invalid data notes almost certainly mean that SAS has decided that the variables were intended to be one form of data, possibly  SAS dates and applying a SAS informat or numeric to read the values as such. But some of your data is not actually matching that expected format. You can open the CSV file in a text editor and go to line 1387 and look at the text values. The error is

 

I suspect you will find something like NA or NULL where a number is expected. If the value is actually treated as a date by SAS then it may mean an incompatible value such as a month out of range 1 to 12 or day of month out of 1 to 28,29,30 or 31 depending on month and year.

 

If the data step code was successfully created it may mean that you can address the issue by copying the code to the editor and modifying the data step such as providing a "better" informat or adding code to address unexpected values.

 

Or fix the data issues in the CSV which is usually way more difficult.

 

It might help to post the generated data step code and all the messages from log. Copy from the log and paste into a code box opened with the forum's {I} or "running man" icons. I am afraid that you may have removed some of the critical diagnostic information.

xinyao2019
Calcite | Level 5

i contacted the person who manages the data and was told he did not find anything that looks strange in the csv file and it can be read into R without any problem. 

Thanks 

 

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 34 replies
  • 10513 views
  • 8 likes
  • 5 in conversation