Hi all,
i tried to read csv file to SAS.
codes are below:
proc import out=y.deathfull2019pre06242019csv
datafile= "Y:\Datasets\Restricted\Death\Preliminary\deathfull2019pre.csv"
dbms=csv replace;
guessingrows=MAX;
getnames=yes;
run;
The error is :
NOTE: Invalid data for Age_Type in line 1387 28-47.
NOTE: Invalid data for Date_of_Birth___Month in line 1387 55-55.
.
.
.
NOTE: Invalid data for Age_Type in line 5027 28-47.
NOTE: Invalid data for Date_of_Birth___Month in line 5027 55-55.
WARNING: Limit set by ERRORS= option reached. Further errors of this type will not be
printed.
.
.
NOTE: 27840 records were read from the infile
'Y:\Datasets\Restricted\Death\Preliminary\deathfull2019pre.csv'.
The minimum record length was 61.
The maximum record length was 2643.
NOTE: The data set Y.DEATHFULL2019PRE06242019CSV has 27840 observations and 375 variables.
NOTE: DATA statement used (Total process time):
real time 4.34 seconds
cpu time 2.28 seconds
Errors detected in submitted DATA step. Examine log.
27840 rows created in Y.DEATHFULL2019PRE06242019CSV from
Y:\Datasets\Restricted\Death\Preliminary\deathfull2019pre.csv.
ERROR: Import unsuccessful. See SAS Log for details.
NOTE: The SAS System stopped processing this step because of errors.
NOTE: PROCEDURE IMPORT used (Total process time):
real time 2:08.43
cpu time 1:57.31
Any clue please
@Tom wrote:
Too bad. Looks like your lines do not end with CR+LF since it caused the file to have way too few lines.
Also it looks like there are SOME instances where CR+LF does occur since it didn't cause the file to have just one line.
Remove the embedded linefeeds in your original file so that it can be read as a CSV file by SAS.
See this discussion and solution.
https://communities.sas.com/t5/SAS-Data-Management/Special-Carriages/m-p/214770/highlight/true#M5143
data _null_;
infile 'sample_issues.csv' recfm=n ;
file 'sample_issues_fixed.csv' recfm=n ;
input x $char1. ;
quotes+(x='"');
if mod(quotes,2) and (x='0D'x or x='0A'x) then put '|';
else put x $char1.;
run;
Now you can try using PROC IMPORT to read the new file.
Or write your own data step to read it. For you TIME variable you could create your own informat or if you know that just the 99:99 values are the only invalid values you can use the ?? modifier in the INPUT statement to suppress the error messages they will call. SAS will set the values to missing.
the log is too long exceed 200,000 characters.
i attached here.
thanks !
i am a beginner here. where is the code box?
@xinyao2019 wrote:
i am a beginner here. where is the code box?
The first one looks like {i} and is the right one to use for logs.
The second one looks like the SAS run command icon. It is good for SAS code as it will try to highlight the code using its best guess at what SAS syntax you have provided.
@Tom Thank you. I did not know this. Nice to know.
i cannot paste the log into log box either: exceeding 200,000 characters .
Try to run the following and paste the output
DATA _NULL_;
INFILE "Y:\Datasets\Restricted\Death\Preliminary\deathfull2019pre.csv" DELIMITER=",";
IF _N_ IN (1386,1387,1388) THEN PUTLOG _ALL_;
RUN;
@koyelghosh wrote:
Try to run the following and paste the output
DATA _NULL_; INFILE "Y:\Datasets\Restricted\Death\Preliminary\deathfull2019pre.csv" DELIMITER=","; IF _N_ IN (1386,1387,1388) THEN PUTLOG _ALL_; RUN;
Simpler version:
DATA _NULL_;
INFILE "Y:\Datasets\Restricted\Death\Preliminary\deathfull2019pre.csv"
firstobs=1386 obs=1388
;
input;
list;
RUN;
@Tom Certainly. You don't need to read 1385 lines and waste time and memory. Your comment was elegant.
Thank you so much to remind me the contents in the log. i did not realized it.
this is what i got.
DATA _NULL_;
50716 INFILE "Y:\Datasets\Restricted\Death\Preliminary\deathfull2019pre.csv"
50717 firstobs=1386 obs=1388
50718 ;
50719 input;
50720 list;
50721 RUN;
NOTE: The infile "Y:\Datasets\Restricted\Death\Preliminary\deathfull2019pre.csv" is:
Filename=Y:\Datasets\Restricted\Death\Preliminary\deathfull2019pre.csv,
RECFM=V,LRECL=32767,File Size (bytes)=44058484,
Last Modified=24Jun2019:08:31:06,
Create Time=07Jan2019:09:11:01
RULE: ----+----1----+----2----+----3----+----4----+----5----+----6----+----7----+----8----
1386 2019001462,aaaaaaaa,bbbbbb,sssssss, ,F,1,00,00,01/01/1900,01,01,1900,555-55-5555,01/
00 01/2019,50,20,01/01/2019,1,1,2019,A,10:30,A,aaaaaaa pppppppppp,99999, ,, ,99,XX,PHIL
169 iiiiiii,RP,40, ,0,2915 S ffffffffff ST ,sssssss,63e3,Kkkk,1701,17,33,717,WASHINGTON,
253 48,UNITED STATES,99999,0,810,DECEDENT'S HOME,N,M,0,FLORENDO,Y,INOVEJAS, ,6,9,0,0,SUP
337 PLY ROOM,0,IIIIIII HOSPITAL,PPPPPPPP, ,SSSSSSS, ,TTTTTTT, ,CCCC,FFFFFFFF,Y,NNNNNNNN,
421 ,"2222 S FFFFFFFFFF ST, EEEEEEE, WA, 99999",HHHHHHH,N,N,N,N,N,Y,N,N,N,N,N,N,N,N,N,
505 , , ,, ,, , ,420, , ,,,,,, , ,,,,, ,,6,13,7,Y,N,N,N,N, ,100,,2915 S FFFFFFFFFF ST ,S
589 SSSSSS,63e3,Y,KKKK,1701,17,33,717,WASHINGTON,WA,48,UNITED STATES,US,99999,,,,,,,,,,0
673 ,0,0,0, ,,Y,50,KKKKKKKKK,KKKKKK,KKKKKKKKK KKKKKK,1111,EEEEEEEEE-WASHELLI NNNNNNN HO
757 AA,AAAAAAA,WASHINGTON,99999,E,AAAAAAAAAA,01/01/1900,1,01,2019,9999,AAAAAAAAA AAAAAAA
841 L AAAA,AAAAAAA,WASHINGTON,UNITED STATES,AAAAAAA,AAAAAAA,"AAAAAAA A. AAAAAAA, JR, MD"
925 ,6044 AAA JR AAA AAA. 111 ,AAAAAAA,WA,99999,MD11111111,1,01/10/1900, , ,9, ,N,1,2,3,
1009 4,,,,,,,,,,,,,,,,,1,1,1,1,,,,,,,,,,,,,,,,,A419,I48,I499,J969, , , , , , , , , , , ,,
1093 ,,,, , , , , , , , , , , , , , ,,,,,,,I48, ,2019,00400001,I48,A419,J969, , , , , , ,
1177 , , , , , ,,,,,,I48,AAAAAA AAAAAA,AAAAAA AAAAAAAAAAAAA,AAAAAAA ARRHYTHMIA,RESPIRATOR
1261 Y FAILURE,SEVERE AAAAAA,AAAAAA FIBRILLATION,AAAAAAA ARRHYTHMIA,RESPIRATORY AAAAAAA,
1345 , ,1 HR,1 HR,1 HR,1 HR,1 HR,1 HR,1 HR,1 HR,N,N,N,N,N,X,8,0,U,0,0,N,N, ,,,, ,99:99, ,
1429 M, , , , ,0,0, ,, ,, ,U, , , , ,AAAAA AAAAA, ,01/14/1900, ,01/14/1900,0,298366,"QR 0
1513 2/20/2019 1521
1387 DA19-00512",,,1,20190304,1,2019-01-16T02:06:20Z,0,1,N,N,N,N,Y 61
1388 2019001463,AAAA,AAAAAAAA,AAAA, ,F,1,90,90,08/12/1928,8,12,1928,269-24-9932,01/14/201
85 9,239,20,01/13/2019,1,01,1900,A,04:14,AAAAAAA,99999, ,,AAAA,36,AA,UNITED STATES,US
169 ,40, ,0, ,AAAAAAA,67e3,AAAAAAA,3201,32,63,732,WASHINGTON,00,UNITED STATES,00000,4,16
253 2,INPATIENT,N,W,0, , , , ,3,9,0,0,AAAAAAAAA,0,AAA AAAA,AAAAAAAA,G,AAAAAA, ,AAAAAAA,
337 ,AAAAAAAAA,AAAA, ,AAAAAAA, ,"20709 S. AAAAA RD. AAAAA, WA 00000",DAUGHTER,Y,N,N,N,
421 N,N,N,N,N,N,N,N,N,N,N, , , ,, ,, , ,100, , ,,,,,, , ,,,,, ,,1,10,1,Y,N,N,N,N, ,100,,
505 00000 S. AAAAA RD. ,AAAAAA,00000,N,AAAAAAA,3204,32,63,732,WASHINGTON,WA,48,UNITED ST
589 AAAA,US,00000,,,,,,,,,,0,0,0,0, ,,M,9,AAAAAAA,AAAAAA,AAAAAAA R AAAAAA,000 ,AAAAAA FU
673 AAAAA AAAAAA,AAAAAA,WASHINGTON,00000,R,AAAAAAA AAAAAAASTATE,01/22/2019,1,22,2019,9999,
757 AAAAAAA AAAAAAAA AAAA,AAAAAAA,AAAAAAAAAA,UNITED STATES,AAAAAA,AAAAAA,"AAAAAA L. AAAA
841 LY, DO",000 A. 8AA AAA AAAA: 9A AAAAAAAAAAAA ,AAAAAAA,WA,00000,AA00000000,3,01/14/2
925 019, , ,9, ,N,1, ,,,,,,,,,,,,,,,,,,,1,,,,,,,,,,,,,,,,,,,,I64, , , , , , , , , , , ,
1009 , , ,,,,,, , , , , , , , , , , , , , ,,,,,,,I64, ,0000,00400001,I64, , , , , , , , ,
1093 , , , , , ,,,,,,I64,AAAAA AAAAAA AAAAAAAA AAAAAA AAAAAA, , , ,AAAAA AAAAAA AAAAAAAA
1177 AAAAAA AAAAAA, , , , , ,AAAA, , , ,AAAA, , , , , , , ,N,X,8,0,U,0,,N, , ,,,, ,99:99
1261 , , , , , , ,0,0, ,, ,, ,U, , , , ,AAAAAAA AAAAAA, ,01/00/0000,0100,01/00/0000,0,000
1345 704, ,,,1,20190131,1,2019-01-16T02:06:20Z,1,1,N,N,N,N,N 0000
NOTE: 3 records were read from the infile
"Y:\Datasets\Restricted\Death\Preliminary\deathfull2019pre.csv".
The minimum record length was 61.
The maximum record length was 1521.
NOTE: DATA statement used (Total process time):
real time 0.20 seconds
cpu time 0.03 seconds
The invalid data notes almost certainly mean that SAS has decided that the variables were intended to be one form of data, possibly SAS dates and applying a SAS informat or numeric to read the values as such. But some of your data is not actually matching that expected format. You can open the CSV file in a text editor and go to line 1387 and look at the text values. The error is
I suspect you will find something like NA or NULL where a number is expected. If the value is actually treated as a date by SAS then it may mean an incompatible value such as a month out of range 1 to 12 or day of month out of 1 to 28,29,30 or 31 depending on month and year.
If the data step code was successfully created it may mean that you can address the issue by copying the code to the editor and modifying the data step such as providing a "better" informat or adding code to address unexpected values.
Or fix the data issues in the CSV which is usually way more difficult.
It might help to post the generated data step code and all the messages from log. Copy from the log and paste into a code box opened with the forum's {I} or "running man" icons. I am afraid that you may have removed some of the critical diagnostic information.
i contacted the person who manages the data and was told he did not find anything that looks strange in the csv file and it can be read into R without any problem.
Thanks
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.