Hello.
I'm having an issue with importing a CSV file (I believe queried and given to me from a big database) into SAS when every string is in quotes as listed below
( "File","O,H,H","Yes")
However, if I open the CSV in excel and save over it, it results in the follow:
(File,"O,H,H",Yes)
and this imports just fine.
Is there a way to import without having excel involved?
Code:
Data test;
infile 'C:\Users\Daniel\Documents\Line.csv'
TERMSTR=CRLF
firstobs=4
dlm=','
dsd
missover
LRECL=6100
;
input
Case: $8.
Narrative: $4030.
Indication: $575.
Country: $4.
Report_Source: $12.
Image: $31.
Attach: $15.
;
run;
what is the issue in when importing directly from csv?
A dataset with 20,000 observation is only 19 observations long and fields with incorrect information in them.
My first thought is a different termstr value that is getting replaced when opened/saved in Excel.
Is there any other value other than: CRLF, CR, and LF?
Tried all three and none worked. Perhaps the CSV file given to me is in a format SAS cannot read?
It's certainly possible to construct a file that uses something else. At times, I've even had to open the file in a byte editor to look at what's actually there.
I see.
The only question I have left is, which is really just asking the question again:
Can SAS import a CSV where all the strings are in quotation marks.
ex.
"103943","O,H,J","File","Patient blah, blah, blah"
That should not be an issue. You can prove that by creating a sample file and trying it.
Sorry, remove the dlm=',', the dad option should cover that.
Also, you usually need truncover instead of missover.
Thank you for your input.
I tried all these options with no success.
I'm pretty convinced it's the file itself. Perhaps the CSV file is coded differently?
I tried creating a sample csv, and it works fine. However, this file doesn't seem to import.
A dataset with 20,000 observation is only 19 observations long
This may mean that the data needs to have a fixed record length specified when reading.
Can you post a "small" hexdump of the first few records from the file? Mark Terjeson posted the following macro on SAS-L some years back which makes it quite easy to see what a file "really" looks like. You may have to include a max parameter when calling the macro if your file's first 2 or 3 records (after the initial 4 records that your skipping) don't appear:
%macro hexdump(pathfile=,max=20000);
data _null_;
length chars $16;
retain address 0 chars '................';
infile "&pathfile" lrecl=1 recfm=f end=done;
counter+1;
if counter le &max. then do;
input c $char1.;
if mod(address,16) eq 0 then do;
if _N_ eq 1 then put address hex8. 'h: ' @;
else put ' ; ' chars / address hex8. 'h: ' @;
chars = '................';
end;
put c hex2. ' ' @;
if rank(c) ge 32 then substr(chars,mod(address,16)+1,1) = c;
address + 1;
end;
if done or counter gt &max. then do;
do i =mod(address,16)+1 to 16;
put ' ' @;
end;
put ' ; ' chars / ;
end;
run;
%mend;
%hexdump(pathfile=C:\Users\Daniel\Documents\Line.csv)
Not sure if I did this correctly, but here it is:
00000000h: 0A 22 54 6F 74 61 6C 20 4E 75 6D 62 65 72 20 6F ;
00000010h: 66 20 43 61 73 65 73 3A 22 2C 2C 22 32 32 2C 38 ;
00000020h: 39 31 22 0A 0A 22 43 61 73 65 20 23 22 2C 22 56 ;
00000030h: 72 73 6E 22 2C 22 46 44 41 20 49 6E 69 74 69 61 ;
00000040h: 6C 20 52 65 63 64 20 44 61 74 65 22 2C 22 46 44 ;
00000050h: 41 20 52 65 63 64 20 44 61 74 65 22 2C 22 41 6C ;
00000060h: 6C 20 53 75 73 70 65 63 74 73 22 2C 22 50 72 69 ;
00000070h: 6D 61 72 79 20 53 75 73 70 65 63 74 20 28 50 53 ;
00000080h: 29 22 2C 22 50 53 20 44 6F 73 65 2D 52 6F 75 74 ;
00000090h: 65 2D 46 72 65 71 75 65 6E 63 79 22 2C 22 4F 75 ;
You did it correctly, but didn't output enough characters for one to really know what the file structure might actually look like.
My first guess would be to try:
Data test;
infile 'C:\Users\Daniel\Documents\Line.csv'
firstobs=4
dlm='2C0A'x
dsd
missover
LRECL=6100
;
input
Case: $8.
Narrative: $4030.
Indication: $575.
Country: $4.
Report_Source: $12.
Image: $31.
Attach: $15.
;
run;
Don't miss out on SAS Innovate - Register now for the FREE Livestream!
Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.