Hello.
I'm having an issue with importing a CSV file (I believe queried and given to me from a big database) into SAS when every string is in quotes as listed below
( "File","O,H,H","Yes")
However, if I open the CSV in excel and save over it, it results in the follow:
(File,"O,H,H",Yes)
and this imports just fine.
Is there a way to import without having excel involved?
Code:
Data test;
infile 'C:\Users\Daniel\Documents\Line.csv'
TERMSTR=CRLF
firstobs=4
dlm=','
dsd
missover
LRECL=6100
;
input
Case: $8.
Narrative: $4030.
Indication: $575.
Country: $4.
Report_Source: $12.
Image: $31.
Attach: $15.
;
run;
what is the issue in when importing directly from csv?
A dataset with 20,000 observation is only 19 observations long and fields with incorrect information in them.
My first thought is a different termstr value that is getting replaced when opened/saved in Excel.
Is there any other value other than: CRLF, CR, and LF?
Tried all three and none worked. Perhaps the CSV file given to me is in a format SAS cannot read?
It's certainly possible to construct a file that uses something else. At times, I've even had to open the file in a byte editor to look at what's actually there.
I see.
The only question I have left is, which is really just asking the question again:
Can SAS import a CSV where all the strings are in quotation marks.
ex.
"103943","O,H,J","File","Patient blah, blah, blah"
That should not be an issue. You can prove that by creating a sample file and trying it.
Sorry, remove the dlm=',', the dad option should cover that.
Also, you usually need truncover instead of missover.
Thank you for your input.
I tried all these options with no success.
I'm pretty convinced it's the file itself. Perhaps the CSV file is coded differently?
I tried creating a sample csv, and it works fine. However, this file doesn't seem to import.
A dataset with 20,000 observation is only 19 observations long
This may mean that the data needs to have a fixed record length specified when reading.
Can you post a "small" hexdump of the first few records from the file? Mark Terjeson posted the following macro on SAS-L some years back which makes it quite easy to see what a file "really" looks like. You may have to include a max parameter when calling the macro if your file's first 2 or 3 records (after the initial 4 records that your skipping) don't appear:
%macro hexdump(pathfile=,max=20000);
data _null_;
length chars $16;
retain address 0 chars '................';
infile "&pathfile" lrecl=1 recfm=f end=done;
counter+1;
if counter le &max. then do;
input c $char1.;
if mod(address,16) eq 0 then do;
if _N_ eq 1 then put address hex8. 'h: ' @;
else put ' ; ' chars / address hex8. 'h: ' @;
chars = '................';
end;
put c hex2. ' ' @;
if rank(c) ge 32 then substr(chars,mod(address,16)+1,1) = c;
address + 1;
end;
if done or counter gt &max. then do;
do i =mod(address,16)+1 to 16;
put ' ' @;
end;
put ' ; ' chars / ;
end;
run;
%mend;
%hexdump(pathfile=C:\Users\Daniel\Documents\Line.csv)
Not sure if I did this correctly, but here it is:
00000000h: 0A 22 54 6F 74 61 6C 20 4E 75 6D 62 65 72 20 6F ;
00000010h: 66 20 43 61 73 65 73 3A 22 2C 2C 22 32 32 2C 38 ;
00000020h: 39 31 22 0A 0A 22 43 61 73 65 20 23 22 2C 22 56 ;
00000030h: 72 73 6E 22 2C 22 46 44 41 20 49 6E 69 74 69 61 ;
00000040h: 6C 20 52 65 63 64 20 44 61 74 65 22 2C 22 46 44 ;
00000050h: 41 20 52 65 63 64 20 44 61 74 65 22 2C 22 41 6C ;
00000060h: 6C 20 53 75 73 70 65 63 74 73 22 2C 22 50 72 69 ;
00000070h: 6D 61 72 79 20 53 75 73 70 65 63 74 20 28 50 53 ;
00000080h: 29 22 2C 22 50 53 20 44 6F 73 65 2D 52 6F 75 74 ;
00000090h: 65 2D 46 72 65 71 75 65 6E 63 79 22 2C 22 4F 75 ;
You did it correctly, but didn't output enough characters for one to really know what the file structure might actually look like.
My first guess would be to try:
Data test;
infile 'C:\Users\Daniel\Documents\Line.csv'
firstobs=4
dlm='2C0A'x
dsd
missover
LRECL=6100
;
input
Case: $8.
Narrative: $4030.
Indication: $575.
Country: $4.
Report_Source: $12.
Image: $31.
Attach: $15.
;
run;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.