Hi,
I'm trying to read a large non-delimited text file into a dataset. I'm using PC SAS 9.4.
The file is the document.xml part of a docx file. It does not seem to be possible to read it into one variable and one row in a dataset, since it is too large (>300 KB). One approach I was considering is to pre-process the file by reading it character by character and adding a CR (carriage return, '0D'x) every time I see a '>', then write it out and re-read it using PROC IMPORT with the CR as the delimiter. (On Windows the usual line ending is CR+LF, '0D0A'x.)
Can this be done? If so, how?
BR
Jan
P.S. Note that reading the file with an XML libname is not useful here.
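Not sure if it will help, but here is how to do what you asked.
filename in 'document.xml';
filename out 'document_fixed.xml';
data _null_;
  /* recfm=n treats both files as byte streams with no record boundaries */
  infile in recfm=n;
  file out recfm=n;
  input char $char1. ;  /* read one byte */
  put char $char1. ;    /* copy it to the output unchanged */
  /* after each '>', write a bare carriage return as a record terminator */
  if char='>' then put '0D'x;
run;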
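I don't think that UTF-8 (or other multibyte character sets) would make any difference. In UTF-8, at least, '>' is always the single byte '3E'x, and that byte never occurs inside a multibyte sequence.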
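The second half of the question, re-reading the fixed file, could be sketched as follows. This uses a DATA step rather than the PROC IMPORT mentioned in the question, and it assumes the file was written with a bare CR ('0D'x) after each '>' as in the step above. The fileref `fixed`, the dataset name `tags` and the variable name `line` are made up for illustration.

```sas
filename fixed 'document_fixed.xml';

data tags;
  /* termstr=cr: each record ends with a bare carriage return ('0D'x) */
  /* truncover: short records are read as-is instead of flowing onto the next */
  infile fixed termstr=cr lrecl=32767 truncover;
  length line $32767;           /* 32767 is the maximum character length */
  input line $char32767.;       /* keep leading blanks */
  if not missing(line);         /* drop empty records */
run;
```

Any single piece of text between two '>' characters that is longer than 32,767 bytes would be truncated, so this is only a starting point. If the pre-processing step writes '0D0A'x instead of '0D'x, the default record terminator on Windows should work and TERMSTR= can be dropped.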
Sorry, I was too fast. I of course meant to write Tom.
Thanks Tom!!!
😉
BR
J
I wonder if it would be faster to read the file as if it were fixed length and apply the TRANSLATE function to _INFILE_.
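A rough, untested sketch of that idea is below. The block size of 1000 bytes is an arbitrary assumption, and the filerefs are the same as in the character-by-character version. Note that TRANSLATE substitutes one character for another, so each '>' is replaced by the carriage return rather than followed by it.

```sas
filename in 'document.xml';
filename out 'document_fixed.xml';

data _null_;
  /* recfm=f with lrecl=1000 reads the stream in 1000-byte blocks */
  /* (the block size is an arbitrary choice for this sketch)      */
  infile in recfm=f lrecl=1000;
  file out recfm=n;
  input;                                        /* load one block into _infile_ */
  /* TRANSLATE(source, to, from): every '>' becomes a carriage return */
  _infile_ = translate(_infile_, '0D'x, '>');
  put _infile_;                                 /* write the modified block */
run;
```

Because the closing '>' is lost from every record, it would have to be appended again after re-reading if the tags need to stay intact. How the final, shorter block is handled is worth checking against the original file size before relying on this.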
Hi Reeza,
that certainly did the trick. Thanks a lot!
Yes. I was considering Python as an option but prefer to keep it all in SAS.
BR
Jan