Hello,
I need to import into SAS a CSV file which contains SMS messages. The SMS message can include emojis. Whenever I use either PROC IMPORT or an INFILE statement to read in an emoji, the program fails with the message:
ERROR: Invalid string.
FATAL: Unrecoverable I/O error detected in the execution of the DATA step program. Aborted during the EXECUTION phase.
The rest of the file is not imported into SAS.
An example of a CSV file with an emoji is: (taken from the attached file)
"Delivery Status","Question Label","Question DateTime","Answer Label","Answer DateTime","Answer Text"
"Delivered","3","02/07/2019 08:22:39","Invalid Response","04/07/2019 10:31:43","😊"
I would like to either get SAS to ignore the emoji and carry on processing the CSV file, or better still, process the emoji and give a text version of this. I have found examples where people have done this with emoticons, and so I wondered if anyone had updated this to handle emojis.
I am using EG 7.15 with SAS 9.4.
I hope that this is clear.
Thanks for any help
Andrew
Hi @AndrewJones ,
I ran this on your example:
filename f "C:\Users\&sysuserid.\Downloads\Sample.csv" encoding='utf-8';
data test;
  infile f firstobs=2 DSD;
  input (DeliveryStatus QuestionLabel QuestionDateTime AnswerLabel AnswerDateTime AnswerText) (:$20.);
run;
and it worked. Are you working in a UTF-8 session?
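For context: with ENCODING='utf-8' on the FILENAME statement, SAS transcodes the file from UTF-8 into the session encoding while reading. That is most likely why the same code runs in a UTF-8 session but stops with "Invalid string" in a WLATIN1 session, since 😊 (U+1F60A) has no WLATIN1 equivalent. A quick way to check which encoding the current session uses:

```sas
/* Write the session encoding to the log, e.g. SYSENCODING=wlatin1 or SYSENCODING=utf-8 */
%put &=sysencoding;

/* Show the ENCODING system option as set at SAS startup */
proc options option=encoding;
run;
```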
All the best
Bart
Hi Bart,
Thanks for your quick reply. I tried your code and I still got the error message. I am working in a WLATIN1 environment. SAS is installed on a server, so I don't want to change the default encoding, since that would affect all other users. Is there a way to handle this programmatically?
Thanks
Andrew
Hi @AndrewJones ,
Try the code below. It looks like you have a UTF-8 file with a byte order mark (BOM). I've tested it in my WLATIN1 SAS session.
Someone smarter than me ( @FreelanceReinh ) already answered this in the following thread:
%put &=sysencoding.;
filename f "C:\Users\&sysuserid.\Downloads\Sample.csv";
/* Step 1: dump the first 10 bytes in hex to check for a BOM (EFBBBF for UTF-8) */
data _null_;
  infile f recfm=N;
  input c $char10.;
  put c $hex20.;
  stop;
run;
/* Step 2: copy the file byte by byte to a temporary file, skipping the BOM */
filename T "%sysfunc(pathname(work))/temp" recfm=N;
data _null_;
  infile f recfm=N;
  file t recfm=N;
  if _n_=1 then input +3 @@; /* skip first three bytes (adapt if BOM is longer) */
  input c $char1.;
  put c $1.;
run;
/* Step 3: re-assign the fileref as a variable-length text file and read the CSV, skipping the header row */
filename T "%sysfunc(pathname(work))/temp" recfm=V;
data test;
  infile t firstobs=2 DSD ; /*dlm='0a0d'x;*/
  input (DeliveryStatus QuestionLabel QuestionDateTime AnswerLabel AnswerDateTime AnswerText) (:$20.);
run;
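Read this way, AnswerText holds the raw UTF-8 bytes of the emoji rather than a proper character. If those bytes should be dropped, or turned into a text form, something along the following lines might help. This is a minimal sketch, not tested against the original file. It assumes a single-byte session encoding such as WLATIN1, and the variable names AnswerAscii and AnswerEsc are made up for illustration. The exact escape text that KPROPDATA returns with the 'UESC' option has not been verified here, so check the log before relying on it.

```sas
/* Sketch only: assumes a single-byte session encoding such as WLATIN1 */
data demo;
  length AnswerText AnswerAscii $20 AnswerEsc $60;

  /* "OK " followed by the four UTF-8 bytes of U+1F60A, */
  /* which is how the byte-copy approach delivers the emoji */
  AnswerText = 'OK ' || 'F09F988A'x;

  /* Option 1: drop every byte above 7-bit ASCII.          */
  /* Note that this also removes accented letters, because */
  /* in UTF-8 they are encoded with bytes >= '80'x as well */
  AnswerAscii = compress(AnswerText, collate(128, 255));

  /* Option 2: treat the bytes as UTF-8 and ask KPROPDATA to  */
  /* replace characters the session encoding cannot represent */
  /* with a Unicode escape sequence                           */
  AnswerEsc = kpropdata(AnswerText, 'UESC', 'utf-8');

  put AnswerAscii= AnswerEsc=;
run;
```

The same two assignments could be applied to AnswerText in a DATA step that reads the test data set created above.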
All the best
Bart