Hello,
I need to import into SAS a CSV file which contains SMS messages. The SMS message can include emojis. Whenever I use either PROC IMPORT or an INFILE statement to read in an emoji, the program fails with the message:
ERROR: Invalid string.
FATAL: Unrecoverable I/O error detected in the execution of the DATA step program. Aborted during the EXECUTION phase.
The rest of the file is not imported into SAS.
An example of a CSV file with an emoji is: (taken from the attached file)
"Delivery Status","Question Label","Question DateTime","Answer Label","Answer DateTime","Answer Text"
"Delivered","3","02/07/2019 08:22:39","Invalid Response","04/07/2019 10:31:43","😊"
I would like to either get SAS to ignore the emoji and carry on processing the CSV file, or better still, process the emoji and give a text version of this. I have found examples where people have done this with emoticons, and so I wondered if anyone had updated this to handle emojis.
I am using EG 7.15 with SAS 9.4.
I hope that this is clear.
Thanks for any help
Andrew
Accepted Solutions
Hi @AndrewJones ,
try the code below. It looks like you have a UTF file with a byte order mark (BOM). I've tested it in my WLATIN1 SAS session.
Someone smarter than me ( @FreelanceReinh ) already answered this in the following thread:
%put &=sysencoding.; /* show the session encoding, e.g. WLATIN1 */
filename f "C:\Users\&sysuserid.\Downloads\Sample.csv";
/* Step 1: dump the first 10 bytes in hex to check for a BOM (EFBBBF for UTF-8) */
data _null_;
  infile f recfm=N;
  input c $char10.;
  put c $hex20.;
  stop;
run;
/* Step 2: copy the file byte by byte to a temporary file, dropping the BOM */
filename T "%sysfunc(pathname(work))/temp" recfm=N;
data _null_;
  infile f recfm=N;
  file t recfm=N;
  if _n_=1 then input +3 @@; /* skip first three bytes (adapt if BOM is longer) */
  input c $char1.;
  put c $1.;
run;
/* Step 3: read the BOM-free copy as an ordinary CSV file */
filename T "%sysfunc(pathname(work))/temp" recfm=V;
data test;
  infile t firstobs=2 DSD ; /*dlm='0a0d'x;*/
  input (DeliveryStatus QuestionLabel QuestionDateTime AnswerLabel AnswerDateTime AnswerText) (: $ 20.);
run;
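If a text version of the emoji is wanted rather than the raw bytes, one possible follow-up step is to replace known UTF-8 byte sequences with labels. This is only a sketch and has not been run against the attached file. It assumes the data set test and the variable AnswerText from the step above, and that the session is WLATIN1, so the emoji should be stored as its four raw UTF-8 bytes. The names test_labeled and AnswerTextLabel and the label text are made up for illustration.

```sas
/* Sketch: map a known emoji byte sequence to a text label.        */
/* Assumes WORK.TEST from the step above, read in a WLATIN1        */
/* session, so the emoji is held as its raw UTF-8 bytes.           */
data test_labeled;               /* hypothetical output data set */
  set test;
  length AnswerTextLabel $ 60;   /* hypothetical variable, wide enough for the label */
  /* 'F09F988A'x is the UTF-8 encoding of U+1F60A */
  AnswerTextLabel = tranwrd(AnswerText, 'F09F988A'x, '[smiling face with smiling eyes]');
run;
```

Each further emoji would need its own TRANWRD call (or a lookup table), using the byte sequence seen in a $hex. dump of the variable.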
All the best
Bart
Hi @AndrewJones ,
I did this on your example:
filename f "C:\Users\&sysuserid.\Downloads\Sample.csv" encoding='utf-8';
data test;
infile f firstobs=2 DSD;
input (DeliveryStatus QuestionLabel QuestionDateTime AnswerLabel AnswerDateTime AnswerText) (: $ 20.);
run;
and it worked. Are you working in a UTF-8 session?
All the best
Bart
Hi Bart,
Thanks for your quick reply. I tried your code below and I still got the error message. I am working in a WLATIN1 environment. SAS is installed on a server, so I don't want to change the default encoding, since this would affect all other users. Is there a way to handle this programmatically?
Thanks
Andrew