Hello Experts,
I woud like to import the attached file, but I don't know how to choose the encoding option. When I try to convert this file to csv I have the same encoding problem. My code is :
filename mon_f temp;
data _null_;
infile "XXXXX\data_act.xlsx" dsd truncover lrecl=32767 encoding=wlatin1;
file mon_f lrecl=1000000;
input ligne :$char32767.;
if notdigit(scan(ligne,1,',')) and length(scan(ligne,1,','))>1 then
put '|' @;
else if length(scan(ligne,1,','))>1 then
do;
put;
end;
len=lengthn(ligne);
put ligne $varying32767. len @;
run;
proc import datafile=mon_f dbms=csv out=donnee replace;
getnames=no;
run;
Thank you very much !
Files with the extension xlsx are binary files, so I think it is difficult to process them with the file statement.
I tried the following code, and it worked somewhat.
I checked it with SAS9.4(Unicode support).
filename mon_f temp encoding='utf-8';
libname x xlsx "XXXXX\data_act.xlsx";
data _null_;
set x.TWEET_ACTIVITY_METRICS_AG2RLMP_;
file mon_f lrecl=1000000;
if notdigit(scan(Identifiant_du_Tweet__Permalien,1,',')) and length(scan(Identifiant_du_Tweet__Permalien,1,','))>1 then
put '|' @;
else if length(scan(Identifiant_du_Tweet__Permalien,1,','))>1 then
do;
put;
end;
len=lengthn(Identifiant_du_Tweet__Permalien);
put Identifiant_du_Tweet__Permalien $varying32767. len @;
run;
proc import datafile=mon_f dbms=csv out=donnee replace;
getnames=no;
run;
libname x;
If you can handle the newline in the red frame correctly, you may be able to import the file using proc import.
The SAS dataset display is garbled, but I think that's just a problem of not having a font in my SAS that can display emoji.
Files with the extension xlsx are binary files, so I think it is difficult to process them with the file statement.
I tried the following code, and it worked somewhat.
I checked it with SAS9.4(Unicode support).
filename mon_f temp encoding='utf-8';
libname x xlsx "XXXXX\data_act.xlsx";
data _null_;
set x.TWEET_ACTIVITY_METRICS_AG2RLMP_;
file mon_f lrecl=1000000;
if notdigit(scan(Identifiant_du_Tweet__Permalien,1,',')) and length(scan(Identifiant_du_Tweet__Permalien,1,','))>1 then
put '|' @;
else if length(scan(Identifiant_du_Tweet__Permalien,1,','))>1 then
do;
put;
end;
len=lengthn(Identifiant_du_Tweet__Permalien);
put Identifiant_du_Tweet__Permalien $varying32767. len @;
run;
proc import datafile=mon_f dbms=csv out=donnee replace;
getnames=no;
run;
libname x;
If you can handle the newline in the red frame correctly, you may be able to import the file using proc import.
The SAS dataset display is garbled, but I think that's just a problem of not having a font in my SAS that can display emoji.
If you get garbled characters, try using wlatin1.
filename mon_f temp encoding='wlatin1';
Please try this code.
I can get Emoji.
filename mon_f 'temp.dat' encoding='wlatin1';
libname x xlsx "C:\Users\xyzab\Downloads\data_act.xlsx";
data _null_;
set x.TWEET_ACTIVITY_METRICS_AG2RLMP_;
file mon_f lrecl=1000000;
if notdigit(scan(Identifiant_du_Tweet__Permalien,1,',')) and length(scan(Identifiant_du_Tweet__Permalien,1,','))>1 then
put '|' @;
else if length(scan(Identifiant_du_Tweet__Permalien,1,','))>1 then
do;
put;
end;
len=lengthn(Identifiant_du_Tweet__Permalien);
put Identifiant_du_Tweet__Permalien $varying32767. len @;
run;
filename mon_f;
filename mon_f 'temp.dat' encoding='utf-8' nobom;
proc import datafile=mon_f dbms=csv out=donnee replace;
getnames=no;
run;
filename mon_f;
libname x;
please try this code.
I think the Excel file was created on a PC with wlatin1 character encoding.
So, the encoding of the first filename needs to be wlatin1.
The encoding of the next filename needs to be utf-8 so that emoji can be imported into SAS.
You can use any file name you want, even temp.dat.
The following code will create the file in the work library.
filename mon_f "%sysfunc(pathname(work))\temp.dat" encoding='wlatin1';
libname x xlsx "XXXXX\data_act.xlsx";
data _null_;
set x.TWEET_ACTIVITY_METRICS_AG2RLMP_;
file mon_f lrecl=1000000;
if notdigit(scan(Identifiant_du_Tweet__Permalien,1,',')) and length(scan(Identifiant_du_Tweet__Permalien,1,','))>1 then
put '|' @;
else if length(scan(Identifiant_du_Tweet__Permalien,1,','))>1 then
do;
put;
end;
len=lengthn(Identifiant_du_Tweet__Permalien);
put Identifiant_du_Tweet__Permalien $varying32767. len @;
run;
filename mon_f;
filename mon_f "%sysfunc(pathname(work))\temp.dat" encoding='utf-8' nobom;
proc import datafile=mon_f dbms=csv out=donnee replace;
getnames=no;
run;
filename mon_f;
libname x;
Hello Kawakami,
Thank you very much ! Unfortunately, I have this error :
I can't get the expected result 😞
Maxim 3: Know Your Data.
Your variable is numeric, so you can't use a character format with it.
Please do ALWAYS post the COMPLETE log of your steps, and do ALWAYS COPY/PASTE the text into a window opened with the </> button. Do NOT post logs as pictures.
Are you running the SAS nls (National Language Support/Local language) version?
Can you run SAS by "SAS(Unicode support)" short cut?
The Excel file you are trying to work with contains unicode strings, so you should need SAS that can handle unicode, i.e. SAS unicode support.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.