Hi,
I used a FILE statement to write a CSV file, and the client says they see a BOM in front of the first variable. They also need the file in UTF-8. The first variable is a character variable containing a name. Below is the data step I used. Could someone please help me figure out how to remove the BOM from the output?
data _null_;
set xx;
file "mypath\myfile.csv" dlm=',' encoding='utf-8'
;
if _n_=1 then
put
list of variables...
;
run;
Thanks!
Try the BOMFILE/NOBOMFILE option.
Run this program with BOMFILE on and off and compare the first 4 bytes of the generated file. With the BOM, a UTF-8 file starts with the bytes EF BB BF.
options bomfile ;
filename xx temp;
data _null_;
file xx encoding=utf8;
put 'Hello.';
run;
data _null_;
infile xx recfm=n ;
input x $char4. ;
put x $hex8. ;
stop;
run;
Thanks, Tom. I tried your code and it worked. My problem is that I'm sending a file to a vendor, and they said there is a BOM in it. Using your code I was able to see the byte values (which I could not see by opening the CSV); however, how do I ensure the output file does not contain the BOM, so the vendor can load it correctly? Many thanks again!
So did you run the code with OPTIONS NOBOMFILE set?
Let me be a little more explicit.
First run this line so that SAS will stop writing the BOM to the beginning of files.
options NObomfile ;
Then produce the CSV file the same as you did before.
To test that it does NOT have the BOM, use this simple data step to look at the first 4 bytes. If they are the first 4 characters of the first field name in the CSV file, then there is no BOM.
data _null_;
infile 'name_of_csvfile' recfm=n ;
input x $char4. ;
put x $hex8. ;
stop;
run;
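Putting those steps together, a minimal sketch of the whole sequence might look like the following. The data set name xx and the path come from the original post; the variable names name and age and the header text are placeholders assumed for illustration, since the actual variable list was not shown.

```sas
/* Sketch only: name and age are placeholder variables, not from the original post. */
options nobomfile;   /* stop SAS from writing the UTF-8 BOM (EF BB BF) */

data _null_;
  set xx;
  file "mypath\myfile.csv" dlm=',' encoding='utf-8';
  if _n_ = 1 then put 'name,age';   /* header row, written once */
  put name age;                     /* one comma-delimited row per observation */
run;

/* Verify: read the first 3 bytes of the file and compare them to the BOM. */
data _null_;
  infile "mypath\myfile.csv" recfm=n;
  input x $char3.;
  if x = 'EFBBBF'x then put 'BOM present';
  else put 'No BOM; first bytes: ' x $hex6.;
  stop;
run;
```

The check reads only three bytes because the UTF-8 BOM is exactly three bytes long; anything else at the start of the file is data.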
Thanks again, Tom.
Below is what I ran and the output from the PUT statement. I sent the vendor the file I produced with 'options nobomfile' so they can verify. Will let you know how it goes.
Code:
options nobomfile ;
data _null_;
file "mypath\myfile.csv" encoding=utf8;
put 'Hello.';
run;
data _null_;
infile "mypath\myfile.csv" recfm=n ;
input x :$char8. ;
put x $hex8. ;
stop;
run;
Output:
48656C6C
Code:
options bomfile ;
data _null_;
file "mypath\myfile.csv" encoding=utf8;
put 'Hello.';
run;
data _null_;
infile "mypath\myfile.csv" recfm=n ;
input x :$char8. ;
put x $hex8. ;
stop;
run;
Output:
EFBBBF48
I sent the last message by accident before finishing it. That is what I ran. Are these the expected outputs? I expected to see 'Hello' with the NOBOMFILE option, but I don't. Is that correct? Thanks again, Tom!
The hex representation of the first four characters of 'Hello' is '48656C6C'x.
Is that not what you got?
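To see the correspondence for yourself, a small sketch like this prints the same text once as characters and once as hex. It assumes an ASCII-based session (for example Windows or Unix); on an EBCDIC system the hex values would differ.

```sas
/* Sketch: show that 48656C6C is simply 'Hell' written in hex. */
data _null_;
  x = 'Hell';
  put x= ;        /* the value as characters */
  put x $hex8.;   /* the same four bytes as hex: 48 65 6C 6C in ASCII */
run;
```

So a result of 48656C6C means the file starts directly with the text, with no BOM bytes in front of it.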
It is what I got. I sent a sample file to the vendor to verify. Will keep you posted on how it goes. Thanks again for your help, Tom!
Confirmed with the vendor. They don't see the BOM anymore. Many thanks, everyone. Issue resolved!
UTF-8 is becoming the standard, and most tools are following it.
The other part to solve, the encoding of the data itself, can get you into challenges that are more dramatic.
Check with the other party that you agree on the encoding of the data. ASCII is only 7-bit; the extended ASCII variants seen on Windows vary a lot.
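As a starting point for that check, a short sketch like this reports the encoding your own SAS session is running in, which you can then compare with what the receiving side expects. It only reports the session setting and does not inspect the data.

```sas
/* Sketch: report the current SAS session encoding in the log. */
proc options option=encoding;
run;

/* The same value as a single log line. */
%put Session encoding: %sysfunc(getoption(encoding));
```

If the session encoding is not UTF-8, SAS transcodes the text when it writes a file with encoding='utf-8', so characters outside 7-bit ASCII are the ones worth spot-checking with the vendor.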