I'm fairly new to SAS and playing around with various dirty data files. I'm importing a delimited text file and the � character appears in some values; when it does, SAS ignores the delimiter for that observation.
How do I handle the �? It isn't recognised as a character in a string (I can't search for it and remove it). I've tried using the option encoding = "UTF-8".
libname B "/folders/myfolders/sas_play/data";
run;
data B.beer;
%let _EFIERR_=0;
infile '/folders/myfolders/sas_play/data/beer.txt' encoding = "UTF-8" delimiter=';' firstobs=2 MISSOVER dsd;
length brand $ 32 brewer $ 32 percent $ 8 carbohydrates $ 8;
input brand $ brewer $ percent $ calories carbohydrates $;
if _ERROR_ then call symputx('_EFIERR_',1);
run;
/*--------------------------------------------------*/
data beer_a;
set B.beer;
brand = strip(brand);
brand = tranwrd(brand, 'Carlsburg', 'Carlsberg');
if brand = 'NA' and brewer = 'NA' then delete;
if find(brand, 'Coopers') then brewer = 'Coopers';
if find(brand, 'Carlsberg') then brewer = 'Carlsberg';
run;
Sas Output...
Atom View...
Text File...
This does not seem to be a UTF-8 encoded file, because reading it as wlatin1 worked for me:
proc format;
invalue mycal
'NA' = .
"*" = .
other=[32.]
;
run;
data beer;
infile '/folders/myfolders/beer.txt' encoding = "wlatin1" delimiter=';' firstobs=2 truncover dsd;
length brand $ 32 brewer $ 32 percent $ 8 carbohydrates $ 8;
input brand $ brewer $ percent $ calories :mycal. carbohydrates $;
run;
You can use the custom informat as a blueprint for other informats to read the values as numbers.
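As a rough sketch of what "blueprint" could mean here, the same pattern can be applied to carbohydrates so that it is read as a number too. The informat name mycarb is made up for illustration, and this assumes the carbohydrates column holds only plain numbers, 'NA' or '*'; the actual values aren't shown in the thread. percent is left as character because its format isn't shown either.

```sas
/* Hypothetical informat modeled on mycal; the name mycarb is an assumption. */
/* The UPCASE option uppercases the raw value before comparison,            */
/* so 'na' and 'Na' are treated like 'NA'.                                   */
proc format;
invalue mycarb (upcase)
'NA' = .
'*' = .
other=[32.]
;
run;

/* Same read as above, but carbohydrates is now numeric.        */
/* Assumes the file contains only numbers, NA or * in that column. */
data beer;
infile '/folders/myfolders/beer.txt' encoding = "wlatin1" delimiter=';' firstobs=2 truncover dsd;
length brand $ 32 brewer $ 32 percent $ 8;
input brand $ brewer $ percent $ calories :mycal. carbohydrates :mycarb.;
run;
```

If a column turns out to contain other placeholder strings, they would show up as invalid-data notes in the log and could be added to the informat as further 'value' = . lines.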
Thanks for the advice Kurt, working as per your suggestion 🙂
Thanks for the advice - will make use of Notepad++ in future 🙂