kerrmc36
Fluorite | Level 6

I'm fairly new to SAS and am experimenting with various dirty data files. I'm importing a delimited text file, and the � character appears in some values; for those observations SAS then ignores the delimiter.

How do I handle the �? It isn't recognised as a character in a string (I can't search for it and remove it). I've tried using the option encoding = "UTF-8".

 

libname B "/folders/myfolders/sas_play/data";
run;

data B.beer;
	%let _EFIERR_=0;
	infile '/folders/myfolders/sas_play/data/beer.txt' encoding = "UTF-8" delimiter=';' firstobs=2 MISSOVER dsd;
	length brand $ 32 brewer $ 32 percent $ 8 carbohydrates $ 8;
	input brand $ brewer $ percent $ calories carbohydrates $;
	if _ERROR_ then call symputx('_EFIERR_',1);
run;

/*--------------------------------------------------*/

data beer_a;
	set B.beer;
	brand = strip(brand);
	brand = tranwrd(brand, 'Carlsburg', 'Carlsberg');
	if brand = 'NA' and brewer = 'NA' then delete;
	if find(brand, 'Coopers') then brewer = 'Coopers';
	if find(brand, 'Carlsberg') then brewer = 'Carlsberg';
run;

 

SAS output (image: text_view.png)

Atom view (image: atom_view.png)

Text file (image: sas_output.png)

1 ACCEPTED SOLUTION

Kurt_Bremser
Super User

This does not seem to be a UTF-8-encoded file, because reading it with encoding = "wlatin1" worked for me:

proc format;
invalue mycal
  'NA' = .
  "*" = .
  other=[32.]
;
run;

data beer;
  infile '/folders/myfolders/beer.txt' encoding = "wlatin1" delimiter=';' firstobs=2 truncover dsd;
  length brand $ 32 brewer $ 32 percent $ 8 carbohydrates $ 8;
  input brand $ brewer $ percent $ calories :mycal. carbohydrates $;
run;

You can use the custom informat as a blueprint for other informats to read the values as numbers.
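
As a rough sketch of that blueprint idea, the same informat could be applied to the other measurement columns so that they are read as numbers too. This assumes that percent and carbohydrates contain plain numbers apart from the 'NA' and '*' markers (the thread does not show the data, so that is a guess; a value such as "4.5%" would need different handling). The data set name beer_num is made up for illustration.

```sas
/* Informat from the answer above: 'NA' and '*' become missing,
   everything else is read with the standard numeric informat 32. */
proc format;
  invalue mycal
    'NA'  = .
    '*'   = .
    other = [32.]
  ;
run;

data beer_num;  /* hypothetical output data set name */
  infile '/folders/myfolders/beer.txt' encoding = "wlatin1" delimiter=';' firstobs=2 truncover dsd;
  length brand $ 32 brewer $ 32;
  /* Assumption: percent and carbohydrates use the same missing markers as calories */
  input brand $ brewer $ percent :mycal. calories :mycal. carbohydrates :mycal.;
run;
```

If one of the columns uses other placeholder strings, a separate informat with its own list of markers would be the safer choice.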


4 REPLIES

kerrmc36
Fluorite | Level 6

Thanks for the advice Kurt, working as per your suggestion 🙂

Ksharp
Super User
Your beer.txt may not be encoded as UTF-8. Open it in Notepad++ and check which encoding it reports.
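
If Notepad++ is not at hand, a similar check can be sketched in SAS itself by dumping the first few records as hex and looking at the bytes behind the non-ASCII characters. This is only a minimal sketch; the path is the one used in the accepted solution, and the obs=5 and $hex100. values are arbitrary choices for illustration.

```sas
/* Print the first few records as text and as hex to inspect the raw bytes */
data _null_;
  infile '/folders/myfolders/beer.txt' obs=5;
  input;                   /* load the record into _infile_ without parsing it */
  put _infile_;            /* the record as text */
  put _infile_ $hex100.;   /* the first 50 bytes of the record in hex */
run;
```

As a rule of thumb, an accented letter stored as a single byte (for example E9 for "é") points to a single-byte encoding such as wlatin1, whereas in UTF-8 the same letter takes two bytes (C3 A9).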
kerrmc36
Fluorite | Level 6

Thanks for the advice - I will make use of Notepad++ in future 🙂
