rgettys
Fluorite | Level 6

I am importing a dataset with thousands of names, many of which have an accented a, e, n, i, o, or u.

In a data step, I tried the following:

varname=translate(varname,"o","ó");
put varname;

and

varname=tranwrd(varname,"í","i");
put varname;

But neither worked. In the dataset I am still getting � as my response and I am not able to get proc freq output without exporting it to HTML or csv first. It makes analysis really clunky. 

 

I am using EG 7.13

 

Any suggestions?

7 REPLIES
rgettys
Fluorite | Level 6

I tried it, but it returns blanks instead of letters, i.e. méxico comes back as m xico.

ballardw
Super User

I would check your dataset's encoding attribute. What you are showing sounds almost like a double-byte character set.

 

Can you show us what the result of this code would be for one of your trouble values:

 

put 'Before ' varname=;

varname=translate(varname,"o","ó");
put 'After ' varname=;

 

This would let us see if the translate is even affecting the value.
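
Going one step further, a hex dump makes the stored bytes visible, which is often the quickest way to tell a single-byte value from a multi-byte one. This is only a sketch: `have` is a placeholder dataset name, and the byte values in the comments assume the usual Latin-1/Windows-1252 and UTF-8 encodings of ó.

```sas
data _null_;
  set have (obs=5);            /* placeholder dataset name */
  n_bytes = length(varname);   /* LENGTH counts bytes */
  n_chars = klength(varname);  /* KLENGTH counts characters in the session encoding */
  put varname= n_bytes= n_chars=;
  put varname $hex40.;         /* ó is F3 in latin1/wlatin1, C3B3 in UTF-8 */
run;
```

If n_bytes comes out larger than n_chars, the value holds multi-byte characters. If the two match but the hex still shows C3 followed by another byte, the data is probably UTF-8 sitting untranscoded in a single-byte session.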

 

 

rgettys
Fluorite | Level 6

The original database has México Mérida

 

after running 

Data test; set original;
put 'méxico' var4=;
var4=translate(var4,"e","é");
put 'mexico' var4=;
run;

the output is "Me xico Me rida" which is closer but still strange.

rgettys
Fluorite | Level 6

I am not sure why this is working now, but it is. I didn't change any settings during proc import and did not update EG 7.13 to a new version, but accented characters no longer show as � and I am able to work with the variables.

Data test; set dataset;
var4=tranwrd(var4,"é","e");
var4=tranwrd(var4,"í","i");
var4=tranwrd(var4,"ã","a");
var4=tranwrd(var4,"ó","o");
var4=tranwrd(var4,"á","a");
var4=tranwrd(var4,"â","a");
var4=tranwrd(var4,"ñ","n");
var4=tranwrd(var4,"ú","u");
run;

Now any response in the column, whether it be México México City or Guatemala Cobán etc. changes with the code above. I must have had a coding error in previous versions, but that does not explain why in the original dataset the accented values returned as �.

Patrick
Opal | Level 21

SAS sessions run with a defined session encoding, which is either single-byte or multi-byte.

 

Your source data will also have an encoding and this encoding can be different from your SAS session encoding. When SAS reads your data it needs to use a translation table to map the source character encoding to the target character encoding.
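
To compare the two encodings in practice, something like the following should work on SAS 9.4. `original` is the dataset name used earlier in this thread; substitute your own.

```sas
/* Session encoding, shown two ways */
proc options option=encoding;
run;
%put NOTE: Session encoding is &sysencoding;

/* Dataset encoding: look for the "Encoding" row in the attributes table */
proc contents data=original;
run;
```

If the two values differ, SAS has to transcode on every read, and that is where the problems listed below can creep in.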

 

The documentation for all of this is here:

http://support.sas.com/documentation/cdl/en/nlsref/69741/HTML/default/viewer.htm#n1au6s0oh1rp4en1nbp...

 

 

Several things can go wrong:

1. There is no 1:1 character mapping possible. That can happen if you run your SAS session in single-byte mode but the source is in a multi-byte encoding like UTF-8 and contains multi-byte characters that simply can't be mapped 1:1 to a single-byte representation.

SAS will throw a transcoding error in such cases.

 

2. Your source data's encoding is "misleading" and SAS assumes the wrong encoding. I've seen this happen for UTF-8 without a BOM. If there is no transcoding error, SAS may be using the wrong character mapping, and you get garbled characters.

 

3. The last option is that the character mapping between source and target works, but your client uses a different encoding and prints a different symbol. In this case your internally stored value is fine and the problem is only in how it is displayed.
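
Multi-byte characters may also explain the "Me xico" result reported earlier. TRANSLATE works byte by byte, so if the session was UTF-8, the literal "é" is two bytes (C3 A9) while "e" is one: the first byte maps to e, and the second, having no counterpart in the to-string, becomes a blank. That is a guess rather than something confirmed in this thread. A character-aware sketch with KTRANSLATE, reusing the replacement pairs from the TRANWRD code above and assuming the dataset name `original`, could look like this:

```sas
data test;
  set original;
  /* KTRANSLATE(source, to-1, from-1, to-2, from-2, ...):      */
  /* each pair is matched by character rather than by byte     */
  var4 = ktranslate(var4, "e", "é", "i", "í", "a", "ã", "o", "ó",
                          "a", "á", "a", "â", "n", "ñ", "u", "ú");
run;
```

The program file itself has to be saved in an encoding the session reads correctly, otherwise the accented literals are garbled before the function ever sees them.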

 

Given all of the above it's actually amazing that things work most of the time without us having to provide specific instructions to SAS (like parameter options for inencoding and encoding per file/table).
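
As an illustration of such explicit instructions, here is a sketch with placeholder paths and names, assuming the source really is UTF-8. ENCODING= on FILENAME, INENCODING= on LIBNAME and the ENCODING= dataset option are documented SAS options, but the right value depends on your actual data. The three parts are alternatives, not steps to run together.

```sas
/* External text file: state the file's encoding on the fileref */
filename src "/path/to/names.csv" encoding="utf-8";    /* placeholder path */

proc import datafile=src out=work.names dbms=csv replace;
run;

/* SAS library: override the encoding assumed when reading its datasets */
libname srclib "/path/to/library" inencoding="utf-8";  /* placeholder path */

/* Single dataset: override the encoding for this read only */
data work.names2;
  set srclib.names (encoding="utf-8");                 /* placeholder dataset */
run;
```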

 

 

Thanks,

Patrick

 

 

