Shayan2012
Quartz | Level 8

Hello Everyone, 

 

The title might not be accurate since I am not familiar with encoding, but here is my problem in simple words: I have a variable that holds a list of people's names. Some of these names are Spanish or French, so they contain characters which I believe are called "hexadecimal characters", such as an E with an accent above it, or a lowercase i with an umlaut above it. (I don't know how to type them; some examples are attached in the picture.)

 

I want to convert all of them into regular characters, for example an E with dots above it into a plain E, and so on.

 

I thought the COMPRESS function would be the right tool, so I first tried to keep only the alphabetic characters, like this:

 

data test2;
   set test;
   names_translate = compress(name2,'','ka');
run;

Unfortunately it does not work, and those characters remain. I played with other modifiers, such as 'c' or 'w', but those do not give me what I want either. Is there a neat method with the COMPRESS function, or any other function, that gives the desired result? The picture below shows what I have and what I want to get as output.

 

[Attached image "Example": the names as they are now and the desired output]

 

 


Accepted Solutions
ballardw
Super User

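The function you want is TRANSLATE. The characters are more likely to be "high-order ASCII" (extended ASCII) or similar: single-byte characters with codes from 128 to 255, above the 7-bit ASCII range of 0 to 127.

This data set may help. It lists each code next to the character it displays as in your session encoding: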

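/* One row per code; 127 is DEL, 128-255 are the high-order characters. */
data work.highorderascii;
   do i= 127 to 255;
      char = byte(i);   /* BYTE returns the character with code i */
      output;
   end;
run;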
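
To see which of those codes actually occur in your names, something like the following might work. It is only a sketch: it reuses the data set TEST and variable NAME2 from the question, the names FOUND_CODES, CH, POS and CODE are made up for illustration, and it assumes a single-byte session encoding such as latin1 or wlatin1.

```sas
/* Sketch only: assumes a single-byte session encoding.            */
/* TEST and NAME2 come from the question; other names are made up. */
data work.found_codes;
   set test;
   length ch $ 1;
   do pos = 1 to length(name2);
      ch   = char(name2, pos);    /* character at this position             */
      code = rank(ch);            /* its position in the collating sequence */
      if code > 127 then output;  /* keep only the high-order characters    */
   end;
   keep ch code;
run;

/* One row per distinct high-order character, with how often it occurs. */
proc freq data=work.found_codes;
   tables code*ch / list;
run;
```

The frequency table then tells you which characters need an entry in the replacement strings.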

Here is an example using the TRANSLATE function that may work for you.

 

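/* translate(source, to, from): each character of the "from" string found */
/* in x is replaced by the character at the same position in "to".        */
data example;
   x='Andrè';
   y=translate(x,'AAAAAAACEEEEIIIIDNOOOOO OUUUUY Saaaaaaaceeeeiiiidnooooo ouuuuy y',
                 'ÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖ×ØÙÚÛÜÝÞßàáâãäåæçèéêëìíîïðñòóôõö÷øùúûüýþÿ');
run;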

Each character in the first long string replaces the character at the same position in the second string, which is why I show them one above the other. The comparison is case sensitive, and I have used what I believe are the common replacements when going into English. Characters with no obvious letter equivalent (×, ÷, Þ, þ) are mapped to a blank. If you need a different rule, the two strings are easy to adjust.
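
Applied to the data from the question, the same call might look like the sketch below. TEST, NAME2, TEST2 and NAMES_TRANSLATE are the names used in the original post; the step assumes a single-byte session encoding, because TRANSLATE replaces one byte at a time.

```sas
/* Sketch only: data set and variable names are taken from the question. */
/* Assumes a single-byte session encoding (for example latin1/wlatin1).  */
data test2;
   set test;
   /* Same "to" and "from" strings as in the example above. */
   names_translate = translate(name2,
      'AAAAAAACEEEEIIIIDNOOOOO OUUUUY Saaaaaaaceeeeiiiidnooooo ouuuuy y',
      'ÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖ×ØÙÚÛÜÝÞßàáâãäåæçèéêëìíîïðñòóôõö÷øùúûüýþÿ');
run;
```

Because TRANSLATE maps one character to exactly one character, it cannot expand Æ into AE; that would need a separate step such as TRANWRD before the TRANSLATE call. If your session runs in UTF-8, accented letters take more than one byte and this approach is likely to garble them. KTRANSLATE is the character-aware counterpart, and some releases also document a BASECHAR function, so it is worth checking the documentation for your release.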

 


Shayan2012
Quartz | Level 8
Thanks a lot, ballardw. That is exactly what I was looking for!
