Removing Special Characters

Contributor
Posts: 70

Removing Special Characters

Hello everyone,

I'm trying to remove special characters found in data feeds that I have inherited.

I'd like to use the COMPRESS function to remove them, but I'm having trouble getting rid of them.

Depending on where I copy the special character from, it shows up either as a box or as a question mark inside a diamond/square (shown below).

In SAS it shows up as a box.

In the article it has the decimal value 157 and the hex value 9D.

This is the article that is assisting me.

http://www.lexjansen.com/pharmasug/2010/cc/cc13.pdf

%let testing =testing string�;

data _null_;
  testing = COMPRESS("&testing",,'kw');
run;

What am I doing wrong? It seems so simple, but the special character is still there in my results.

Any suggestions?


Accepted Solutions
Solution
‎11-19-2013 02:59 PM
Regular Contributor
Posts: 244

Re: Removing Special Characters

I get (in wlatin1):

x=Firma Vriend Roll�

The last 3 bytes are EFBFBD, which is the UTF-8 encoding of U+FFFD, the replacement character. That is the diamond question mark you see (wlatin1 doesn't display it properly). It means the original character's value was already lost before the data reached you. You can compress it out as an individual character: try "FFFD"x or "EFBFBD"x; one or the other might work (I don't have a UTF-8 version available at the moment, sorry).
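
A minimal sketch of that idea, assuming a UTF-8 session in which the stray character really is stored as the three bytes EFBFBD. It swaps in TRANSTRN for COMPRESS (not what the post literally suggests), because COMPRESS removes each listed byte wherever it occurs and could damage other multibyte characters, while TRANSTRN only removes the exact three-byte sequence. The variable names are made up for illustration.

```sas
data _null_;
  length x cleaned $40;
  /* Build the test value from hex so the example does not depend on
     how the editor encodes the program file */
  x = 'Firma Vriend Roll' || 'EFBFBD'x;
  /* Replace the exact byte sequence EF BF BD with a zero-length string */
  cleaned = transtrn(x, 'EFBFBD'x, trimn(''));
  /* LENGTH counts bytes here; a shorter value means the sequence was removed */
  len_before = length(x);
  len_after  = length(cleaned);
  put len_before= len_after=;
  put cleaned=;
run;
```

If the byte length does not drop, the character is probably stored as something other than EFBFBD, and the hex dump of the value is the place to look.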


All Replies
Regular Contributor
Posts: 244

Re: Removing Special Characters

I think the character is being automatically converted to "?" (an actual question mark character), presumably because your session is not a Unicode session.  You can either run in Unicode (how to do so depends on your SAS mode, and Unicode support may not have been installed by default by your SAS administrator) or you can just remove '?'.  You also might be able to remove the specific character in the incoming data feed, depending on where the data is brought in from.

Contributor
Posts: 70

Re: Removing Special Characters

Thank you, Snoopy369.

Changing the file is not an option.

Within SAS I see the character I would like to remove as a box.

I'm looking to catch these going forward as well, so what I would like is code and/or an expression I can apply to the field in question.

Grand Advisor
Posts: 17,325

Re: Removing Special Characters

Try some different options?

E.g., keep only spaces and alphabetic characters, perhaps?

COMPRESS("&testing",,'kas');

Contributor
Posts: 70

Re: Removing Special Characters

Reeza,

I would have thought so, but I'm obviously doing something wrong in the code, because none of the options work correctly even when just manipulating the string. I just tried that and I'm not getting correct results. It's not removing the special character.

Grand Advisor
Posts: 17,325

Re: Removing Special Characters

It did for me with the example you provided.

Does it work for that example but not for others?

Please post your full testing code.

Contributor
Posts: 70

Re: Removing Special Characters

The code above is all I have, and it does not remove the special character in Enterprise Guide or in the Code Editor in SAS DI.

Contributor
Posts: 70

Re: Removing Special Characters

Hey guys, what I ended up doing to solve this issue was adding Latin1 to the Infile statement so that SAS reads the data correctly.

Data within the SAS DI environment was showing these special characters.

After I entered Latin1 in the source file properties (Advanced), the data rendered correctly.
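
For anyone doing this in hand-written code rather than through the DI Studio file properties, a rough sketch of what that might look like. The file path, dataset name, and record layout are hypothetical; ENCODING= is the INFILE option that tells SAS how the external file is encoded, so the bytes are transcoded into the session encoding as they are read.

```sas
data work.feed;
  /* Hypothetical path; encoding= declares the encoding of the external file */
  infile "/path/to/feed.txt" encoding="latin1" truncover;
  /* Hypothetical layout: read each record as one 200-byte character field */
  input line $char200.;
run;
```

Whether latin1 or wlatin1 is the right value depends on how the feed was actually produced; latin1 is simply what worked in this thread.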

Regular Contributor
Posts: 244

Re: Removing Special Characters

What version of SAS do you have?  Desktop Display Manager, Enterprise Guide, Server EG, or batch?  What is the default language?  If 9.3 or 9.4, did you install it as a Unicode server?  What does this code show in your SAS session?

%put &sysencoding;

What shows up when you do:

data _null_;
  x="&testing";
  put x= HEX.;
run;
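
A related sketch for catching these going forward: loop over the bytes of a value and report any byte above 127, i.e. anything outside plain ASCII. The test value is built from hex for illustration and the variable names are made up. Note that in a UTF-8 session this also flags legitimate accented characters, so treat it as a diagnostic and not as a filter.

```sas
data _null_;
  /* Illustrative value: 'Roll' followed by the bytes EF BF BD */
  x = 'Roll' || 'EFBFBD'x;
  do i = 1 to length(x);
    /* RANK returns the numeric code of a single byte */
    b = rank(substr(x, i, 1));
    if b > 127 then put 'Non-ASCII byte at position ' i 'hex ' b hex2.;
  end;
run;
```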

Contributor
Posts: 70

Re: Removing Special Characters

I have not gotten it to work for any of the examples.

I used the code editor in SAS Data Integration Studio 4.6 and also tried it in 5.1.

We are using SAS 9.3 and will shortly be moving to 9.4, so I believe we should be fine as far as versions go.

The encoding is UTF-8; we changed it from Latin (the default).

It returns utf-8 and

x=4669726D6120567269656E6420526F6C6CEFBFBD

New Contributor
Posts: 2

Re: Removing Special Characters

It may be too late, but I was working on the same issue and found a solution: TRANWRD(KPROPDATA(COLUMNA, 'hex', 'utf-8'), '\xc4', ' '). This replaces the character with a space; to replace it with something else, put that in place of the space.
