jdmarshg
Obsidian | Level 7

Hello everyone,

I'm trying to remove special characters from data feeds that we have inherited.

I want to use the COMPRESS function to remove them, but I can't get rid of this one.

Depending on where I paste the character, it shows up either as a box or as a question mark inside a diamond/square (shown below).

In SAS it shows up as a box.

In the article linked below, it is listed with a decimal value of 157 and a hex value of 9D.

This is the article I'm working from:

http://www.lexjansen.com/pharmasug/2010/cc/cc13.pdf

%let testing =testing string�;

data _null_;
  testing = COMPRESS("&testing",,'kw');
run;

What am I doing wrong? It seems so simple, but the special character is not being removed.

Any suggestions?

ACCEPTED SOLUTION
snoopy369
Barite | Level 11

I get (in wlatin1):

x=Firma Vriend Roll�

The last three bytes are EFBFBD, which is the UTF-8 encoding of U+FFFD, the replacement character: the diamond question mark you see (wlatin1 doesn't display it properly). That means the value of the character that was originally there has already been lost. You can still compress out the replacement character itself: try "FFFD"x or "EFBFBD"x, one or the other might work (I don't have a UTF-8 session available at the moment, sorry).
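
A minimal sketch of that suggestion, assuming a UTF-8 session in which the unwanted character is stored as the three bytes EF BF BD. The variable names and the test value are illustrative, not taken from the original data. It uses TRANSTRN rather than COMPRESS because COMPRESS treats its second argument as a list of single bytes, so COMPRESS(x, 'EFBFBD'x) would also strip EF, BF, or BD wherever they occur inside other multibyte characters.

```sas
data _null_;
  length x clean $60;

  /* Illustrative test value: ASCII text followed by the bytes EF BF BD */
  x = cats('Firma Vriend Roll', 'EFBFBD'x);

  /* Delete the whole 3-byte sequence; TRIMN('') gives a zero-length replacement */
  clean = transtrn(x, 'EFBFBD'x, trimn(''));

  /* Hex dumps before and after; trailing 20s are blank padding */
  put x= $hex120.;
  put clean= $hex120.;
run;
```

If the hex dump of clean no longer contains EFBFBD, the same TRANSTRN expression can be applied to the real column.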


11 REPLIES
snoopy369
Barite | Level 11

I think the character is getting automatically converted to "?" (actual question mark character), presumably by your session not being a Unicode session.  You can either run in Unicode (how to do so depends on your SAS mode, and may not have been installed by default by your SAS administrator) or you can just remove '?'.  You also might be able to remove the specific character in the incoming data feed, depending on where the data is brought in from.

jdmarshg
Obsidian | Level 7

Thank you, Snoopy369.

Changing the file is not an option.

Within SAS I see the character as a box, and that is what I would like to remove.

I also want to catch these going forward, so I would like to put code or an expression on the field in question.

Reeza
Super User

Try some different options?

For example, keep only spaces and alphabetic characters:

COMPRESS("&testing",,'kas');

jdmarshg
Obsidian | Level 7

Reeza,

I would have thought so, but I'm obviously doing something wrong in the code, because none of the options work correctly, even when just manipulating the string. I just tried that and I'm still not getting the correct result: it's not removing the special character.

Reeza
Super User

It worked for me with the example you provided.

Does it work for that example but not for others?

Please post your full testing code.

jdmarshg
Obsidian | Level 7

The code above is all I have, and it does not remove the special character in Enterprise Guide or in the Code Editor in SAS DI.

jdmarshg
Obsidian | Level 7

Hey guys, what I ended up doing to solve this issue was to add Latin1 as the encoding on the INFILE statement so that SAS reads the data correctly.

Data within the SAS DI environment was showing these special characters.

After I entered Latin1 in the source file properties (Advanced), the data rendered correctly.
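
A minimal sketch of that fix in plain code, assuming the feed is a Latin1-encoded text file read in a UTF-8 session. The file path, data set name, and input layout are hypothetical placeholders; only the ENCODING= option reflects the change described above.

```sas
/* Hypothetical path and layout; ENCODING= tells SAS how the file's bytes are encoded */
data work.feed;
  infile "/path/to/feed.txt" encoding="latin1" truncover;

  /* Read each record as one character field; allow room for transcoding expansion */
  input line $char400.;
run;
```

With ENCODING= set, SAS transcodes the incoming bytes to the session encoding as it reads them, instead of interpreting them as if they were already in the session encoding.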

snoopy369
Barite | Level 11

What version of SAS do you have? Desktop Display Manager, Enterprise Guide, server EG, or batch? What default language? If 9.3 or 9.4, did you install as Unicode Server? What does this code show in your SAS session?

%put &sysencoding;

What shows up when you run this?

data _null_;
  x="&testing";
  put x= HEX.;
run;

jdmarshg
Obsidian | Level 7

I have not gotten it to work for any of the examples.

I used the code editor in SAS Data Integration Studio 4.6 and also tried it in 5.1.

We are using SAS 9.3 and will shortly be using 9.4 so I believe I should be good as versions go.

The encoding is UTF-8; we changed it from Latin (the default).

The code returns utf-8 and

x=4669726D6120567269656E6420526F6C6CEFBFBD

SukhwinderSingh
Calcite | Level 5

It may be too late, but I was working on the same issue and found a fix: TRANWRD(kpropdata(COLUMNA ,'hex', 'utf-8'),'\xc4',' '). This replaces the character with a space; to replace it with something else, put that in place of the space.
