Hello everyone,
I'm trying to remove special characters found in data feeds that we have inherited.
I'd like to use the COMPRESS function to remove them, but I'm having trouble getting rid of them.
Depending on where I copy the special character, it shows as a box or as a question mark inside a diamond/square.
In SAS it shows up as a box.
In the article below, it is listed with a decimal value of 157 and a hex value of 9D.
This is the article I have been working from:
http://www.lexjansen.com/pharmasug/2010/cc/cc13.pdf
%let testing =testing string�;
data _null_;
testing = COMPRESS("&testing",,'kw');
run;
What am I doing wrong? It seems so simple, but my results do not show the character being removed.
Any suggestions?
I get (in wlatin1):
x=Firma Vriend Roll�
The last 3 bytes are EFBFBD, which is the UTF-8 encoding of U+FFFD, the replacement character: the diamond question mark you see (wlatin1 doesn't parse that properly). That means the original character's value had already been lost before this point. You can remove it as a specific value: try "FFFD"x or "EFBFBD"x, one or the other might work (I don't have a UTF-8 version available at the moment, sorry).
I think the character is getting automatically converted to "?" (an actual question mark character), presumably because your session is not a Unicode session. You can either run in Unicode (how to do so depends on your SAS mode, and Unicode support may not have been installed by default by your SAS administrator) or you can just remove '?'. You also might be able to remove the specific character in the incoming data feed, depending on where the data is brought in from.
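For anyone landing here later, a minimal sketch of removing that 3-byte sequence, assuming the value really does contain the bytes EF BF BD as in the hex dump posted in this thread. It uses TRANSTRN rather than COMPRESS, on the assumption that COMPRESS with a character list would drop each of the bytes EF, BF and BD wherever they occur, which could damage other multibyte characters. The variable names are made up for illustration.

```sas
data _null_;
  length x cleaned $40;
  /* Sample value from this thread: text followed by the bytes EF BF BD */
  x = cats('Firma Vriend Roll', 'EFBFBD'x);
  /* TRANSTRN replaces the whole byte sequence; TRIMN('') supplies a */
  /* zero-length replacement, so nothing is left in its place        */
  cleaned = transtrn(x, 'EFBFBD'x, trimn(''));
  /* Print the text and its hex dump to check the result */
  put cleaned= / cleaned= $hex80.;
run;
```

If it worked, the hex dump should no longer contain EFBFBD, and anything after 6C6C ("ll") should just be 20 (blank padding).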
Thank you, Snoopy369.
Changing the file is not an option.
Within SAS I see the character as a box, and that is the character I would like to remove.
I'm also looking to catch these going forward, so what I would like is code and/or an expression applied to the field in question.
Try some different options?
e.g. keep only spaces and alphabetic characters, perhaps?
COMPRESS("&testing",,'kas');
Reeza,
I would have thought so, but I'm obviously doing something wrong in the code, because none of the options work correctly even for simply manipulating the string. I just tried that and am not getting correct results. It's not removing the special character.
It did for me with the example you provided.
Does it work for that example but not for others?
Please post your full testing code.
The code above is all I have, and it does not remove the special character in Enterprise Guide or in the Code Editor in SAS DI.
Hey guys, what I ended up doing to solve this issue was to add Latin1 to the INFILE statement so that SAS reads the data correctly.
Data within the SAS DI environment was showing these special characters.
Once I entered Latin1 in the source file properties (Advanced), it was able to render the data correctly.
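Outside of DI Studio, the same idea can be sketched in plain code with the ENCODING= option on the INFILE statement. This is only an illustration: the file path, data set name and record layout below are hypothetical, not the actual feed.

```sas
/* Hypothetical path to the incoming feed */
filename feed 'feed.txt';

data work.feed_raw;
  /* ENCODING= tells SAS how the external file is encoded, so it can */
  /* transcode the bytes to the session encoding while reading       */
  infile feed encoding='latin1' truncover;
  /* Hypothetical layout: read each record as one 200-byte field */
  input line $char200.;
run;
```

Running `put line= $hex.;` on a few records afterwards is a quick way to check whether EFBFBD still appears.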
What version of SAS do you have? Desktop Display Manager, Enterprise Guide, server EG, or batch? What default language? If 9.3 or 9.4, did you install as a Unicode server? What does this code show in your SAS session?
%put &sysencoding;
What shows up when you run this?
data _null_;
x="&testing";
put x= HEX.;
run;
I have not gotten it to work for any of the examples.
I used the code editor in SAS Data Integration Studio 4.6 and also tried it in 5.1.
We are using SAS 9.3 and will shortly be using 9.4 so I believe I should be good as versions go.
The encoding is UTF-8; we changed it from Latin (the default).
The code returns utf-8 and
x=4669726D6120567269656E6420526F6C6CEFBFBD
It may be too late, but I was working on the same issue and found a fix: TRANWRD(kpropdata(COLUMNA ,'hex', 'utf-8'),'\xc4',' '). This replaces the character with a space; put whatever you like there in place of the space.