Dear forum experts,
In our data, some DBCS characters are arriving as Unicode character codes. For example, the symbol for mu is converted to &#956;. We have a lot of such characters in our RDE data.
We are not sure how to handle this text. These characters are important, and we do not know how to process them.
Please let me know.
Please check the screenshot also.
Thanks for your help.
Anand
How do you want them handled? Do you want them stripped out? Read in and displayed as mu?
Thanks Reeza. I did strip those characters, but we have since learned that they are required, so we have to convert them back to their actual values.
Well, I'm a little confused by the representation of the Unicode characters that you're seeing, but I'll offer my 2 cents. The format "&#n;" is, in the Unicode world, called the "numeric character representation" or NCR, where "n" is the decimal code point of the character and the other characters are literal. In your screenshot, I'm afraid I don't know what the leading "/" or the trailing "l" are for. In any event, you should be able to strip out those extra characters and then convert what's left with the SAS UNICODE() function. Here's an example:
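data one;
  /* wbcoth_uni holds the NCR text exactly as it arrives in the data */
  input wbc wbcoth_uni $;
  /* convert the NCR to a character in the session encoding */
  wbcoth = unicode(wbcoth_uni,'ncr');
  /* datalines4 is used because the data itself contains a semicolon */
datalines4;
3690 &#956;
;;;;
run;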
When I open the table "one" in ViewTable, I see a mu in the wbcoth column. Please note that you do need to be running the unicode version of SAS, which may not be the default at your institution. On my Windows system, it's in the start menu-->All Programs-->SAS-->Additional Languages-->SAS 9.3 (unicode support).
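For the "strip, then convert" step, here is a minimal sketch. It assumes each NCR is wrapped exactly as described above, with a leading "/" and a trailing "l"; the raw values, the data set name "two", and the variable names are hypothetical, since the actual data is only visible in the screenshot. It also assumes UNICODE() converts NCRs embedded in longer text, so check the result on your own data.

```sas
/* Sketch only: the raw values below are hypothetical stand-ins for the
   data in the screenshot, with each NCR wrapped as "/&#n;l". */
data two;
  length raw cleaned wbcoth $40;
  infile datalines truncover;
  input raw $char40.;
  /* Remove the wrapper around every NCR and keep the surrounding text.
     The -1 tells PRXCHANGE to replace all matches, not just the first. */
  cleaned = prxchange('s/\/(&#\d+;)l/$1/', -1, raw);
  /* Convert the remaining NCRs to characters in the session encoding */
  wbcoth = unicode(cleaned, 'ncr');
datalines4;
/&#956;l
10 /&#956;lg
;;;;
run;

proc print data=two;
run;
```

If the wrapper characters in your data differ, only the regular expression needs to change. As with the example above, this needs to run in a Unicode session for the mu to display.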
HTH
Karl
Check some INFILE options:
infile x encoding=dbcs recfm= termstr=