05-11-2012 05:10 AM
In SAS EG 4.3 I have variables that contain both standard Roman alphanumeric characters and non-Roman characters, specifically Japanese. I have no problem importing or viewing the Japanese data values. However, SAS does not recognize any numeric/character text that is written in Japanese, so the standard character functions are effectively useless on it. I know this because I created the following test:
*Var1 includes Japanese text.
If the data value for Var1 contains any Roman alphanumeric character, the Test variable returns the position of the first such character. However, even when there is Japanese text in Var1, SAS does not recognize it and returns an incorrect value of Test=0.
How do I get SAS to recognize my non-roman alphanumeric Japanese characters?
Thank you in advance,
05-11-2012 07:20 AM
You should try the K versions of the character functions, such as:
klength(), ksubstr(), kcompress(), kindex(), ...
For details, check the SAS NLS documentation.
05-14-2012 05:58 AM
Thank you for the reply!
Your suggestion regarding the SAS NLS documentation pointed me in the right direction. Since I was not able to find any other forum discussion that specifically addresses this issue, I'll briefly describe how I resolved the problem, for future reference.
Many SAS character string functions assume 1-byte Roman alphanumeric characters, so non-Roman characters cannot be processed correctly with them. Japanese characters are stored as 2-byte characters (in a DBCS encoding) and thus must be handled with the DBCS functions [e.g., ksubstr(), kindex(), klength()]. I ended up using the klength() function to identify fields with non-missing values.
An entire list of DBCS functions can be found at:
Unfortunately, the list of available DBCS functions seems rather short. Although I was able to clear the first hurdle in cleaning my Japanese data set, I'm not sure whether I will run into further difficulty...