KUPCASE changing "μ" to "M" in SAS 9.4

Occasional Contributor
Posts: 7

Hi,

 

My understanding reading KUPCASE documentation is this will only upcase single byte characters.  The code below running in SAS 9.4 is changing the double byte character "μ" to "M".  I need this to stay as "μ" but upcase all other characters.  Any ideas?


data atest;

x = "μ";
len = length(x);
klen = klength(x);
up = upcase(x);
kup = kupcase(x);

run;

 

Thank you in advance for any help you can offer.

 

Regards,

--Nick


Accepted Solutions
Solution
‎08-03-2017 08:14 AM
Super Contributor
Posts: 345

Re: KUPCASE changing "μ" to "M" in SAS 9.4

[ Edited ]

There is nothing wrong with the behaviour of KUPCASE. µ is an ordinary lowercase letter of the Greek alphabet, and its uppercase form is Μ (capital mu), which looks just like the Latin M. See https://en.wikipedia.org/wiki/Mu_(letter)

 

@Patrick shared this tip:

You could use ktranslate() as below if you only want to target English letters.

 

data atest;
x = "aμX";
len = length(x);
klen = klength(x);
up = upcase(x);
kup = kupcase(x);
ktrans=ktranslate(x,'ABCDEFGHIJKLMNOPQRSTUVWXYZ','abcdefghijklmnopqrstuvwxyz');
put _all_;
run;
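
A quick way to check which character is actually stored is to print its bytes. The sketch below is only an illustration and assumes a UTF-8 SAS session; the hex literals are the UTF-8 encodings of the micro sign (U+00B5), Greek small mu (U+03BC), and Greek capital mu (U+039C). Unicode defines capital mu as the uppercase form of both of the first two, so if KUPCASE follows that mapping, kup_hex should read CE9C for each of them. The variable names are made up for this example.

```sas
/* Sketch: show the stored bytes of the look-alike characters.    */
/* Assumes a UTF-8 SAS session; hex literals are UTF-8 bytes.     */
data _null_;
  length ch kup $2 hex kup_hex $4;
  do ch = 'C2B5'x,   /* U+00B5 micro sign       */
          'CEBC'x,   /* U+03BC Greek small mu   */
          'CE9C'x;   /* U+039C Greek capital mu */
    hex     = put(ch, $hex4.);    /* bytes before upcasing */
    kup     = kupcase(ch);
    kup_hex = put(kup, $hex4.);   /* bytes after upcasing  */
    put ch= hex= kup= kup_hex=;
  end;
run;
```

If kup_hex shows CE9C rather than 4D, the upcased value is the Greek capital mu and not a Latin M, even though the two look the same in the log.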

 



All Replies
PROC Star
Posts: 7,468

Re: KUPCASE changing "μ" to "M" in SAS 9.4

You could always use the grunt approach:

data atest;
  length x $4;
  input x $;
  len = length(x);
  klen = klength(x);
  up = upcase(x);
  kup = kupcase(x);
  do _n_=1 to length(x);
    if rank(substr(x,_n_,1)) in (97:122) then substr(x,_n_,1)=upcase(substr(x,_n_,1));
  end;
  cards;
μ
abc
aBc
deμ
;

Art, CEO, AnalystFinder.com

 

Occasional Contributor
Posts: 7

Re: KUPCASE changing "μ" to "M" in SAS 9.4

Thanks for the input.  This will work, but it defeats the purpose of the KUPCASE function.  I guess I can put this RANK code, which selects the lowercase a-z characters, into an FCMP function and use that.
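
A minimal sketch of what such an FCMP function might look like. It swaps the RANK loop for TRANSLATE, which should have the same effect because the bytes of a multi-byte UTF-8 character never fall in the a-z range. The function name upcase_az, the library path work.funcs.text, and the return length of 200 are all assumptions made for this illustration.

```sas
/* Sketch only: upcase_az and work.funcs.text are hypothetical names. */
proc fcmp outlib=work.funcs.text;
  /* Returns s with only ASCII a-z converted to A-Z. */
  function upcase_az(s $) $ 200;
    return (translate(s, 'ABCDEFGHIJKLMNOPQRSTUVWXYZ',
                         'abcdefghijklmnopqrstuvwxyz'));
  endsub;
run;

/* Make the compiled function visible to the DATA step. */
options cmplib=work.funcs;

data _null_;
  x = "aμX";
  y = upcase_az(x);   /* only the "a" should change */
  put x= y=;
run;
```

Any value longer than the declared return length would be truncated, so the 200 would need adjusting to the data.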

 

I'm still interested in why kupcase doesn't work so I'm going to leave this open for now and hope someone adds additional info.

Respected Advisor
Posts: 4,173

Re: KUPCASE changing "μ" to "M" in SAS 9.4

@nmasel

You could use ktranslate() as below if you only want to target English letters.

data atest;
x = "aμX";
len = length(x);
klen = klength(x);
up = upcase(x);
kup = kupcase(x);
ktrans=ktranslate(x,'ABCDEFGHIJKLMNOPQRSTUVWXYZ','abcdefghijklmnopqrstuvwxyz');
put _all_;
run;
Super User
Posts: 7,039

Re: KUPCASE changing "μ" to "M" in SAS 9.4

I don't see any way to tell KUPCASE() which letter combinations make up lower/upper case pairs.

You could try using KTRANSLATE() to turn the mu into some other unused character and then translate it back.

data atest;
  length x $4;
  input x $;
  len = length(x);
  klen = klength(x);
  up = upcase(x);
  kup = kupcase(x);
  kup2 = ktranslate(kupcase(ktranslate(x,'|','μ')),'μ','|');
cards4;
μ
abc
aBc
deμ
;;;;
Occasional Contributor
Posts: 7

Re: KUPCASE changing "μ" to "M" in SAS 9.4

Thanks for the input. This would work too, but I would have to be certain of a character that I could substitute in and out. I'm wondering what else KUPCASE does not handle correctly in a UTF-8 environment. If there are several such characters, then art297's approach could limit the upcasing to just the characters that should be upcased.
Super User
Posts: 7,039

Re: KUPCASE changing "μ" to "M" in SAS 9.4

You should probably open a ticket with SAS support to find out exactly how KUPCASE() is matching upper and lower case letters.

 

In terms of finding an available character to use for the KTRANSLATE() trick I normally use COMPRESS() function.  So I guess for this you could use KCOMPRESS()?

 

Here is logic that works with single-byte character sets.  For example, to convert the letter X to a character that does not occur in STRING, you could use this:

possible_chars=collate(0,255);
unused=char(compress(possible_chars,string),1);
new_string=translate(string,unused,'X');
Super User
Posts: 11,343

Re: KUPCASE changing "μ" to "M" in SAS 9.4


nmasel wrote:

Hi,

 

My understanding reading KUPCASE documentation is this will only upcase single byte characters.  The code below running in SAS 9.4 is changing the double byte character "μ" to "M".  I need this to stay as "μ" but upcase all other characters.  Any ideas?


I believe you are misunderstanding the definition of the function: "Converts all single-width English alphabet letters in an argument to uppercase". And since enough Greek letters are used in certain English writing ...

That is NOT single-byte. KUPCASE functions at I18N Level 2 for string manipulation, which means that it can be used for SBCS, DBCS, and MBCS (UTF-8) data.

 

 

Occasional Contributor
Posts: 7

Re: KUPCASE changing "μ" to "M" in SAS 9.4

Posted in reply to andreas_lds

Thanks, this makes much more sense!  The data is upcased before I receive it, so "µg" turns into "MG", which is a very different unit.  All of the methods posted by others are great options for excluding µ from the KUPCASE.
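
For data that has already been upcased, one possible repair is to translate the capital mu back to the small mu. This is a sketch under two assumptions: the session is UTF-8, and the upcased data really contains the Greek capital mu (U+039C, bytes CE9C) rather than a Latin M. If it holds a Latin M, that character cannot be told apart from a genuine letter M and this approach does not apply.

```sas
/* Sketch: turn Greek capital mu back into small mu after upcasing. */
/* Assumes a UTF-8 session and that the data holds U+039C.          */
data _null_;
  length up fixed $8;
  up    = cats('CE9C'x, 'G');                 /* simulated upcased value */
  fixed = ktranslate(up, 'CEBC'x, 'CE9C'x);   /* capital mu -> small mu  */
  put up= fixed=;
run;
```

This changes every capital mu in the value, including any that belong to genuine uppercase Greek text, and it leaves the other letters upcased.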

PROC Star
Posts: 7,468

Re: KUPCASE changing "μ" to "M" in SAS 9.4

@nmasel: You've marked this as solved, but the problem is different than stated in your original post and I only see an explanation, not a solution.

If I understand it correctly now, you have a file that has already been upcased and now contains uppercase Greek characters, which you don't want.

I'd still go with a grunt approach, but slightly different than the one I originally suggested:

data atest;
  length x change_back $4;
  input x $;
  up = upcase(x);
  call missing(change_back);
  do _n_=1 to klength(up);
    if length(ksubstr(up,_n_,1)) ne klength(ksubstr(up,_n_,1)) then
     change_back=catt(change_back,lowcase(ksubstr(up,_n_,1)));
    else change_back=catt(change_back,ksubstr(up,_n_,1));
  end;
  cards;
μ
abc
AbC
dEμ
;

Art, CEO, AnalystFinder.com

 

Occasional Contributor
Posts: 7

Re: KUPCASE changing "μ" to "M" in SAS 9.4

@art297:  I see what you are saying.  I marked it as solved since the question in the subject line is solved.  Once I realized that uppercase µ should be M, I ended up with two new problems.

1. How to upcase everything but µ, which your first set of code, with a tweak to look only at µ, along with several other solutions posted here, can address.

2. How to change only uppercase µ back, which this code will handle nicely with a tweak to change back only µ.

I'm relatively new to these boards, so I'm not sure of the etiquette.  Should these two questions be posted in another thread with an appropriate subject line so that others can find them when searching?

 

Thank you for your time and effort on this topic!

PROC Star
Posts: 7,468

Re: KUPCASE changing "μ" to "M" in SAS 9.4

My only concern was that you had a solution to your problem. Yes, when a problem scope changes or expands it's always best to start a new thread. However, in this case, all apparently turned out well and you now have all you need for getting what you want.

 

Art, CEO, AnalystFinder.com

 

☑ This topic is solved.
