data _null_;
v1 = "aYT;-/*A - تwsilيتيتddيت يlămâieD @يlămâieD D^)M";
v2=upcase(v1);
v3 = prxchange("s/[^-A-Z]/ /",-1,v2);
v4 = compress(v3, , 'kw');
put (_all_) (=/);
run;
I want to keep only English letters, the hyphen "-", and Arabic characters. This code works fine for the English letters and the hyphen, but it strips out the Arabic characters. How can I keep the Arabic characters too?
Did you try letter ranges like A-Z?
Or even adding all the letters to the character class [ ]?
Also, if the Arabic letters are encoded in UTF-8, the "common" string functions (such as compress() or prxchange()) will not work. You must use MBCS-aware functions, such as kcompress(). Not all single-byte functions have a multi-byte equivalent.
I found this note in the docs of prxmatch:
This function is assigned an I18N Level 0 status, and is designed for SBCS data. Do not use this function to process DBCS or MBCS data.
compress has the same problem. Fortunately kcompress exists, so maybe this is what you need:
data _null_;
v1 = "aYT;-/*A - تwsilيتيتddيت يlămâieD @يlămâieD D^)M";
v2 = upcase(v1);
v3 = kcompress(v2, , 'kw');
put (_all_) (=/);
run;
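If you need an explicit whitelist (A-Z, the hyphen, and Arabic), another option is to walk the string one character at a time with the MBCS-aware klength() and ksubstr() functions. This is only a sketch and I have not tested it on your data. It assumes the session encoding is UTF-8 and it treats "Arabic" as the Unicode block U+0600 to U+06FF, which is 'D880'x to 'DBBF'x in UTF-8 bytes. The variable names ch, pos and n are just mine for illustration.

```sas
/* Sketch only: assumes a UTF-8 SAS session */
data _null_;
  length v1 v2 v3 $200 ch $4;
  v1 = "aYT;-/*A - تwsilيتيتddيت يlămâieD @يlămâieD D^)M";
  v2 = upcase(v1);
  v3 = ' ';
  pos = 1;                              /* next byte position to write in v3 */
  do i = 1 to klength(v2);              /* klength counts characters, not bytes */
    ch = ksubstr(v2, i, 1);             /* one (possibly multi-byte) character */
    /* keep A-Z, hyphen, and the Arabic block U+0600..U+06FF (assumed range) */
    if not ( ('A' <= ch <= 'Z')
             or ch = '-'
             or ('D880'x <= ch <= 'DBBF'x) ) then ch = ' ';
    n = length(ch);                     /* byte length; 1 for a blank */
    substr(v3, pos, n) = ch;            /* write the character at byte offset pos */
    pos = pos + n;
  end;
  put v1= / v2= / v3=;
run;
```

Everything outside the whitelist is replaced by a blank, the same way your prxchange call does it. Note that the Arabic block also contains Arabic digits and punctuation, so narrow the byte range if you want letters only.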