data _null_;
v1 = "aYT;-/*A - تwsilيتيتddيت يlămâieD @يlămâieD D^)M";
v2=upcase(v1);
v3 = prxchange("s/[^-A-Z]/ /",-1,v2);
v4 = compress(v3, , 'kw');
put (_all_) (=/);
run;
I want to keep only English letters, the hyphen "-", and Arabic characters. This code works fine for the English letters and the hyphen, but it strips out the Arabic characters. How can I keep the Arabic characters too?
Did you try letter ranges like A-Z?
Or even adding all the letters to the character class [ ]?
Also, if the Arabic letters are encoded in UTF-8, the "common" string functions (such as compress() or prxchange()) will not work. You must use MBCS-aware functions, such as kcompress(). Not all single-byte functions have a multi-byte equivalent.
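To see why the single-byte functions struggle, it can help to compare byte and character counts. A minimal sketch, assuming the SAS session runs with a UTF-8 encoding (the sample string and the variable names s, n_bytes and n_chars are just for illustration):

```sas
data _null_;
   s = "يت";                  /* two Arabic letters */
   n_bytes = length(s);       /* length() counts bytes */
   n_chars = klength(s);      /* klength() counts characters */
   put n_bytes= n_chars=;
run;
```

In a UTF-8 session each of these letters takes more than one byte, so the two counts differ. That mismatch is what trips up byte-oriented functions.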
I found this note in the docs of prxmatch:
This function is assigned an I18N Level 0 status, and is designed for SBCS data. Do not use this function to process DBCS or MBCS data.
compress has the same limitation. Fortunately kcompress exists, so maybe this is what you need:
data _null_;
v1 = "aYT;-/*A - تwsilيتيتddيت يlămâieD @يlămâieD D^)M";
v2 = upcase(v1);
v3 = kcompress(v2, , 'kw');
put (_all_) (=/);
run;
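Note that the 'kw' modifiers keep every printable character, so punctuation and accented Latin letters would presumably survive too. If the goal is strictly A-Z, the hyphen and Arabic letters, one option is to walk the string character by character with klength() and ksubstr(). This is only a sketch, not tested against your data: it assumes a UTF-8 session, it relies on byte-wise comparison following code-point order in UTF-8, and the range ء to ي (U+0621 to U+064A) is my assumption of what counts as an Arabic letter. The names out, c, ar_lo and ar_hi are made up for the example.

```sas
data _null_;
   length v1 v2 out $200 c ar_lo ar_hi $4;
   v1 = "aYT;-/*A - تwsilيتيتddيت يlămâieD @يlămâieD D^)M";
   v2 = upcase(v1);
   ar_lo = 'ء';   /* U+0621, assumed start of the Arabic letter range */
   ar_hi = 'ي';   /* U+064A, assumed end of the range */
   out = '';
   do i = 1 to klength(v2);
      c = ksubstr(v2, i, 1);      /* one character, possibly several bytes */
      /* keep A-Z, the hyphen, and anything inside the assumed Arabic range */
      if ('A' <= c <= 'Z') or c = '-' or (ar_lo <= c <= ar_hi) then
         out = cats(out, c);      /* append the kept character */
   end;
   put out=;
run;
```

Unlike the original prxchange() step, this drops the unwanted characters outright instead of replacing them with blanks, so adjust the loop if you need the spacing preserved.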