data _null_;
v1 = "aYT;-/*A - تwsilيتيتddيت يlămâieD @يlămâieD D^)M";
v2=upcase(v1);
v3 = prxchange("s/[^-A-Z]/ /",-1,v2);
v4 = compress(v3, , 'kw');
put (_all_) (=/);
run;
I want to keep only English letters, the hyphen "-", and Arabic characters. This code works for the English letters and the hyphen, but it also strips out the Arabic characters. How can I keep the Arabic characters too?
Moved and re-titled.
Did you try letter intervals like A-Z?
Or even adding all the letters you want to keep to the character class [ ]?
Also, if the Arabic letters are encoded in UTF-8, each of them takes more than one byte, so the "common" string functions (such as compress() or prxchange()), which work byte by byte, will not handle them correctly. You must use MBCS-aware functions, such as kcompress(). Not all single-byte functions have a multi-byte equivalent.
I found this note in the docs of prxmatch:
This function is assigned an I18N Level 0 status, and is designed for SBCS data. Do not use this function to process DBCS or MBCS data.
compress has the same problem. Fortunately kcompress exists, so maybe this is what you need:
data _null_;
v1 = "aYT;-/*A - تwsilيتيتddيت يlămâieD @يlămâieD D^)M";
v2 = upcase(v1);
v3 = kcompress(v2, , 'kw');
put (_all_) (=/);
run;
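If kcompress alone does not give the "English letters, hyphen, and Arabic only" result, another option is to walk the string one character at a time with the MBCS-aware klength() and ksubstr() functions and decide per character. The following is an untested sketch, not a definitive solution. It assumes the session encoding is UTF-8 and that "Arabic" means the basic Arabic block U+0600 to U+06FF, whose characters are two bytes long in UTF-8 with a first byte between D8 and DB hex. The variable names ch, n, lead and pos are only illustrative.

```sas
/* Sketch only: assumes a UTF-8 session and the basic Arabic block U+0600-U+06FF */
data _null_;
  length v1 v2 v3 $200 ch $4;
  v1 = "aYT;-/*A - تwsilيتيتddيت يlămâieD @يlămâieD D^)M";
  v2 = upcase(v1);
  v3 = ' ';
  pos = 1;                            /* next byte position to write in v3 */
  do i = 1 to klength(v2);            /* klength counts characters, not bytes */
    ch   = ksubstr(v2, i, 1);         /* i-th character (1 to 4 bytes in UTF-8) */
    n    = length(ch);                /* byte length of this character */
    lead = rank(substr(ch, 1, 1));    /* numeric value of its first byte */
    if n = 1 and (('A' <= ch <= 'Z') or ch = '-') then do;
      substr(v3, pos, 1) = ch;        /* keep A-Z and the hyphen */
      pos = pos + 1;
    end;
    else if n = 2 and 216 <= lead <= 219 then do;
      substr(v3, pos, 2) = ch;        /* keep Arabic: lead byte D8-DB hex */
      pos = pos + 2;
    end;
    else pos = pos + 1;               /* everything else becomes one blank */
  end;
  put v1= / v2= / v3=;
run;
```

As in the original prxchange() call, every unwanted character is replaced by a blank rather than deleted. If Arabic presentation forms or other Unicode ranges are needed, the byte test would have to be extended.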