SASPhile
Quartz | Level 8

How do I set a flag if a string contains non-English characters?

12 REPLIES
PeterClemmensen
Tourmaline | Level 20

By a non-English character, do you mean any character not in a-z?

SASPhile
Quartz | Level 8

Yes. It has some Chinese characters.

PeterClemmensen
Tourmaline | Level 20

Do something like this:

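data have;
length string $20; /* otherwise the first assignment fixes the length at 3 bytes and truncates "ab人物" */
string="abc";output;
string="ab人物";output;
string="xyz";output;
run;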

data want;
   set have;
   flag=ifn(notalpha(string),1,0);
run;
SASPhile
Quartz | Level 8

flag is set to 1 for all values

PeterClemmensen
Tourmaline | Level 20

Ok. Does this work for you?

 

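data have;
length string $20; /* otherwise the first assignment fixes the length at 3 bytes and truncates "ab人物" */
string="abc";output;
string="ab人物";output;
string="xyz";output;
run;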

data want;
   set have;
   flag=ifn(lengthn(compress(string, "abcdefghijklmnopqrstuvwxyz", "i")),1,0);
run;

EDIT: I added an IFN function to the code.

SASPhile
Quartz | Level 8

I cannot extend it to the two-variable case below; it fails:

 

 flag=lengthn(catx("-",string1, string2),"abcdefghijklmnopqrstuvwxyz", "i");
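
The statement above fails because LENGTHN accepts a single argument; the COMPRESS call from the earlier reply was dropped. A minimal sketch of one way to extend the check to two variables follows. The data set names have2 and want2 and the sample values are hypothetical. Because CATX inserts the "-" delimiter, the hyphen has to be added to the list of characters that COMPRESS removes, otherwise every row would be flagged.

```sas
/* Hypothetical sample data: have2, string1 and string2 are illustrative names */
data have2;
   length string1 string2 $20;
   string1="abc"; string2="xyz";    output;
   string1="abc"; string2="ab人物"; output;
run;

data want2;
   set have2;
   /* Join the two values, then remove a-z (case-insensitive) and the "-" delimiter. */
   /* Any bytes left over belong to characters other than English letters.          */
   flag = lengthn(compress(catx("-", string1, string2), "-abcdefghijklmnopqrstuvwxyz", "i")) > 0;
run;
```

Note that digits, punctuation, and embedded blanks inside a value would also set the flag with this approach; add them to the second argument if they should be allowed.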

novinosrin
Tourmaline | Level 20

@PeterClemmensen, your code can be tweaked to:

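data have;
length string $20; /* otherwise the first assignment fixes the length at 3 bytes and truncates "ab人物" */
string="abc";output;
string="ab人物";output;
string="xyz";output;
run;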

data want;
   set have;
   flag=lengthn(compress(string, " ", "ai"))>0;
/*   flag=ifn(lengthn(compress(string, "abcdefghijklmnopqrstuvwxyz", "i")),1,0);*/
run;
PeterClemmensen
Tourmaline | Level 20

This solution, however, relies on your LOCALE= system option.

 

Let me know if it does not meet your needs.

ybolduc
Quartz | Level 8

I think a regular expression would be the easiest way to do it. Here is how I would do this:

 

data want;
  length mytext $100.;
  input mytext $;
  flag = ifn(prxmatch('/[^a-zA-Z0-9 ]/', mytext) > 0, 1, 0);
datalines;
Hello
Sébastien
;
run;

Patrick
Opal | Level 21

@ybolduc

You need to be careful about which string functions you use as soon as you are dealing with multibyte characters.

The PRX...() functions are unfortunately only reliable for single-byte data.

http://support.sas.com/documentation/cdl//en/nlsref/69741/HTML/default/viewer.htm#p1pca7vwjjwucin178...
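
If the goal is only to detect that some byte lies outside 7-bit ASCII, a byte-oriented function can still be enough. The following is a minimal sketch, assuming a UTF-8 session (where every non-ASCII character is made up solely of bytes above hex 7F) and the have data set from the earlier replies. It flags accented Latin letters as well as Chinese characters.

```sas
data want;
   set have;
   /* Byte-level test: PRXMATCH returns the position of the first byte */
   /* outside hex 00-7F, or 0 when every byte is plain ASCII.          */
   flag = prxmatch('/[^\x00-\x7F]/', string) > 0;
run;
```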

PGStats
Opal | Level 21

Try this if you have National Language Support:

 

data want;
   set have;
   flag= string ne basechar(string);
run;

Try it on a real sample of your data, not datalines text.

 

PG
ChrisNZ
Tourmaline | Level 20

This

 

if length(CHAR) ne klength(CHAR);

will detect any character that doesn't use single-byte encoding.
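
As a sketch of how that comparison could be turned into a flag rather than a subsetting IF, assuming a session with a multibyte encoding such as UTF-8 (the sample data repeats the values used earlier in the thread):

```sas
data have;
   length string $20;
   string="abc";    output;
   string="ab人物"; output;
   string="xyz";    output;
run;

data want;
   set have;
   /* LENGTH counts bytes and KLENGTH counts characters,         */
   /* so the two differ whenever the value holds a multibyte one. */
   flag = (length(string) ne klength(string));
run;
```

In a single-byte session such as Latin1, accented letters occupy one byte each and would not be flagged by this test; in UTF-8 they occupy two bytes and would be.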
