SASPhile
Quartz | Level 8

How to set a flag if a string has non-English characters?

PeterClemmensen
Tourmaline | Level 20

Is a non-English character any character not in a-z?

SASPhile
Quartz | Level 8

Yes. It has some Chinese characters.

PeterClemmensen
Tourmaline | Level 20

Do something like this

 

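data have;
length string $20; /* without this, the first assignment fixes the length at 3 bytes and truncates the multibyte value */
string="abc";output;
string="ab人物";output;
string="xyz";output;
run;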

data want;
   set have;
   flag=ifn(notalpha(string),1,0);
run;
SASPhile
Quartz | Level 8

flag is set to 1 for all values
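
A likely cause, assuming the real variable is longer than its values: SAS pads character values with trailing blanks, and a blank counts as a non-alphabetic character, so NOTALPHA finds one in every row. A minimal sketch that trims the padding first, reusing the `have` data set and `string` variable from the post above:

```sas
data want;
   set have;
   /* TRIM drops trailing blanks so NOTALPHA does not stop on the padding */
   flag = notalpha(trim(string)) > 0;
run;
```

Embedded blanks, digits, and punctuation would still set the flag with this approach, and what counts as alphabetic depends on the session encoding and locale.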

PeterClemmensen
Tourmaline | Level 20

Ok. Does this work for you?

 

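data have;
length string $20; /* without this, the first assignment fixes the length at 3 bytes and truncates the multibyte value */
string="abc";output;
string="ab人物";output;
string="xyz";output;
run;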

data want;
   set have;
   flag=ifn(lengthn(compress(string, "abcdefghijklmnopqrstuvwxyz", "i")),1,0);
run;

 EDIT: I added an IFN Function to the code.

SASPhile
Quartz | Level 8

I cannot extend it to the following; it fails:

 

 flag=lengthn(catx("-",string1, string2),"abcdefghijklmnopqrstuvwxyz", "i");
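
The statement above appears to fail because the COMPRESS call was dropped, so LENGTHN receives three arguments instead of one. A possible repair is sketched below; `have2` and `want2` are hypothetical data set names, and CATS is swapped in for CATX so that no "-" delimiter is left behind to be counted as a non-letter:

```sas
data have2;   /* hypothetical sample data */
   length string1 string2 $20;
   string1="abc"; string2="xyz";    output;
   string1="abc"; string2="ab人物"; output;
run;

data want2;
   set have2;
   /* CATS joins the stripped values with no delimiter; COMPRESS then removes
      a-z in either case, and anything left over sets the flag */
   flag = lengthn(compress(cats(string1, string2), "abcdefghijklmnopqrstuvwxyz", "i")) > 0;
run;
```

If CATX with "-" is kept instead, the "-" would need to be added to the list of characters that COMPRESS removes; otherwise the flag is set whenever both strings are non-blank.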

novinosrin
Tourmaline | Level 20

@PeterClemmensen, your code can be tweaked to:

 

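data have;
length string $20; /* without this, the first assignment fixes the length at 3 bytes and truncates the multibyte value */
string="abc";output;
string="ab人物";output;
string="xyz";output;
run;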

data want;
   set have;
   flag=lengthn(compress(string, " ", "ai"))>0;
/*   flag=ifn(lengthn(compress(string, "abcdefghijklmnopqrstuvwxyz", "i")),1,0);*/
run;
PeterClemmensen
Tourmaline | Level 20

This solution, however, relies on your LOCALE= system option.

 

Let me know if it does not meet your needs.

ybolduc
Quartz | Level 8

I think a regular expression would be the easiest way to do it. Here is how I would do this:

 

data want;
  length mytext $100;
  input mytext $;
  flag = ifn(prxmatch('/[^a-zA-Z0-9 ]/', mytext) > 0, 1, 0);
datalines;
Hello
Sébastien
;
run;

Patrick
Opal | Level 21

@ybolduc

You need to be careful about which string functions you use as soon as you are dealing with multibyte characters.

Unfortunately, the PRX...() functions are only good for single-byte characters.

http://support.sas.com/documentation/cdl//en/nlsref/69741/HTML/default/viewer.htm#p1pca7vwjjwucin178...

PGStats
Opal | Level 21

Try this if you have National Language Support:

 

data want;
   set have;
   flag= string ne basechar(string);
run;

Try it on a real sample of your data, not datalines text.

 

PG
ChrisNZ
Tourmaline | Level 20

This

 

if length(CHAR) ne klength(CHAR);

will detect any character that doesn't use single-byte encoding.
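
If a 0/1 flag is wanted rather than a subsetting IF, the same comparison can be assigned directly. This is only a sketch: it assumes a UTF-8 session, where every non-ASCII character takes more than one byte, and it reuses the `have` data set and `string` variable from the earlier posts.

```sas
data want;
   set have;
   /* LENGTH counts bytes, KLENGTH counts characters; the two differ only
      when at least one character occupies more than one byte */
   flag = length(string) ne klength(string);
run;
```

In a single-byte session encoding such as WLATIN1, an accented letter like é is stored in one byte and would not be flagged this way.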
