SASPhile
Quartz | Level 8

How to set a flag if a string has non-English characters?

PeterClemmensen
Tourmaline | Level 20

Is a non-English character any character not in a-z?

SASPhile
Quartz | Level 8

Yes. It has some Chinese characters.

PeterClemmensen
Tourmaline | Level 20

Do something like this

 

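data have;
length string $20; /* without this, the first assignment fixes the length at 3 and truncates the multibyte value */
string="abc";output;
string="ab人物";output;
string="xyz";output;
run;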

data want;
   set have;
   flag=ifn(notalpha(string),1,0);
run;
SASPhile
Quartz | Level 8

flag is set to 1 for all values
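
A likely cause, assuming the real variable is longer than the values it holds: NOTALPHA treats the trailing blanks that pad a character variable as non-alphabetic, so every padded value is flagged. A minimal sketch that removes the padding first (it reuses the `have` data set from above; results may still depend on the session encoding and locale):

```sas
data want;
   set have;
   /* STRIP removes leading and trailing blanks before the search, */
   /* so padding alone no longer sets the flag.                    */
   flag = notalpha(strip(string)) > 0;
run;
```

Embedded blanks, digits, and punctuation would still set the flag with this approach.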

PeterClemmensen
Tourmaline | Level 20

Ok. Does this work for you?

 

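data have;
length string $20; /* without this, the first assignment fixes the length at 3 and truncates the multibyte value */
string="abc";output;
string="ab人物";output;
string="xyz";output;
run;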

data want;
   set have;
   flag=ifn(lengthn(compress(string, "abcdefghijklmnopqrstuvwxyz", "i")),1,0);
run;

 EDIT: I added an IFN Function to the code.

SASPhile
Quartz | Level 8

I cannot extend it to below, it fails:

 

 flag=lengthn(catx("-",string1, string2),"abcdefghijklmnopqrstuvwxyz", "i");
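
The statement above appears to fail because the COMPRESS call is missing, so LENGTHN receives three arguments. One possible form is sketched below; the `have2` and `want2` data set names and the sample values are hypothetical, while `string1` and `string2` are the names from the statement above. The hyphen is added to the list of characters to remove so that the delimiter itself does not set the flag.

```sas
/* Hypothetical sample data for illustration only. */
data have2;
   length string1 string2 $20;
   string1="abc";    string2="xyz"; output;
   string1="ab人物"; string2="xyz"; output;
run;

data want2;
   set have2;
   /* COMPRESS removes the hyphen and all letters ("i" ignores case). */
   /* Anything left over is not an English letter, so LENGTHN > 0.    */
   flag = lengthn(compress(catx("-", string1, string2),
                           "-abcdefghijklmnopqrstuvwxyz", "i")) > 0;
run;
```

Digits, punctuation, and embedded blanks would also set the flag here; add them to the second argument of COMPRESS if they should be allowed.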

novinosrin
Tourmaline | Level 20

@PeterClemmensen, your code can be tweaked to:

 

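data have;
length string $20; /* without this, the first assignment fixes the length at 3 and truncates the multibyte value */
string="abc";output;
string="ab人物";output;
string="xyz";output;
run;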

data want;
   set have;
   flag=lengthn(compress(string, " ", "ai"))>0;
/*   flag=ifn(lengthn(compress(string, "abcdefghijklmnopqrstuvwxyz", "i")),1,0);*/
run;
PeterClemmensen
Tourmaline | Level 20

This solution, however, depends on your LOCALE= system option.

 

Let me know if it does not meet your needs.

ybolduc
Quartz | Level 8

I think a regular expression would be the easiest way to do it. Here is how I would do this:

 

data want;
  length mytext $100.;
  input mytext $;
  flag = ifn(prxmatch('/[^a-zA-Z0-9 ]/', mytext) > 0, 1, 0);
datalines;
Hello
Sébastien
;
run;

Patrick
Opal | Level 21

@ybolduc

You need to be careful about which string functions you use as soon as you are dealing with multibyte characters.

Unfortunately, the PRX...() functions only work properly with single-byte characters.

http://support.sas.com/documentation/cdl//en/nlsref/69741/HTML/default/viewer.htm#p1pca7vwjjwucin178...

PGStats
Opal | Level 21

Try this if you have National Language Support

 

data want;
   set have;
   flag= string ne basechar(string);
run;

Try it on a real sample of your data, not datalines text.

 

PG
ChrisNZ
Tourmaline | Level 20

This

 

if length(CHAR) ne klength(CHAR);

will detect any character that doesn't use single-byte encoding.
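
As a sketch of how that test could set a flag instead of subsetting, assuming a UTF-8 session and the `have` data set from earlier in the thread:

```sas
data want;
   set have;
   /* LENGTH counts bytes and KLENGTH counts characters (both ignore   */
   /* trailing blanks), so they differ when a character is multibyte.  */
   flag = (length(string) ne klength(string));
run;
```

Under UTF-8 this would also flag accented Latin letters such as é, which take two bytes, but not digits or ASCII punctuation.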
