SAS Programming

kurofufu · Posted 08-08-2012 01:29 PM

Suppose I have one character variable with length = 1 and I want to check whether the value is an alphabetic character.

Below is my code. Let me know if you have handier code.

data t2;

set t1;

if var1 in ("A","B","C","D", "E","F","G","H","I","J","K","L","M","N","O","P","Q","R","S","T","U","V","W","X","Y", "Z") then ind = 1;

else ind = 0;

run;

art297 · Posted 08-08-2012 01:41 PM

Do you only want to check for uppercase characters? Otherwise you could use:

data t2;

set t1;

ind=anyalpha(var1);

run;

If you really are only interested in uppercase characters, you could use:

data t2;

set t1;

ind=anyupper(var1);

run;

Linlin · Posted 08-08-2012 01:42 PM

data t1;
input var1 $;
cards;
a
b
2
5
t
r
;
data t2;

set t1;

ind=ifn(lengthn(compress(var1,,'ka'))=1,1,0);

run;
proc print;run;

GeoffreyBrent · Posted 08-08-2012 08:55 PM

Another option is using regular expressions:

data t2;

set t1;

one_alpha_rx=prxparse("/[a-zA-Z]/");

ind=prxmatch(one_alpha_rx,var1);

drop one_alpha_rx;

run;

For the problem you describe, you'd be better off using Art and Linlin's solutions - they're simpler and probably faster. But if you have to check more complex patterns some time, it's worth learning about regexp matching (this webform won't let me copy and paste, but there's a good paper on this in the SUGI29 archives).

As an example of where regexp comes in handy, I had an application where ID variables were expected to be twelve digits followed by an alpha character and then four more digits. To check whether inputs fit this rule, I used:

legal_pattern=prxparse("/\d{12}[a-zA-Z]\d{4}/");

Note that regexp matching doesn't check whether the variable EXACTLY matches the pattern, only whether the pattern appears somewhere in the value. But this isn't a problem if the length of the text the regexp matches is exactly the length of the variable.
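
To illustrate the point about partial matches, here is a minimal sketch that anchors the ID pattern with ^ and $ so the whole value has to fit. The data set name id_check, the variables id, loose and strict, and the sample values are made up for illustration; the \s* before $ is there on the assumption that the variable is blank-padded to its declared length.

```sas
data id_check;
    length id $20;
    input id $;
    /* Unanchored: true if the pattern occurs anywhere in the value */
    loose  = prxmatch("/\d{12}[a-zA-Z]\d{4}/", id) > 0;
    /* Anchored: ^ and $ force the whole value to fit the pattern; */
    /* \s* allows for the trailing blanks SAS pads the value with  */
    strict = prxmatch("/^\d{12}[a-zA-Z]\d{4}\s*$/", id) > 0;
    datalines;
123456789012A3456
9123456789012A3456
;
run;

proc print data=id_check;
run;
```

The second value has an extra leading digit, so it should pass the unanchored check but fail the anchored one.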

Amir · Posted 08-09-2012 08:24 AM

Hi,

A similar question was asked about specifying an alphabetic range:

https://communities.sas.com/thread/35517

HTH

Regards,

Amir.

TobyDunn_hotmail_com · Posted 08-09-2012 09:28 AM

kurofufu,

Ind = AnyAlpha( Var1 ) ;

Or

Ind = PrxMatch( '/[a-z]/o' , Var1 ) ;

Either will do exactly what you want. AnyAlpha will be slightly faster but isn't as portable, since one has to know the translation table used for a specific setup.

GeoffreyBrent · Posted 08-09-2012 09:50 PM

Toby, have you tested that PrxMatch code? When I use that one I'm getting zeroes where they shouldn't be.

TobyDunn_hotmail_com · Posted 08-10-2012 10:51 AM

Geoffrey,

Oops... I guess I shouldn't try to do too many things at the same time. The pattern modifier should have been an 'i' instead of an 'o'.

It should have been:

Ind = PrxMatch( '/[a-z]/i' , Var1 ) ;

This will match a-z and A-Z characters, as the 'i' pattern modifier tells the RegEx engine to match case-insensitively. There is no need for the 'o' pattern modifier since the pattern is an explicit constant and does not change, which means SAS will by default compile it only once. Since Var1 is supposed to have a length of only 1, the variable does not need to be trimmed of any leading or trailing spaces.

If one only wanted to check for lower case letters then:

Ind = PrxMatch( '/[a-z]/' , Var1 ) ;

And for only upper case:

Ind = PrxMatch( '/[A-Z]/' , Var1 ) ;

Sorry for the fat-fingered typing and lack of proofreading on the last post.
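
For anyone who wants to see the modifiers side by side, here is a small sketch. The data set name case_demo, the flag variable names, and the three sample values are invented for illustration. Note that the square brackets are what make a-z a character class; without them the pattern looks for the literal text a-z, which a one-character variable can never contain.

```sas
data case_demo;
    input var1 $1.;
    /* [a-z] with the i modifier: letters of either case */
    any_case   = prxmatch('/[a-z]/i', var1) > 0;
    /* No modifier: the class is case-sensitive */
    lower_only = prxmatch('/[a-z]/', var1) > 0;
    upper_only = prxmatch('/[A-Z]/', var1) > 0;
    /* No brackets: searches for the three-character text "a-z" */
    literal    = prxmatch('/a-z/i', var1) > 0;
    datalines;
a
B
7
;
run;

proc print data=case_demo;
run;
```

Each flag is 1 when PrxMatch returns a nonzero position and 0 otherwise, so the columns can be compared row by row in the printed output.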

