Suppose I have a character variable of length 1 and I want to check whether its value is an alphabetic character.
My code is below. Let me know if you have a handier way.
data t2;
set t1;
if var1 in ("A","B","C","D", "E","F","G","H","I","J","K","L","M","N","O","P","Q","R","S","T","U","V","W","X","Y", "Z") then ind = 1;
else ind = 0;
run;
Do you only want to check for uppercase characters? Otherwise you could use:
data t2;
set t1;
ind=anyalpha(var1);
run;
If you really are only interested in uppercase characters, you could use:
data t2;
set t1;
ind=anyupper(var1);
run;
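As a rough illustration of the two functions above, here is a minimal sketch. The dataset name demo and the flag names are made up for the example. Note that ANYALPHA and ANYUPPER return the position of the first matching character rather than a plain 0/1 flag, so for variables longer than one character you would probably want to compare the result with 0, as done here. The results can also depend on the session's translation table and locale.

```sas
data demo;
  length var1 $1;
  input var1 $;
  /* Each function returns a position (0 if nothing is found), */
  /* so the comparison turns the result into a 0/1 flag.       */
  is_alpha = (anyalpha(var1) > 0);  /* any letter, either case */
  is_upper = (anyupper(var1) > 0);  /* uppercase letters only  */
  is_lower = (anylower(var1) > 0);  /* lowercase letters only  */
  cards;
A
b
7
;
run;

proc print data=demo;
run;
```

In a typical English-locale session, the row with 7 should get 0 in all three flags, while A and b should differ only in the upper/lower flags.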
data t1;
input var1 $;
cards;
a
b
2
5
t
r
;
data t2;
set t1;
ind=ifn(lengthn(compress(var1,,'ka'))=1,1,0);
run;
proc print;run;
Another option is using regular expressions:
data t2;
set t1;
one_alpha_rx=prxparse("/[a-zA-Z]/");
ind=prxmatch(one_alpha_rx,var1);
drop one_alpha_rx;
run;
For the problem you describe, you'd be better off using Art and Linlin's solutions - they're simpler and probably faster. But if you have to check more complex patterns some time, it's worth learning about regexp matching (this webform won't let me copy and paste, but there's a good paper on this in the SUGI29 archives).
As an example of where regexp comes in handy, I had an application where ID variables were expected to be twelve digits followed by an alpha character and then four more digits. To check whether inputs fit this rule, I used:
legal_pattern=prxparse("/\d{12}[a-zA-Z]\d{4}/");
Note that regexp matching doesn't check whether the variable EXACTLY matches the pattern defined, only whether the pattern appears somewhere in the value. But this isn't a problem if the length of the text the regexp matches is exactly the length of the variable.
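If the variable could be longer than the pattern, one way to force an exact match is to anchor the pattern with ^ and $. The sketch below is only an illustration under that assumption: the dataset name check_ids, the variable id, and the sample values are invented. The \s* before $ is there because SAS pads character values with trailing blanks.

```sas
data check_ids;
  length id $20;
  input id $;
  /* Anchored: the whole value (ignoring trailing blanks) must fit the rule */
  exact_match = (prxmatch('/^\d{12}[a-zA-Z]\d{4}\s*$/', id) > 0);
  /* Unanchored: the rule only has to appear somewhere in the value */
  loose_match = (prxmatch('/\d{12}[a-zA-Z]\d{4}/', id) > 0);
  cards;
123456789012A3456
9123456789012A3456
;
run;

proc print data=check_ids;
run;
```

The second value has an extra leading digit, so the unanchored pattern still finds a match inside it while the anchored one rejects it.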
Hi,
A similar question was asked about specifying an alphabetic range:
https://communities.sas.com/thread/35517
HTH
Regards,
Amir.
Ind = AnyAlpha( Var1 ) ;
Or
Ind = PrxMatch( '/a-z/o' , Var1 ) ;
Either will do exactly what you want. AnyAlpha will be slightly faster but is less portable, since its result depends on the translation table in effect for a specific setup.
Toby, have you tested that PrxMatch code? When I use that one I'm getting zeroes where they shouldn't be.
Geoffry,
Oops... I guess I shouldn't try to do too many things at the same time. The pattern modifier should have been 'i' instead of 'o', and the range needs square brackets to be treated as a character class.
It should have been:
Ind = PrxMatch( '/[a-z]/i' , Var1 ) ;
This will match a-z and A-Z characters, as the 'i' pattern modifier tells the RegEx engine to ignore case. There is no need for the 'o' pattern modifier since the pattern is a constant and does not change, which means SAS by default will only compile it once. Since Var1 is supposed to have a length of 1, the variable does not need to be trimmed of any leading or trailing spaces.
If one only wanted to check for lowercase letters then:
Ind = PrxMatch( '/[a-z]/' , Var1 ) ;
And for only uppercase:
Ind = PrxMatch( '/[A-Z]/' , Var1 ) ;
Sorry for the fat fingers and lack of proofreading on the last post.
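For anyone wanting to test a pattern like these before using it, a quick check against a few known values can help. This is just a sketch with made-up variable names (literal_hit, class_hit). It compares the range written without square brackets, which the regex engine reads as the literal text "a-z", with the bracketed character class.

```sas
data _null_;
  length var1 $1;
  do var1 = 'a', 'Q', '5';
    /* Without brackets the pattern is the three-character text "a-z" */
    literal_hit = prxmatch('/a-z/i', var1);
    /* With brackets the pattern is any single letter, ignoring case */
    class_hit = prxmatch('/[a-z]/i', var1);
    put var1= literal_hit= class_hit=;
  end;
run;
```

Since Var1 holds a single character, the unbracketed pattern can never match it, which would explain getting zeroes for every value. The bracketed version should return 1 for the two letters and 0 for the digit.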