Suppose I have one character variable with length = 1 and I want to check whether its value is an alphabetic character.
Below is my code. Let me know if you have handier code.
data t2;
set t1;
if var1 in ("A","B","C","D", "E","F","G","H","I","J","K","L","M","N","O","P","Q","R","S","T","U","V","W","X","Y", "Z") then ind = 1;
else ind = 0;
run;
Do you only want to check for uppercase characters? Otherwise you could use:
data t2;
set t1;
ind=anyalpha(var1);
run;
If you really are only interested in uppercase characters, you could use:
data t2;
set t1;
ind=anyupper(var1);
run;
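To see how the two functions differ, here is a minimal sketch. The data set name demo and the sample values are made up for illustration. Both functions return the position of the first matching character (0 if there is none), so for a variable of length 1 the result is simply 1 or 0. For longer variables they only tell you that a letter exists somewhere, not that every character is a letter.

```sas
data demo;
  /* read exactly one character per data line */
  input var1 $1.;
  alpha_pos = anyalpha(var1);   /* 1 for any letter, 0 otherwise */
  upper_pos = anyupper(var1);   /* 1 for an uppercase letter only */
  datalines;
A
b
7
;
run;

proc print data=demo;
run;
```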
data t1;
input var1 $;
cards;
a
b
2
5
t
r
;
data t2;
set t1;
ind=ifn(lengthn(compress(var1,,'ka'))=1,1,0);
run;
proc print;run;
Another option is using regular expressions:
data t2;
set t1;
one_alpha_rx=prxparse("/[a-zA-Z]/");
ind=prxmatch(one_alpha_rx,var1);
drop one_alpha_rx;
run;
For the problem you describe, you'd be better off using Art and Linlin's solutions - they're simpler and probably faster. But if you have to check more complex patterns some time, it's worth learning about regexp matching (this webform won't let me copy and paste, but there's a good paper on this in the SUGI29 archives).
As an example of where regexp comes in handy, I had an application where ID variables were expected to be twelve digits followed by an alpha character and then four more digits. To check whether inputs fit this rule, I used:
legal_pattern=prxparse("/\d{12}[a-zA-Z]\d{4}/");
Note that regexp matching doesn't check whether the variable EXACTLY matches the pattern, only whether the pattern appears somewhere in the value. But this isn't a problem if the length of the pattern exactly matches the length of the variable.
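If the variable can be longer than the pattern, one way to force an exact match is to anchor the pattern with ^ and $. The sketch below is only an illustration: the data set id_check, the variables loose and strict, and the sample IDs are all hypothetical. Because SAS pads character values with trailing blanks, the anchored version allows optional whitespace before the end anchor.

```sas
data id_check;
  /* ID is deliberately longer than the 17-character pattern */
  length id $20;
  input id $;
  /* unanchored: true if the pattern occurs anywhere in the value */
  loose  = prxmatch('/\d{12}[a-zA-Z]\d{4}/', id) > 0;
  /* anchored: the whole value, apart from trailing blanks, must fit */
  strict = prxmatch('/^\d{12}[a-zA-Z]\d{4}\s*$/', id) > 0;
  datalines;
123456789012A3456
99123456789012A3456
123456789012A34567
;
run;

proc print data=id_check;
run;
```

Only the first ID satisfies the rule exactly. The other two contain extra digits, so they should pass the unanchored check but fail the anchored one.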
Hi,
A similar question was asked about specifying an alphabetic range:
https://communities.sas.com/thread/35517
HTH
Regards,
Amir.
Ind = AnyAlpha( Var1 ) ;
Or
Ind = PrxMatch( '/a-z/o' , Var1 ) ;
Either will do exactly what you want. AnyAlpha will be slightly faster but isn't as portable, given that one has to know the translation table used for a specific setup.
Toby, have you tested that PrxMatch code? When I use that one I'm getting zeroes where they shouldn't be.
Geoffry,
Oops... I guess I shouldn't try to do too many things at the same time. The pattern modifier should have been 'i' instead of 'o', and the range needs square brackets so that it is treated as a character class.
It should have been:
Ind = PrxMatch( '/[a-z]/i' , Var1 ) ;
This will match a-z and A-Z characters, as the 'i' pattern modifier tells the RegEx engine to match case-insensitively. There is no need for the 'o' pattern modifier since the pattern is an explicit constant and does not change, which means SAS by default will only compile it once. Since Var1 is supposed to have a length of only 1, the variable does not need to be trimmed of any leading or trailing spaces.
If one only wanted to check for lowercase letters then:
Ind = PrxMatch( '/[a-z]/' , Var1 ) ;
And for only uppercase:
Ind = PrxMatch( '/[A-Z]/' , Var1 ) ;
Sorry for the fat fingers and lack of proofreading on the last post.
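For anyone checking these patterns, the square brackets matter: a range such as a-z only acts as a character class inside [ ], and without them the engine searches for the literal three-character text "a-z". A quick side-by-side sketch (the data set rx_demo and the variables literal_hit and class_hit are names invented for this example):

```sas
data rx_demo;
  /* read exactly one character per data line */
  input var1 $1.;
  literal_hit = prxmatch('/a-z/i', var1);    /* searches for the text "a-z" */
  class_hit   = prxmatch('/[a-z]/i', var1);  /* searches for any single letter */
  datalines;
a
Q
5
;
run;

proc print data=rx_demo;
run;
```

With a one-character variable, literal_hit can never be anything but 0, which would explain zeroes appearing where a match was expected.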