Hi,
I am doing some validity tests using pattern matching.
The issue is that the valid values, which I do not want flagged, can be either a 4-digit string or a 5-digit string.
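This is what I set up:
if _n_ = 1 then do;
  pattern_id3n = prxparse("/\d{5}/i");
  pattern_id3nb = prxparse("/\d{4}/i");
end;
retain pattern:;
if (prxmatch(pattern_id3n, id3n) ne 1) or (prxmatch(pattern_id3nb, id3n) ne 1) then do;
  bad_id3n = id3n;
  bad_id_flag = 1;
end;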
I'm not getting any error messages, but I'm seeing 4-digit values flagged as not valid, so that part of the pattern matching is evidently not working correctly.
Where are the example values? Where are the examples of what is supposed to be found by the pattern?
"The issue is that the valid values I want not to be flagged can be either a 4-digit string or a 5-digit string." This gives no details of what the pattern you are looking for actually is.
@Walternate wrote:
Hi,
I am doing some validity tests using pattern matching.
The issue is that the valid values, which I do not want flagged, can be either a 4-digit string or a 5-digit string.
This is what I set up:
if _n_ = 1 then do;
  pattern_id3n = prxparse("/\d{5}/i");
  pattern_id3nb = prxparse("/\d{4}/i");
end;
retain pattern:;
if (prxmatch(pattern_id3n, id3n) ne 1) or (prxmatch(pattern_id3nb, id3n) ne 1) then do;
  bad_id3n = id3n;
  bad_id_flag = 1;
end;
I'm not getting any error messages, but I'm seeing 4-digit values flagged as not valid, so that part of the pattern matching is evidently not working correctly.
Valid:
1234
12345
Invalid:
1
12
123
123456
1234567
etc.
So anything 4-digits or 5-digits is valid but nothing else is.
If that is the complete list of possible values just use the LENGTH() function to find the 4 and 5 digit strings.
valid = length(string) in (4:5);
If you also have values like 'FRED' then perhaps include a VERIFY() function call also.
valid = (not verify(trim(string),'0123456789')) and length(string) in (4:5) ;
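For anyone who wants to try the two expressions above, here is a minimal sketch of a test step. The dataset name CHECK, the helper variable ALL_DIGITS, and the sample rows 'FRED' and '12 45' are made up for illustration; STRING is read with $CHAR20. only as an assumption about how the real variable is stored.

```sas
data check;
  input string $char20.;
  /* VERIFY returns 0 when every character of the trimmed value is in the digit list */
  all_digits = (verify(trim(string), '0123456789') = 0);
  /* valid only when the value is all digits AND has 4 or 5 characters */
  valid = all_digits and length(string) in (4:5);
datalines;
1234
12345
123
123456
FRED
12 45
;

proc print data=check;
run;
```

With these sample rows, only 1234 and 12345 should come out with VALID=1. The last row shows why the VERIFY() call matters: '12 45' has a length of 5, so the LENGTH() test alone would accept it.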
Why not say that in the regular expression?
Use ^ to indicate it has to start at the beginning. Add space and *$ at the end to indicate it can only be followed by the spaces used to pad the string to the full variable length.
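data test;
  /* Compile both patterns once, on the first iteration */
  if _n_ = 1 then do;
    pattern_id3n = prxparse("/^\d{5} *$/");
    pattern_id3nb = prxparse("/^\d{4} *$/");
  end;
  retain pattern: ;
  drop pattern: ;
  input string $char20. ;
  /* PRXMATCH returns the position of the match, or 0 if there is none */
  test5 = prxmatch(pattern_id3n, string);
  test4 = prxmatch(pattern_id3nb, string);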
cards;
123
1234
12345
123456
12 456
xxx 12345 yy
;
Result
Obs    string          test5    test4
1      123               0        0
2      1234              0        1
3      12345             1        0
4      123456            0        0
5      12 456            0        0
6      xxx 12345 yy      0        0
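The two anchored patterns can probably be folded into one by using the quantifier {4,5}, and the result can drive the flag from the original question directly. The sketch below is only one way to do it: the dataset name FLAGGED and the variable PATTERN_ID are invented here, while ID3N, BAD_ID3N and BAD_ID_FLAG are taken from the question. It also avoids the OR test in the original code, which could only pass a value when both unanchored patterns matched at position 1, something a 4-digit string can never do for \d{5}.

```sas
data flagged;
  /* One pattern: 4 or 5 digits from the start, then only padding blanks */
  if _n_ = 1 then pattern_id = prxparse("/^\d{4,5} *$/");
  retain pattern_id;
  drop pattern_id;
  input id3n $char20.;
  /* PRXMATCH returns 0 when the value does not match, so 0 means "bad" */
  bad_id_flag = (prxmatch(pattern_id, id3n) = 0);
  if bad_id_flag then bad_id3n = id3n;
datalines;
123
1234
12345
123456
12 456
xxx 12345 yy
;

proc print data=flagged;
run;
```

On the same six test rows as above, this should leave BAD_ID_FLAG=0 for 1234 and 12345 only and copy every other value into BAD_ID3N.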