Hi,
I am doing some validity tests using pattern matching.
The issue is that the valid values I want not to be flagged can be either a 4-digit string or a 5-digit string.
This is what I set up:
I'm not getting any bad messages, but I'm seeing values of 4-digits flagged as not valid so obviously that part of the pattern matching is not working correctly.
If that is the complete list of possible values just use the LENGTH() function to find the 4 and 5 digit strings.
valid = length(string) in (4:5);
If you also have values like 'FRED' then perhaps include a VERIFY() function call also.
valid = (not verify(trim(string),'0123456789')) and length(string) in (4:5) ;
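A minimal sketch of how that check behaves on a few test values. The data set name check, the helper variable all_digits, and the sample values are made up for illustration; only 1234 and 12345 should come out with valid=1.

```sas
data check;
  input string $char20.;
  /* VERIFY returns 0 when every character of the trimmed value is a digit */
  all_digits = not verify(trim(string), '0123456789');
  /* LENGTH ignores trailing blanks, so padding does not affect the count */
  valid = all_digits and length(string) in (4:5);
cards;
123
1234
12345
123456
FRED
12 45
;
```

The embedded blank in '12 45' makes VERIFY return a nonzero position, so that value is rejected even though its length is 5.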
Where are the example values? Where are the examples of what is supposed to be found as a pattern?
"The issue is that the valid values I want not to be flagged can be either a 4-digit string or a 5-digit string." has no details of what the pattern you are looking for actually is.
@Walternate wrote:
Hi,
I am doing some validity tests using pattern matching.
The issue is that the valid values I want not to be flagged can be either a 4-digit string or a 5-digit string.
This is what I set up:
if _n_ = 1 then do;
  pattern_id3n=prxparse("/\d{5}/i");
  pattern_id3nb=prxparse("/\d{4}/i");
end;
retain pattern:;
if (prxmatch(pattern_id3n, id3n) ne 1) or (prxmatch(pattern_id3nb, id3n) ne 1) then do;
  bad_id3n=id3n;
  bad_id_flag = 1;
end;
I'm not getting any bad messages, but I'm seeing values of 4-digits flagged as not valid so obviously that part of the pattern matching is not working correctly.
Valid:
1234
12345
Invalid:
1
12
123
123456
1234567
etc.
So anything 4-digits or 5-digits is valid but nothing else is.
Why not say that in the regular expression?
Use ^ to indicate it has to start at the beginning. Add space and *$ at the end to indicate it can only be followed by the spaces used to pad the string to the full variable length.
data test;
if _n_ = 1 then do;
pattern_id3n=prxparse("/^\d{5} *$/");
pattern_id3nb=prxparse("/^\d{4} *$/");
end;
retain pattern: ;
drop pattern: ;
input string $char20. ;
test5=prxmatch(pattern_id3n,string);
test4=prxmatch(pattern_id3nb,string);
cards;
123
1234
12345
123456
12 456
xxx 12345 yy
;
Result
Obs  string        test5  test4
1    123           0      0
2    1234          0      1
3    12345         1      0
4    123456        0      0
5    12 456        0      0
6    xxx 12345 yy  0      0
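The two anchored patterns could also be folded into one with the {4,5} quantifier, which makes the flagging logic a single test. This is a sketch, assuming the variable is named id3n as in the original question; the name pattern_id is made up here. It also avoids the problem in the original condition, where a 4-digit value can never match \d{5}, so the OR of the two "ne 1" tests is always true for it.

```sas
data test;
  /* Compile the pattern once and keep the id across iterations */
  if _n_ = 1 then pattern_id = prxparse("/^\d{4,5} *$/");
  retain pattern_id;
  drop pattern_id;
  input id3n $char20.;
  /* PRXMATCH returns 0 when the pattern is not found */
  if prxmatch(pattern_id, id3n) = 0 then do;
    bad_id3n = id3n;
    bad_id_flag = 1;
  end;
cards;
123
1234
12345
123456
12 456
xxx 12345 yy
;
```

With these sample rows only 1234 and 12345 should be left unflagged; bad_id_flag stays missing for them.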