Can prxpattern and prxmatch be used with XSD patterns

Regular Contributor
Posts: 155

I need to validate data values before generating an XML file. The XML file specifies patterns for a number of the fields, but they are not standard Perl regular expressions. XSD patterns are apparently a limited subset, and the syntax is slightly different.

If you plug an XSD pattern into prxmatch, it rejects the syntax.

I did find a reference that said you could take an XSD pattern, prefix it with a ^ and suffix it with a $ to convert it to a Perl regex. But prxparse rejects that as well. If you remove the $, it is rejected because it expects a closing ^. Adding a ^ as both the first and the last character results in a valid pattern (for the examples I have checked), but it seems to accept values that are not valid.

So does anybody have any advice on how to convert an XSD pattern to something that can be used in SAS?

TIA.


All Replies
Respected Advisor
Posts: 4,935

Re: Can prxpattern and prxmatch be used with XSD patterns

Give examples of XSD patterns.

PG
Regular Contributor
Posts: 155

Re: Can prxpattern and prxmatch be used with XSD patterns

Meant to include a representative set. Thanks for the reminder. Here is a subset from just one of the XSD files.


[A-Z\d\._'\-]+@[A-Z\d_'\-]+\.[A-Z\d\._'\-]+
[A-Z]{2}
[A-Z]{4}\d{6}[MH][A-Z]{5}[0-9]{2}
[A-ZÑ ]{1,200}
[A-ZÑ&]{3,4}\d{6}[A-Z0-9]{3}
[A-ZÑ&]{3}\d{6}[A-Z0-9]{3}
[A-ZÑ&]{4}\d{6}[A-Z0-9]{3}
[A-ZÑ0-9]{1,14}
[A-ZÑ\d #\-\.&,_@'()]{1,254}
[A-ZÑ\d \-\.,':/$]{1,3000}
[A-ZÑ\d \-\.,:/]{1,100}
[A-ZÑ\d \-_\.&,'#@]{1,200}
\d{1,14}\.\d{2}
\d{1,2}
\d{4}[0|1]\d{1}
\d{4}\-\d{1,9}
\d{5}

 

Found this site that discusses the differences. I tried its suggestion to prefix the pattern with a ^ and suffix it with a $, but that did not produce an expression that prxparse accepted.

I also found a few sites that decode a pattern into a description, from which I could presumably write a valid Perl regular expression. But given how many of these I have to create, I would prefer to avoid that approach if at all possible.

Solution
‎05-05-2016 05:33 PM
Respected Advisor
Posts: 4,935

Re: Can prxpattern and prxmatch be used with XSD patterns

Seems like all you need to add is a pair of delimiters:

\d{5}  -->  /\d{5}/  will match any substring of 5 digits

or

\d{5}  -->  /^\d{5}$/  will match only if the whole string is 5 digits
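
To make the difference between the two forms concrete, here is a minimal sketch using a DATA step. The PRX functions expect the Perl form with delimiters, which is probably why a bare XSD pattern, or one wrapped only in ^ and $, is rejected. The variable names (value, anywhere, whole) and the test values are illustrative only and do not come from the original post.

```sas
data _null_;
   length value $10;
   do value = '12345', '12345x', 'ab12345';
      /* Unanchored: true when a run of 5 digits occurs anywhere in the value */
      anywhere = prxmatch('/\d{5}/', strip(value)) > 0;
      /* Anchored: true only when the whole value is exactly 5 digits */
      whole = prxmatch('/^\d{5}$/', strip(value)) > 0;
      put value= anywhere= whole=;
   end;
run;
```

Only the anchored form is suitable for validating a field against an XSD pattern, since an XSD pattern always has to match the entire value.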

PG
Regular Contributor
Posts: 155

Re: Can prxpattern and prxmatch be used with XSD patterns

Very helpful. Thx. Let me first acknowledge that to say I am a novice on regex patterns would give me too much credit.

So, am I correct in assuming that prefixing with /^ and suffixing with $/ will check for an exact match? For example, 12345x will fail because it is not an exact match?

Respected Advisor
Posts: 4,935

Re: Can prxpattern and prxmatch be used with XSD patterns

Right! "12345   " will not match either because of the trailing spaces, but trim("12345   ") will match. 
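
A small sketch of the trailing-blank issue, assuming a fixed-length character variable (the name zip and its length are made up for illustration). SAS pads character variables with blanks to their declared length, so the anchored pattern sees those blanks unless they are removed first.

```sas
data _null_;
   length zip $8;     /* fixed length, so '12345' is stored with 3 trailing blanks */
   zip = '12345';
   /* The padded value fails: the blanks sit between the digits and the $ anchor */
   padded  = prxmatch('/^\d{5}$/', zip) > 0;
   /* After trim() only the 5 digits are passed to the regex, so it matches */
   trimmed = prxmatch('/^\d{5}$/', trim(zip)) > 0;
   put padded= trimmed=;
run;
```

strip() works the same way here and also removes leading blanks.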

PG
Regular Contributor
Posts: 155

Re: Can prxpattern and prxmatch be used with XSD patterns

Thanks. I had already thought about that and was doing a strip of the string.
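
Since there are many patterns to convert, the wrapping could be scripted rather than done by hand. The following is only a sketch under several assumptions: the data set and variable names (xsd_patterns, perl_patterns, xsd_pattern, perl_pattern, compiles) are hypothetical, the sample rows are adapted from the list earlier in the thread (with Ñ left out to sidestep encoding questions), and none of the patterns already contains an escaped slash. Because / is used as the delimiter, any / inside the pattern is escaped first. The (?: ) group keeps a top-level | alternation, if a pattern has one, inside the anchors.

```sas
data xsd_patterns;
   infile datalines truncover;
   input xsd_pattern $char60.;
   datalines;
\d{5}
[A-Z]{2}
\d{4}\-\d{1,9}
[A-Z\d \-\.,:/]{1,100}
;

data perl_patterns;
   set xsd_patterns;
   length perl_pattern $80;
   /* Escape the / delimiter, then anchor the whole pattern */
   perl_pattern = cats('/^(?:', tranwrd(strip(xsd_pattern), '/', '\/'), ')$/');
   /* prxparse returns a missing value when the pattern does not compile */
   regex_id = prxparse(strip(perl_pattern));
   compiles = not missing(regex_id);
   /* Release the compiled pattern; this step only checks the syntax */
   if compiles then call prxfree(regex_id);
   drop regex_id;
run;
```

XSD-only constructs such as \i, \c, or character class subtraction have no direct Perl equivalent and would still need manual translation; none of the patterns listed in this thread appear to use them.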

PROC Star
Posts: 1,760

Re: Can prxpattern and prxmatch be used with XSD patterns

Note that if you need to match other accented letters, the pattern

[A-ZÑ0-9]{1,14}

can be generalized with a POSIX character class:

[[:upper:]0-9]{1,14}

The letters matched depend on the session encoding. For example, with wlatin1 the class matches most Western European accented capitals, such as Ñ (Spanish) or Ø (Danish and Norwegian).

 

Taken from http://www.amazon.com/High-Performance-SAS-Coding-Christian-Graffeuille/dp/1512397490 

Regular Contributor
Posts: 155

Re: Can prxpattern and prxmatch be used with XSD patterns

Thanks. For now this project is using UTF-8 encoding and we only need to support Spanish. But this is a good tip.

☑ This topic is solved.
