EmmettMicah
Calcite | Level 5

I received SAS code that contains a regular expression, and I'm having a hard time understanding it. I need to make some edits to it (the output isn't correct), but first I need to know exactly what it is saying. I've used cheat sheets, but it's hard for me to put the whole expression together. Any help would be appreciated!

PRXPARSE("/((\D{1,2}.?\D{0,2}\S?) (X) ?\S?) {1,3} (MM|CM)/I")

PGStats
Opal | Level 21

Are you sure that the original pattern contains only uppercase characters?

PG
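
One reason the case question may matter: in Perl-style regular expressions, \D (non-digit) and \d (digit) are opposites, as are \S and \s, and a case-insensitive modifier does not make them equivalent. If the code was upper-cased at some point, the meaning of the pattern would have changed. A minimal sketch, assuming a hypothetical test value of "25" and made-up variable names:

```sas
data _null_;
    /* \D means "not a digit": nothing in "25" qualifies, so PRXMATCH returns 0 */
    upper = prxmatch("/\D{1,2}/i", "25");
    /* \d means "a digit": the match starts at position 1 */
    lower = prxmatch("/\d{1,2}/i", "25");
    put upper= lower=;   /* writes upper=0 lower=1 to the log */
run;
```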
r_behata
Barite | Level 11

Here is some info:

 

((\D{1,2}.?\D{0,2}\S?) (X) \S?) {1,3} (MM|CM)

1st Capturing Group ((\D{1,2}.?\D{0,2}\S?) (X) \S?)
  2nd Capturing Group (\D{1,2}.?\D{0,2}\S?)
    \D{1,2} matches any character that's not a digit (equal to [^0-9])
      {1,2} Quantifier: matches between 1 and 2 times, as many times as possible, giving back as needed (greedy)
    .? matches any character (except for line terminators)
      ? Quantifier: matches between zero and one times, as many times as possible, giving back as needed (greedy)
    \D{0,2} matches any character that's not a digit (equal to [^0-9])
      {0,2} Quantifier: matches between 0 and 2 times, as many times as possible, giving back as needed (greedy)
    \S? matches any non-whitespace character (equal to [^\r\n\t\f\v ])
      ? Quantifier: matches between zero and one times, as many times as possible, giving back as needed (greedy)
  " " matches the space character literally
  3rd Capturing Group (X)
    X matches the character X literally (case sensitive)
  " " matches the space character literally
  \S? matches any non-whitespace character (equal to [^\r\n\t\f\v ])
    ? Quantifier: matches between zero and one times, as many times as possible, giving back as needed (greedy)
" {1,3}" matches the space character literally
  {1,3} Quantifier: matches between 1 and 3 times, as many times as possible, giving back as needed (greedy)
" " matches the space character literally
4th Capturing Group (MM|CM)
  1st Alternative MM
    MM matches the characters MM literally (case sensitive)
  2nd Alternative CM
    CM matches the characters CM literally (case sensitive)

Two differences from the pattern in the question: there, the space after (X) is followed by ?, which makes that space optional, and the pattern ends with an I modifier. That modifier is presumably meant as the case-insensitive flag (usually written /i); if so, X, MM and CM would also match in lowercase, and the "case sensitive" remarks above apply only to the pattern without the flag.

 

Possible matches:

 

TIM AP X X   MM
ABC AP X X   CM
SAS AP X X   MM
SASSAP X X   MM
SAS IS X T   CM
SAS IS X     MM
ABC XY X Y   CM
      X B    MM
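
To check strings like these in SAS itself, a minimal sketch along the following lines may help. It uses the pattern as posted in the question (with the modifier written as lowercase /i, the usual spelling of the case-insensitive flag), reads three of the sample strings above, and prints the match position plus capture groups 2 and 4. The variable names rx, pos, lead and unit are made up for illustration.

```sas
data _null_;
    length text $20;
    retain rx;
    /* Compile the pattern once, on the first iteration */
    if _n_ = 1 then
        rx = prxparse("/((\D{1,2}.?\D{0,2}\S?) (X) ?\S?) {1,3} (MM|CM)/i");
    input text $char20.;               /* $CHAR keeps leading blanks */
    pos = prxmatch(rx, text);          /* start of the match, 0 if none */
    if pos > 0 then do;
        lead = prxposn(rx, 2, text);   /* capture group 2: text before " X" */
        unit = prxposn(rx, 4, text);   /* capture group 4: MM or CM */
    end;
    put text= pos= lead= unit=;
    datalines;
TIM AP X X   MM
SAS IS X     MM
      X B    MM
;
run;
```

Swapping in your own input strings and comparing pos and the captured groups against what you expect is probably the quickest way to see which part of the pattern produces the incorrect output.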