SAS Programming

DATA Step, Macro, Functions and more
tietokone42
Calcite | Level 5

Dear SAS Community,

 

I have a string variable in my data that contains a mixture of alphanumeric and special characters (really anything is possible). My task is to identify only certain valid cases. Here is a minimal working example:

 

Example Data:

var1

Abc 123456

Abc 1234567

B aBc 123

A

AbC    123

aBC 1243

123 abc 123

aBc123

Abc 12345 Abc 12345

abc 345

 

I would like to find only the valid cases (highlighted in the original post: Abc 123456, aBC 1243, aBc123 and abc 345). A valid value starts with three letters (upper or lower case) followed by digits (at least one is required, at most six are allowed). Between the letters and the digits, one space is allowed but not required. I have tried, for example, the following code:

 

DATA WORK.DATA_02;
   SET WORK.DATA_01;
      FOUND = 0;

      if PRXMATCH ("/^[Aa][Bb][Cc] ?[1-9][0-9]?[0-9]?[0-9]?[0-9]?[0-9]?$/",var1) > 0 then FOUND = 1;

RUN;

 

Besides that, I have also tried the code without the $ at the end, with different \b boundaries, and many other combinations. Unfortunately, nothing works. The strange thing is that the pattern

^([Aa][Bb][Cc]\ ?[0-9][0-9]?[0-9]?[0-9]?[0-9]?[0-9]?)$

works in many online regex-checkers, but not within SAS.

 

Can anybody help? Any hints, ideas or solutions? I have been desperately looking for an answer for days.

 

Thank you very much in advance!

 

Best regards

Lars

Accepted Solution
ChrisNZ
Tourmaline | Level 20

Note that [:alpha:] matches all letters in the collating sequence used, including for example é.

This might be better. Or not.

 prxmatch('/^[a-z]{3} ?\d{1,6}$/oi',strip(VAR))
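
The strip() call in this expression matters. The following is a minimal sketch of the likely reason the original attempt failed, assuming var1 is a fixed-length character variable: SAS pads such values with trailing blanks, so $ cannot match directly after the last digit. The data set name demo and the variables padded, stripped and absorbed are hypothetical and used only for illustration.

```sas
/* Sketch only: DEMO, PADDED, STRIPPED and ABSORBED are hypothetical names. */
data demo;
  length var1 $40;               /* fixed-length character variable */
  var1 = 'Abc 123456';           /* stored with trailing blanks up to length 40 */

  /* Without STRIP(): the trailing blanks sit between the last digit and $ */
  padded = prxmatch('/^[a-z]{3} ?\d{1,6}$/i', var1) > 0;

  /* With STRIP(): the string ends right after the last digit */
  stripped = prxmatch('/^[a-z]{3} ?\d{1,6}$/i', strip(var1)) > 0;

  /* Alternative: let the pattern absorb trailing whitespace itself */
  absorbed = prxmatch('/^[a-z]{3} ?\d{1,6}\s*$/i', var1) > 0;

  put padded= stripped= absorbed=;   /* write the three flags to the log */
run;
```

If the padding is indeed the cause, padded should come out as 0 while stripped and absorbed come out as 1. This would also explain why the same pattern works in online regex checkers, which receive the string without padding.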

 


Replies
Patrick
Opal | Level 21

I believe the RegEx below will meet your requirement.

You might find this link helpful.

data have;
  infile datalines truncover dlm='|' dsd;
  input select :$1. var1 :$40.;
  datalines;
1|Abc 123456
0|Abc 1234567
0|B aBc 123
0|A
0|AbC    123
1|aBC 1243
0|123 abc 123
1|aBc123
0|Abc 12345 Abc 12345
1|abc 345
;


/*I only would like to find the highlighted cases, starting with three letters (upper or lower case possible) */
/*and followed by numbers (one is required, a max. of six is possible). */
/*Between the letters and the numbers one space is allowed but not required.*/
data want;
  set have;
  selected= prxmatch('/^[[:alpha:]]{3} ?\d{1,6}$/oi',strip(var1))>0;
run;
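
As a variation on the step above, the pattern can be compiled once with PRXPARSE and the returned identifier passed to PRXMATCH. This is only a sketch, assuming the have data set defined above; the names want2, rx and agrees are hypothetical.

```sas
/* Sketch only: WANT2, RX and AGREES are hypothetical names. */
data want2;
  set have;
  retain rx;                               /* keep the pattern id across rows */
  if _n_ = 1 then
    rx = prxparse('/^[[:alpha:]]{3} ?\d{1,6}$/i');   /* compile once */

  /* 1 if the stripped value matches the pattern, otherwise 0 */
  selected = prxmatch(rx, strip(var1)) > 0;

  /* compare against the expected flag stored in the character variable SELECT */
  agrees = (selected = input(select, 1.));

  drop rx;
run;
```

The agrees flag offers a quick way to check that the pattern reproduces the expected 0/1 values in the sample data.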


tietokone42
Calcite | Level 5

Hi Patrick,

 

thank you very much for your fast reply and the additional link. This really helped me a lot and I think this is the solution! 

 

Thanks! 😀

 

Lars
