ThomasFrederiksen
SAS Employee

Good morning everyone

 

Did you know that you can actually use regular expressions in SAS code?

Well, you can, and some of us do. They are handy when you need to do complex string operations.

There are a number of SAS functions and CALL routines you can use from plain SAS code (see the Links section).

Here is an example where we want to find the position of the first digit. Use PRXPARSE to create and compile the regex, then pass the resulting regex id to PRXMATCH, which returns the position of the first match (or 0 if there is no match).

 

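data thomas1;
   retain myREGEX;
   if _N_=1 then do;
     /* Compile the regex once; \d matches any single digit */
     myREGEX = prxparse("/\d/");
     spyf1='You have won 100 kr.';
     /* PRXMATCH returns the position of the first match, or 0 if none */
     spyf2=prxmatch(myREGEX,spyf1);
     put 'The first digit is at position #' spyf2;
   end;
run;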
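The RETAIN and _N_=1 pattern pays off when the step reads more than one observation, because the regex is then compiled only once and reused for every row. Here is a minimal sketch of that, assuming a made-up data set name (first_digit), made-up variable names (re, text, pos) and three made-up input lines:

```sas
data first_digit;
   /* Keep the compiled regex id across observations */
   retain re;
   if _N_ = 1 then re = prxparse('/\d/');

   infile datalines truncover;
   input text $40.;

   /* Position of the first digit, or 0 if the text has no digit */
   pos = prxmatch(re, text);
   put text= pos=;
   datalines;
You have won 100 kr.
No prize this time
Ticket 7 wins
;
run;
```

With these three lines, pos should come out as 14, 0 and 8.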

 

 

Here is an example where we want to replace part of the string if we find what we are looking for. The pattern \d.* matches the first digit and everything after it. Use PRXPARSE to create and compile the substitution regex, then use PRXCHANGE to make the replacement.

 

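data thomas2;
   retain myREGEX;
   if _N_=1 then do;
     /* s/pattern/replacement/ : replace the first digit and the rest of the string */
     myREGEX = prxparse("s/\d.*/zero,zip,nothing at all/");
     spyf1='you have won 100 kr.';
     /* Second argument = number of replacements to make (-1 means all) */
     spyf2=prxchange(myREGEX,1,spyf1);
     put 'Fooled you as ' spyf2;
   end;
run;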
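PRXCHANGE also accepts the regex as a string instead of a PRXPARSE id, and the replacement can refer to capture groups with $1, $2 and so on. The following is only a sketch of those two features; the variable names (masked, swapped) and the sample name 'Jones, Fred' are made up for illustration:

```sas
data _null_;
   length masked swapped $ 40;
   spyf1 = 'you have won 100 kr.';

   /* -1 = replace every match, so each digit becomes an X */
   masked = prxchange('s/\d/X/', -1, spyf1);

   /* $1 and $2 are the two captured words, written back in swapped order */
   swapped = prxchange('s/(\w+), (\w+)/$2 $1/', 1, 'Jones, Fred');

   put masked= swapped=;
run;
```

This should write masked=you have won XXX kr. and swapped=Fred Jones to the log. Single quotes around the regex keep the macro processor away from any & or % characters in the pattern.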

 

 

In Data Management Studio it’s almost impossible to avoid using regular expressions. For example, we use them when we work with the QKB (Quality Knowledge Base), and we do that a lot in DM Studio.

 

Don’t be afraid to start using regular expressions. Most of the challenges you meet and the syntax you need can be googled. Regular expressions are used all over the world, and you will find many answers on the internet.

 

Links

SAS paper

http://www2.sas.com/proceedings/sugi29/043-29.pdf

 

SAS documentation - SAS functions and call routines

http://support.sas.com/documentation/cdl/en/lrdict/64316/HTML/default/viewer.htm#a002295977.htm

 

Build and test your regex on the fly (very useful site)

https://regex101.com/

 

Find and learn everything about regex

http://www.regular-expressions.info/

1 REPLY 1
OleSteen
SAS Employee

Thank you for a great "Juletip" Thomas.

Amazing what can be done with just one function!

 

I just recently found this example, which might be useful for someone else 🙂

It can be used to check whether 10 digits form a valid Danish social security number (CPRNR), that is, whether the first six digits are a valid date.

 

data ValidCpr;
input cpr $10.;
valid=prxmatch('/^(?:(?:31(?:0[13578]|1[02])|(?:30|29)(?:0[13-9]|1[0-2])|(?:0[1-9]|1[0-9]|2[0-8])(?:0[1-9]|1[0-2]))[0-9]{3}|290200[4-9]|2902(?:(?!00)[02468][048]|[13579][26])[0-3])[0-9]{3}$/',cpr);
               /* ^ = beginning of text
                  Max 31 for month 1,3,5,7,8,10,12
                  29 in February, but only if leap year
                  29 in February in year 00, but only if digit 7 is between 4 and 9
                  . . . . 
                  $ = end of text
                  valid = 1 when the whole string matches, 0 otherwise
               */
cards;
3101011234
2902001234
2902004567
3004475364
3104475364
;
run;
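If you also want the individual parts of the number, capture groups together with PRXPOSN can pull them out. This is only a sketch of the technique: the pattern below checks the 10-digit shape and does not repeat the date validation above, and the data set and variable names (cpr_parts, re, day, month, year, serial) are made up:

```sas
data cpr_parts;
   length cpr $ 10 day month year $ 2 serial $ 4;
   /* Compile once: four capture groups for DD, MM, YY and the last 4 digits */
   retain re;
   if _N_ = 1 then re = prxparse('/^(\d{2})(\d{2})(\d{2})(\d{4})$/');

   input cpr $10.;

   /* PRXPOSN returns the text of a capture group from the last match */
   if prxmatch(re, cpr) then do;
      day    = prxposn(re, 1, cpr);
      month  = prxposn(re, 2, cpr);
      year   = prxposn(re, 3, cpr);
      serial = prxposn(re, 4, cpr);
   end;
   put cpr= day= month= year= serial=;
   datalines;
3101011234
3004475364
;
run;
```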

 

 
