SanderEhmsen
Quartz | Level 8

 Hi all

 

In Denmark everybody is given a CPR-number at birth (social security number).

This is a unique identifier for a person.

 

Under GDPR we need to scan our folders regularly to check whether anybody has accidentally put a CPR-number in a folder where it is not documented.

For the moment we would like to use SAS to do this task for us in my department. 

I have thus developed a macro function that finds almost every instance of a CPR-number in our folders. Nevertheless, I would like to improve my macro function using Perl regular expressions.

 

 

In Google, a colleague of mine has developed the following expression:

(\W|^)([0-3])\d{1}([0-1])\d{3}[\s-_]{0,1}\d{4}(\W|$)

 

So a CPR-number contains 10 characters. The first can be one of ('0','1','2','3') and the third can be one of ('0','1').

The rest can be any digit from '0' to '9'.

Sometimes a CPR-number is written with 11 characters, where the 7th is a '-' (hyphen).
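
For readers following along, the rules just described can be sketched piece by piece as a pattern. This is only an illustration under assumptions, not the colleague's exact expression: the hyphen is placed last in the character class so that it is read as a literal, the variable names text and hit are hypothetical, and the second and third sample values are made up.

```sas
/* Sketch only: the CPR rules above written as one Perl regex.     */
/*   (\W|^)    start of the text or a non-word character           */
/*   [0-3]\d   1st character 0-3, 2nd any digit                    */
/*   [01]\d{3} 3rd character 0 or 1, then three more digits        */
/*   [\s_-]?   optional separator (assumption: hyphen placed last) */
/*   \d{4}     the last four digits                                */
/*   (\W|$)    a non-word character or the end of the text         */
data _null_;
   length text $40;
   input text $char40.;
   /* prxmatch returns the start position of the first match, 0 if none */
   hit = prxmatch('/(\W|^)[0-3]\d[01]\d{3}[\s_-]?\d{4}(\W|$)/', text);
   put text= hit=;
   datalines;
0113816723
011381-6723
4113816723
;
run;
```

If the rules are encoded as intended, the first two lines should report a non-zero position and the third should report 0, because its first character is outside 0-3.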

 

I have a hard time understanding Perl Reg Ex in SAS. 

Can anyone adapt the Google expression above so that it works in SAS?

I have consulted the Tip Sheet (https://support.sas.com/rnd/base/datastep/perl_regexp/regexp-tip-sheet.pdf).

 

I believe I should use the prxmatch function.

 

Best,

Sander. 

 

Accepted Solution
gamotte
Rhodochrosite | Level 12

Hello,

 

Perl regexes in SAS are ... Perl regexes. So the expression you have found can be used as is, once it is enclosed in slashes. You can use prxmatch to match a string against the expression as follows.

 

data _NULL_;
input str $50.;

/* prxmatch returns the position of the first match, or 0 when there is none */
if prxmatch("/(\W|^)([0-3])\d{1}([0-1])\d{3}[\s-_]{0,1}\d{4}(\W|$)/",str) then put "OK";
else put "KO";
cards;
kjhkj 0113816723 oiuoiuoiu
0123816723 3543434
0113816723 54556663p oiu
0113816723
oioiuoi 5113816723 oo
;
run;
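
Since the goal is to report which number was found rather than only whether one exists, here is a minimal sketch of one way to pull the matched text out. It is an assumption-laden variation, not part of the tested answer above: the pattern adds an extra pair of parentheses around the number (capture group 2), places the hyphen last in the character class, and the names re and cpr as well as the second data line are hypothetical.

```sas
data _null_;
   length str $50 cpr $11;
   retain re;
   /* Compile the pattern once; group 2 wraps the CPR-like number itself */
   if _n_ = 1 then
      re = prxparse('/(\W|^)([0-3]\d[01]\d{3}[\s_-]?\d{4})(\W|$)/');
   input str $50.;
   if prxmatch(re, str) then do;
      /* prxposn returns the text captured by group 2 in the last match */
      cpr = prxposn(re, 2, str);
      put 'Found: ' cpr;
   end;
   else put 'No CPR-like number in: ' str;
   datalines;
kjhkj 0113816723 oiuoiuoiu
text with 011381-6723 inside
oioiuoi 5113816723 oo
;
run;
```

Capturing the number in its own group keeps the surrounding non-word character out of the result, which a plain match position would include.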


SanderEhmsen
Quartz | Level 8

Thank you for a quick, fun and precise reply.

 

My problem, then, was simply that I had not understood that I needed to put slashes at the start and end of the expression.

 

Thank you very much.
