SanderEhmsen
Quartz | Level 8

Hi all,

In Denmark everybody is given a CPR number at birth (comparable to a social security number).

It is a unique identifier for a person.

Under GDPR we need to scan our folders regularly to check whether anybody has accidentally put a CPR number in a folder that is not documented.

For the moment we would like to use SAS to do this task for us in my department.

I have therefore developed a macro function that finds almost every instance of a CPR number in our folders. Nevertheless, I would like to improve my macro function using Perl regular expressions.

In Google a colleague of mine has developed the following syntax:

(\W|^)([0-3])\d{1}([0-1])\d{3}[\s-_]{0,1}\d{4}(\W|$)

 

So a CPR number contains 10 characters. The first can be one of '0', '1', '2', '3'; the third can be '0' or '1'.

The rest can be any digit between '0' and '9'.

Sometimes a CPR number is written with 11 characters, where the 7th is a '-' (hyphen).

I have a hard time understanding Perl regular expressions in SAS.

Can anyone rework the Google expression above so that it works in SAS?

I have consulted the Tip Sheet (https://support.sas.com/rnd/base/datastep/perl_regexp/regexp-tip-sheet.pdf).

I believe I should use the prxmatch function.

Best,

Sander.

 

Accepted Solution
gamotte
Rhodochrosite | Level 12

Hello,

 

Perl regex in SAS are ... Perl regex. So the expression you have found can be used as is. You can use prxmatch to match a string against the expression as follows.

 

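data _NULL_;
    /* Read each data line into a 50-character string. */
    input str $50.;

    /* prxmatch returns the position of the first match, or 0 if there is none. */
    if prxmatch("/(\W|^)([0-3])\d{1}([0-1])\d{3}[\s-_]{0,1}\d{4}(\W|$)/",str) then put "OK";
    else put "KO";
    cards;
kjhkj 0113816723 oiuoiuoiu
0123816723 3543434
0113816723 54556663p oiu
0113816723
oioiuoi 5113816723 oo
;
run;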
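
If the matched number itself is needed, and not only a yes/no answer, one possible variation is sketched below. It is an illustration under assumptions, not part of the original expression: the pattern is compiled once with prxparse, the whole number is wrapped in a single capture group so that prxposn can return it, and the separator class is written as [\s_-] with the hyphen last so that it is read as a literal hyphen and not as a range. The variable names re and cpr and the sample lines are made up for the example.

```sas
/* Sketch only: compile the pattern once and print the matched CPR-like number. */
data _null_;
    input str $50.;
    retain re;

    /* Compile on the first iteration only. Capture group 2 holds the number. */
    if _n_ = 1 then
        re = prxparse('/(\W|^)([0-3]\d[01]\d{3}[\s_-]?\d{4})(\W|$)/');

    if prxmatch(re, str) then do;
        /* prxposn returns the text of the requested capture group. */
        cpr = prxposn(re, 2, str);
        put 'Match: ' cpr;
    end;
    else put 'No match';
    cards;
kjhkj 0113816723 oiuoiuoiu
011381-6723
oioiuoi 5113816723 oo
;
run;
```

The first two sample lines should be reported as matches (the second one with the hyphen), while the third should not, because its first digit is 5.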

SanderEhmsen
Quartz | Level 8

Thank you for a quick, fun and precise reply.

 

My problem was simply that I had not understood that I needed to put slashes at the start and end of the expression.

 

Thank you very much.
