Perl expression

Super Contributor
Posts: 297
Hello:

Could someone explain what (?<!\d) and \d{4}(?!\d) mean in the following code? Thanks.

http://analytics.ncsu.edu/sesug/2012/CT-03.pdf

data SSN ;
input SSN $20. ;
datalines ;
123-54-2280
#987-65-4321
S.S. 666-77-8888
246801357
soc # 133-77-2000
ssnum 888_22-7779
919-555-4689
call me 1800123456
;
run ;

proc sql feedback ;
select ssn
, prxchange('s/(?<!\d)\d{3}[-_]?\d{2}[-_]?\d{4}(?!\d)/xxxxxxxxx/io', -1, ssn)
as ssn2
from ssn
;
quit ;


Accepted Solutions
Solution
‎06-23-2017 02:11 PM
Community Manager
Posts: 2,768

Re: Perl expression

Here's how you can find out.  

 

Visit https://regex101.com.

 

Paste in the main expression:

 

 

(?<!\d)\d{3}[-_]?\d{2}[-_]?\d{4}(?!\d)

 

See the Explanation field.

 

[Image: regexexp.png]


All Replies
Super User
Posts: 10,540

Re: Perl expression

Do you have any of the SAS documentation available?

Super Contributor
Posts: 297

Re: Perl expression

The paper I linked to states that

 

The negative look-behind (i.e. (?<!\d) ) and negative look-ahead assertions (i.e. (?!\d) ) are non-capturing.

 

However, I still don't get what the code means.

Super Contributor
Posts: 297

Re: Perl expression

That is an awesome tool! Thank you so much, Chris.

PROC Star
Posts: 259

Re: Perl expression

(?<!\d) is a negative lookbehind

(?!\d) is a negative lookahead

Both check whether particular characters are present next to the text you are matching, without making those characters part of the match. That is why they are called zero-width assertions.

?< indicates a lookbehind, which checks whether a particular value comes immediately before (in front of) the value you are looking for

! means "not", which makes the assertion negative

Let us use an easier example. Say you have the values goathair, cowhair, and cowsomething, and you want to find hair

but do not want goat in front of hair. Then you write something like this: (?<!goat)hair

(?<!\d) indicates that you do not want a digit (\d) immediately in front of the rest of your expression
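
To make the goathair example concrete, here is a minimal sketch in SAS. The data set name lookbehind_demo and the variable names value and pos are made up for illustration; the only assumption is that PRXMATCH returns the start position of the first match, or 0 when there is none.

```sas
/* Hypothetical data set and variable names, for illustration only */
data lookbehind_demo ;
  input value :$20. ;
  /* Start position of the first "hair" not immediately preceded by "goat"; 0 = no match */
  pos = prxmatch('/(?<!goat)hair/', value) ;
  datalines ;
goathair
cowhair
cowsomething
;
run ;

proc print data=lookbehind_demo noobs ;
run ;
```

pos should come out as 0 for goathair (the only hair has goat right before it), 4 for cowhair, and 0 for cowsomething (there is no hair at all).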

 

 

?! indicates a negative lookahead, which checks whether a particular value comes immediately after (behind) the value you are looking for

! again means "not", which makes the assertion negative

Let us take an easier example. Say you have the values hairyboy, goodboy, and hairykid, and you want to find the values that contain hairy

but do not want to pick hairyboy. Then you write something like this: hairy(?!boy)
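
The lookahead example can be checked the same way. This is only a sketch; lookahead_demo, value, and pos are hypothetical names chosen for the illustration.

```sas
/* Hypothetical data set and variable names, for illustration only */
data lookahead_demo ;
  input value :$20. ;
  /* Start position of the first "hairy" not immediately followed by "boy"; 0 = no match */
  pos = prxmatch('/hairy(?!boy)/', value) ;
  datalines ;
hairyboy
goodboy
hairykid
;
run ;

proc print data=lookahead_demo noobs ;
run ;
```

pos should be 0 for hairyboy and goodboy, and 1 for hairykid.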

 

 

\d{4}(?!\d): \d{4} means exactly four digits in a row, and (?!\d) means there is no digit immediately after them.
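
To see why the two assertions matter in the original SSN pattern, this sketch masks two of the sample values twice: once with the pattern from the question and once with the same pattern minus the lookbehind and lookahead. The names mask_demo, text, with_guards, and no_guards are hypothetical, and the second pattern is my own stripped-down variant, not something from the paper.

```sas
/* Hypothetical data set and variable names, for illustration only */
data mask_demo ;
  input text $20. ;
  length with_guards no_guards $20 ;
  /* Pattern from the question: nine digits that are not part of a longer run of digits */
  with_guards = prxchange('s/(?<!\d)\d{3}[-_]?\d{2}[-_]?\d{4}(?!\d)/xxxxxxxxx/', -1, text) ;
  /* Same pattern with both assertions removed, for comparison */
  no_guards   = prxchange('s/\d{3}[-_]?\d{2}[-_]?\d{4}/xxxxxxxxx/', -1, text) ;
  datalines ;
123-54-2280
call me 1800123456
;
run ;

proc print data=mask_demo noobs ;
run ;
```

Both versions turn 123-54-2280 into xxxxxxxxx. For call me 1800123456, the version with the assertions leaves the text alone, because the nine digits are followed by a tenth. The version without them masks the first nine digits and gives call me xxxxxxxxx6.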

 

I tried my best to explain, please let me know if something is unclear, confusing or wrong

Super Contributor
Posts: 297

Re: Perl expression

Thank you very much for your detailed explanation, Kiranv_

Regular Contributor
Posts: 234

Re: Perl expression

@kiranv_ can you explain this?

 

been*? matches "bee". However, been+? matches "been".

I know *? matches the previous element zero or more times, so I thought been*? should match "been".

 

Thanks!

PROC Star
Posts: 259

Re: Perl expression

? has a different meaning when it comes right after + or *

+ and * are greedy. They keep matching as long as there is a possibility of matching more

Say the word is "beennnnnnnnnn". If you say been+ (+ is 1 or more), it matches all of "beennnnnnnnnn"

A ? right after + or * makes it non-greedy.

+? and *? are non-greedy. Non-greedy means the search stops once it fulfills the minimum condition that still lets the rest of the pattern match (for +? that is one occurrence, and for *? it is zero)

Say the word is "beennnnnnnnnn" and you instruct the regex engine to find been+?. It picks up "been" (n is there only once; non-greedy here means it stops once it picks up the first n)

Say the word is "beennnnnnnnnn" and you instruct the regex engine to find been*?. It picks up "bee" (n can occur zero times, so non-greedy here means it stops without taking any n)
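
Here is a small sketch that compares the four quantifiers side by side in SAS. It uses a shortened test string, beennn, and the names lazy_demo, pattern, text, and matched are hypothetical. CALL PRXSUBSTR returns the position and length of the first match, and SUBSTR then pulls out the matched text.

```sas
/* Hypothetical data set and variable names, for illustration only */
data lazy_demo ;
  length pattern $12 matched $20 ;
  text = 'beennn' ;
  do pattern = 'been+', 'been+?', 'been*', 'been*?' ;
    /* Wrap the pattern in slashes and compile it */
    rx = prxparse(cats('/', pattern, '/')) ;
    /* pos and len receive the start and length of the first match */
    call prxsubstr(rx, text, pos, len) ;
    if pos > 0 then matched = substr(text, pos, len) ;
    else matched = ' ' ;
    call prxfree(rx) ;
    output ;
  end ;
  keep pattern matched ;
run ;

proc print data=lazy_demo noobs ;
run ;
```

matched should be beennn for been+ and been*, been for been+?, and bee for been*?.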

 

 

 

Regular Contributor
Posts: 234

Re: Perl expression

@kiranv_ this makes sense. Thanks !

☑ This topic is solved.
