
PRXMATCH to extract digits

Contributor
Posts: 63


Hi,

I want to extract specific digits from text. I am using the following code:

" if prxmatch("m/0.21|0.20|0.19/0.18|0.17/oi",fromtext)>0 then ..."

I have also tried other options such as \w, \d+, and \d{2,4}, but they are not working.

Any suggestions?

Thanks,

Respected Advisor
Posts: 3,124

Re: PRXMATCH to extract digits

It would be helpful if you could show some INs and OUTs.

Haikuo

Respected Advisor
Posts: 4,641

Re: PRXMATCH to extract digits

It is difficult to guess what you are trying to match. Post strings that should match, and others that shouldn't.

One thing that looks suspicious in your pattern is the unescaped periods. Could that be the problem?

PG

Contributor
Posts: 63

Re: PRXMATCH to extract digits

For example, 19 should not match, but .19 or 0.19 should match.

I didn't understand what you mean by unescaped periods.

Respected Advisor
Posts: 4,641

Re: PRXMATCH to extract digits

The period (.) is a special character in patterns. It matches any single character except newline. If you want to match a literal period, you must escape it with a backslash: "\."

Try "/0?\.\d{2,4}/o"
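To illustrate the point about escaping, here is a minimal sketch that compares an unescaped and an escaped period. The three sample strings are made up for illustration and do not come from the original data.

```sas
data _null_;
  length text $20;
  /* Hypothetical sample strings, not from the thread */
  do text = "0.19", "0x19", "0119";
    unescaped = prxmatch("/0.19/", text);   /* . stands for any single character */
    escaped   = prxmatch("/0\.19/", text);  /* \. stands for a literal period only */
    put text= unescaped= escaped=;
  end;
run;
```

The unescaped pattern should report a match for all three strings, while the escaped pattern should match only "0.19" and return 0 for the other two.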

PG

Contributor
Posts: 63

Re: PRXMATCH to extract digits

Thanks PG

So should it look like this?

" if prxmatch("m/0.21|0.20|0.19|0.18|0.17/0?\.\d{2,4}/o",text)>0 then "

Respected Advisor
Posts: 4,641

Re: PRXMATCH to extract digits

First I'd need to know if the following should match:

0.55

3.21

-0.19

0.2112345

10.21

-.21

PG
Contributor
Posts: 63

Re: PRXMATCH to extract digits

None of them should match. It should only look for an exact match for 0.21 or 0.19. The text does not have any signs, so I am less worried about -0.19.

Respected Advisor
Posts: 4,641

Re: PRXMATCH to extract digits

OK. In that case, the best pattern I can come up with is:

prxmatch("/(?<!\d)0?\.(19|21)(?!\d)/o", text)

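It reads: a position not preceded by a digit, then an optional zero, then a period followed by 19 or 21, and finally a position not followed by a digit. The two digit checks are lookarounds, so they test the neighbouring characters without making them part of the match.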

The value returned will be the position of the zero or of the period character in the string. Does that correspond with what you're looking for?
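Since the goal is to extract the digits and not only to detect them, one possible approach is to compile the pattern with PRXPARSE and read the matched text back with PRXPOSN. This is only a sketch under assumptions: the sample strings and the variable names re, pos and found are hypothetical.

```sas
data _null_;
  length text $40 found $8;
  /* Compile the pattern once and keep its id */
  re = prxparse("/(?<!\d)0?\.(19|21)(?!\d)/");
  /* Hypothetical sample strings, not from the thread */
  do text = "dose 0.19 mg", "dose .21 mg", "dose 10.21 mg", "dose 0.2112 mg";
    pos = prxmatch(re, text);          /* start of the match, 0 if none */
    if pos > 0 then found = prxposn(re, 0, text);  /* buffer 0 is the whole match */
    else found = "";
    put text= pos= found=;
  end;
run;
```

The first two strings should match, with found holding 0.19 and .21. The last two should return 0, because the lookbehind rejects the digit in front of "0.21" and the lookahead rejects the digit after "0.21".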

PG

PROC Star
Posts: 7,356

Re: PRXMATCH to extract digits

While PG's suggested code will do what you asked for (as long as there are no leading spaces or other unanticipated exceptions), wouldn't using an input function be more understandable and precise?  e.g., compare:

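data have;
  input text $char10.;
  cards;
0.55
3.21
-0.19
0.1
.170
.17
0.19
A0.19
.21
0.2112345
10.21
-.21
;

data want;
  set have;
  /* itest: the whole value, read as a number, equals one of the targets; */
  /* ?? suppresses notes and errors for values that are not numeric       */
  if input(text, ?? 12.) in (.21,.20,.19,.18,.17) then itest=1;
  /* ptest: the pattern matches starting at the first character */
  if prxmatch("/(?<!\d)0?\.(17|18|19|20|21)(?!\d)/o", text) eq 1 then ptest=1;
run;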

Respected Advisor
Posts: 4,641

Re: PRXMATCH to extract digits

PRXMATCH returns the starting position of the match, which can be useful too. So it would be more appropriate to compare the return value with zero: PRXMATCH(...) > 0.
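The difference between the two comparisons can be seen in a small sketch. The sample values and the flag names at_start and anywhere are hypothetical.

```sas
data _null_;
  length text $20;
  /* Hypothetical values: the second has a leading space, the third leading text */
  do text = "0.19", " 0.19", "rate .21";
    pos = prxmatch("/(?<!\d)0?\.(17|18|19|20|21)(?!\d)/", text);
    at_start = (pos eq 1);  /* 1 only when the match begins the string */
    anywhere = (pos > 0);   /* 1 for a match at any position */
    put text= pos= at_start= anywhere=;
  end;
run;
```

Only the first value should set at_start, while all three should set anywhere. The same trade-off applies to the sample data above: values such as A0.19 or -0.19 contain a match starting at position 2, so a > 0 test flags them and an eq 1 test does not.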

PG

Contributor
Posts: 63

Re: PRXMATCH to extract digits

you guys are amazing....!!

I tried the code given by PG and it's working perfectly.

Thanks !!!

Kiran
