buckeyefisher
Obsidian | Level 7

Hi,

I want to extract specific digits from text. I am using the following code.

" if prxmatch("m/0.21|0.20|0.19/0.18|0.17/oi",fromtext)>0 then ..."

I have also tried other options such as \w, \d+, and \d{2,4}, but they are not working.

Any suggestions?

Thanks,

11 REPLIES
Haikuo
Onyx | Level 15

It would be helpful if you could show some INs and OUTs.

Haikuo

PGStats
Opal | Level 21

It is difficult to guess what you are trying to match. Post strings that should match, and others that shouldn't.

One thing that looks suspicious in your pattern is the unescaped periods. Could that be the problem?

PG

buckeyefisher
Obsidian | Level 7

For example, 19 should not match, but .19 or 0.19 should match.

I don't understand what you mean by unescaped periods.

PGStats
Opal | Level 21

The period (.) is a special character in patterns. It matches any single character except newline. If you want to match a literal period you must escape it as "\."

Try "/0?\.\d{2,4}/o"
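
To make the difference concrete, here is a minimal sketch (not from the thread; the test values and variable names are made up for illustration) comparing an unescaped and an escaped period in PRXMATCH:

```sas
/* Illustrative only: the strings and variable names are hypothetical. */
data _null_;
  length text $10;
  do text = "0.19", "0919", ".19";
    unescaped = prxmatch("/0.19/", text);   /* . matches any single character */
    escaped   = prxmatch("/0\.19/", text);  /* \. matches only a literal period */
    put text= unescaped= escaped=;
  end;
run;
```

With the unescaped pattern, "0919" should also be reported as a match, because the period accepts the 9. The escaped pattern should match only "0.19".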

PG

buckeyefisher
Obsidian | Level 7

Thanks PG

So should it look like this?

" if prxmatch("m/0.21|0.20|0.19|0.18|0.17/0?\.\d{2,4}/o",text)>0 then "

PGStats
Opal | Level 21

First I'd need to know if the following should match:

0.55

3.21

-0.19

0.2112345

10.21

-.21

PG
buckeyefisher
Obsidian | Level 7

None of them should match. It should only look for an exact match for 0.21 or 0.19. The text does not contain any signs, so I am less worried about -0.19.

PGStats
Opal | Level 21

OK. In that case, the best pattern I can come up with is:

prxmatch("/(?<!\d)0?\.(19|21)(?!\d)/o", text)

It reads: not preceded by a digit (or at the start of the string), then an optional zero, then a period and 19 or 21, and not followed by a digit (or at the end of the string).

The value returned will be the position of the zero or of the period character in the string. Does that correspond with what you're looking for?
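
A small sketch of how that pattern behaves on a few of the strings discussed above (the DATA step and the extra "rate 0.21" value are assumptions added for illustration, not part of the original post):

```sas
/* Illustrative only: prints the match position for each sample string. */
data _null_;
  length text $12;
  do text = "0.19", ".21", "rate 0.21", "19", "10.21", "0.2112345", "0.55";
    /* pos is 0 when there is no match, otherwise the start of the match */
    pos = prxmatch("/(?<!\d)0?\.(19|21)(?!\d)/o", text);
    put text= pos=;
  end;
run;
```

Here "0.19" and ".21" should return 1, and "rate 0.21" should return 6 (the position of the zero). The other strings should return 0: "10.21" fails the lookbehind because a digit precedes the match, and "0.2112345" fails the lookahead because a digit follows it.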

PG

art297
Opal | Level 21

While PG's suggested code will do what you asked for (as long as there are no leading spaces or other unanticipated exceptions), wouldn't using an input function be more understandable and precise?  e.g., compare:

data have;
  input text $char10.;
  cards;
0.55
3.21
-0.19
0.1
.170
.17
0.19
A0.19
.21
0.2112345
10.21
-.21
;

data want;
  set have;
  if input(text, ?? 12.) in (.21,.20,.19,.18,.17) then itest=1;
  if prxmatch("/(?<!\d)0?\.(17|18|19|20|21)(?!\d)/o", text) eq 1 then ptest=1;
run;

PGStats
Opal | Level 21

PRXMATCH returns the starting position of the match, which can be useful too. So it would be more appropriate to compare the return value with zero: PRXMATCH(...) > 0.
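
As a rough illustration of why the comparison matters (the input value and flag names below are hypothetical), consider a string with leading blanks, which the $char10. informat would preserve:

```sas
/* Illustrative only: compares the "eq 1" test with the "> 0" test. */
data _null_;
  text = "  0.19";   /* two leading blanks */
  pos = prxmatch("/(?<!\d)0?\.(19|21)(?!\d)/o", text);
  eq1_flag = (pos eq 1);   /* true only if the match starts in column 1 */
  gt0_flag = (pos > 0);    /* true if the pattern matches anywhere */
  put pos= eq1_flag= gt0_flag=;
run;
```

The match starts at position 3, so the "eq 1" test should miss it while the "> 0" test catches it. Note that "> 0" also accepts a match inside a longer string such as "A0.19", so the right comparison depends on whether such values should count.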

PG

buckeyefisher
Obsidian | Level 7

you guys are amazing....!!

I tried the code given by PG and it's working perfectly.

Thanks !!!

Kiran
