Solved: The routine PRXCHANGE was called using a regular expression that contains no replacement text

Luke3 · Posted 05-17-2022 06:48 AM

Hi,

The PRXCHANGE function returns the error "The routine PRXCHANGE was called using a regular expression that contains no replacement text" and I can't find the reason. What I want to do is extract the substring matched by the regular expression.

As a test, I get the same error on this example dataset:

data table1;
input name & $32.;
datalines;
44fds
3.3.fdfsd
22.22fdfs
12dsd12
;

data table2;
set table1;
name2 = prxchange('~^[0-9\./]+~', -1, name);
run; /*This gives the error*/

The regular expression itself is valid. In fact, if you run this

data table2;
set table1;
test = PRXMATCH('~^[0-9\./]+~',name);
run;

it runs correctly and the variable test is set to 1, so the pattern was matched at position 1.

Thanks in advance,

Luke

Luke3 · Posted 05-18-2022 06:23 AM

@andreas_lds wrote:

@Luke3 wrote:

It doesn't work if the string doesn't start with a number. It wasn't clear from the example, but the string is not guaranteed to start with a number, and it can also contain special characters, so I'm not sure the anyalpha-1 solution would work. In the meantime I found this approach:

Then, please, post data that contains all possible combinations of digits and letters that could exist and the expected result.

More heterogeneous data:

-----
----string
helloworld
323.43astring
23hello(world*23.34.12)
1223/34anotherstring12.34
1234
12-43
13.34/34

The regular expression to extract is ^[0-9\./]+ (numbers, dots and slashes at the beginning). Expected result:

empty or skip
empty or skip
empty or skip
323.43
23
1223/34
1234
12
13.34/34

The solution I came up with is:

data table2(DROP = pattern start length);
set table1;
pattern = PRXPARSE('~^[0-9\./]+~');
call prxsubstr(pattern, name, start,length);
IF length>0 THEN name2 = SUBSTR(name, start , length);
run;

The solution proposed by @Ksharp should work if we add a check on the first character of the string:

data table2;
set table1;
/* ANYDIGIT returns the position of the first digit; 1 means the string starts with a digit */
IF ANYDIGIT(name) = 1 THEN name2 = prxchange('s/^([\d\.]+).*/\1/',1,name);
run;

I don't know whether it's possible to do it with a single call to prxchange.
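On the single-call question, one possible sketch is below. It assumes the same `table1` and `name` variable used throughout the thread; the `length name2 $32` statement is my own addition. The idea is to let the capture group match zero or more of the allowed characters and have `.*$` consume the rest of the string, so rows with no leading match are replaced by an empty group and come out blank.

```sas
data table2;
   set table1;
   length name2 $32;
   /* Group 1 may be empty, so rows without a leading match become blank. */
   /* The ~ delimiter avoids having to escape the slash in the class.     */
   name2 = prxchange('s~^([0-9./]*).*$~$1~', 1, name);
run;
```

With the heterogeneous data above, this should give 323.43 for 323.43astring, 12 for 12-43, and a blank for -----.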


andreas_lds · Posted 05-17-2022 07:11 AM

What are you trying to achieve?

Luke3 · Posted 05-17-2022 07:17 AM

I'm trying to extract the substring identified by the regular expression: numbers and dots starting from the beginning.

Variable name2 should contain:

44
3.3.
22.22
12

maguiremq · Posted 05-17-2022 07:50 AM

Do you have to use a regular expression? Are there other patterns that you need? `COMPRESS` can take care of this easily.

data table1;
input name & $32.;
datalines;
44fds
3.3.fdfsd
22.22fdfs
12dsd12
;

data table2;
	set table1;
		name2 = compress(name,,'kdp');
run;

Again, this only works for your example data. Other patterns may cause issues.

Edit: also noticed that you're not substituting anything in your `PRXCHANGE` call. You have to prefix your regex with 's/' and provide a replacement, if I am remembering correctly.

Edit 2:

data table2;
	set table1;
		name2 = compress(name,,'kdp');
		name3 = prxchange('s/[A-Za-z+]//', -1, name);
run;

I'm not great with regexes and only use them when I have to.

Luke3 · Posted 05-17-2022 08:05 AM

I need numbers and dots starting from the beginning of the string, not all numbers and dots. So in line 4 it should be only 12, not 1212. Thanks.

andreas_lds · Posted 05-17-2022 09:24 AM

For the data posted, this works, too:

data table1;
input name & $32.;
datalines;
44fds
3.3.fdfsd
22.22fdfs
12dsd12
;


data want;
   set table1;
   length numb $ 10;
   numb = substr(name, 1, anyalpha(name) -1);
run;
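One caveat with this approach: `ANYALPHA` returns 0 when the string contains no letters (for example `1234`), so the `SUBSTR` length becomes -1. A non-regex alternative, sketched here under the assumption that only digits, dots, and slashes count as part of the prefix, is `VERIFY`, which returns the position of the first character that is not in a given list. The variable `pos` is a hypothetical helper name.

```sas
data want;
   set table1;
   length numb $ 32;
   /* Position of the first character that is not a digit, dot, or slash */
   pos = verify(name, '0123456789./');
   /* 0 means every character, with no trailing blank, is allowed */
   if pos = 0 then numb = name;
   else if pos > 1 then numb = substr(name, 1, pos - 1);
   drop pos;
run;
```

Because `name` is padded with trailing blanks, `VERIFY` normally stops at the first blank at the latest. Rows whose first character is not allowed leave `numb` blank.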

Ksharp · Posted 05-17-2022 08:36 AM

data table1;
input name & $32.;
datalines;
44fds
3.3.fdfsd
22.22fdfs
12dsd12
;

data table2;
	set table1;
test = prxchange('s/^([\d\.]+).*/\1/',1,name);
run;

Luke3 · Posted 05-17-2022 09:23 AM

It doesn't work if the string doesn't start with a number. It wasn't clear from the example, but the string is not guaranteed to start with a number, and it can also contain special characters, so I'm not sure the anyalpha-1 solution would work. In the meantime I found this approach:

data table2(DROP = pattern start length);
set table1;
pattern = PRXPARSE('~^[0-9\./]+~');
call prxsubstr(pattern, name, start,length);
IF length>0 THEN name2 = SUBSTR(name, start , length);
run;
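A variation on the same idea, offered only as a sketch: put the wanted part in a capture group and read it back with `PRXPOSN`, which avoids handling start and length manually. The variable name `re` and the `length` statement are my own choices, not from the thread.

```sas
data table2(drop = re);
   set table1;
   length name2 $32;
   retain re;
   /* Compile the pattern once, on the first row */
   if _n_ = 1 then re = prxparse('~^([0-9./]+)~');
   /* PRXMATCH returns 0 when there is no match, leaving name2 blank */
   if prxmatch(re, name) then name2 = prxposn(re, 1, name);
run;
```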

andreas_lds · Posted 05-18-2022 01:33 AM

@Luke3 wrote:

It doesn't work if the string doesn't start with a number. It wasn't clear from the example, but the string is not guaranteed to start with a number, and it can also contain special characters, so I'm not sure the anyalpha-1 solution would work. In the meantime I found this approach:

Then, please, post data that contains all possible combinations of digits and letters that could exist and the expected result.

LaneLi · Posted 05-22-2022 10:35 PM

There is another proposal.

data table1;
input name & $32.;
datalines;
44fds
3.3.fdfsd
22.22fdfs
12dsd12
;

data table2;
set table1;
name2 = prxchange('s/((^[0-9]+\.[0-9]+)|(^[0-9]+)).*/$1/', -1, name);
put name2=;
run;

s_lassen · Posted 05-17-2022 12:05 PM

The problem seems to be that you call the PRXCHANGE routine, and although the PRX expression is syntactically correct, it is not a change expression.

The syntax of a change expression is

s<delimiter><expression to look for><delimiter><expression to replace with><delimiter><options>

In your example the delimiter is "~", and the expression to look for is "^[0-9\./]+", but there is no leading "s" and nothing comes after the expression to look for.

If, for instance, you wanted to replace the found expression with an "X", your PRX expression should look like this:

's~^[0-9\./]+~X~'

What is the string you are searching for, and what do you want to replace it with?
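As a small check of the change-expression form, here is a sketch that can be run on its own. It uses the leading `s` that the SAS documentation examples show for substitutions. The variable name `result` and the literal input are only illustrative.

```sas
data _null_;
   length result $32;
   /* s = substitute, ~ = delimiter, X = replacement text */
   result = prxchange('s~^[0-9./]+~X~', 1, '323.43astring');
   put result=;
run;
```

The log should show result=Xastring: the leading digits and dot are replaced by X and the rest of the string is left untouched.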

Luke3 · Posted 05-18-2022 06:27 AM

@s_lassen wrote:

If, for instance, you wanted to replace the found expression with an "X", your PRX expression should look like this:
's~^[0-9\./]+~X~'
What is the string you are searching for, and what do you want to replace it with?

I want to extract the part matched by ^[0-9\./]+ from the string, which amounts to replacing the whole string with the matched part.
