Luke3
Obsidian | Level 7

Hi,

 

The PRXCHANGE function returns the error "The routine PRXCHANGE was called using a regular expression that contains no replacement text" and I can't find the reason. What I want to do is extract the substring identified by the regular expression.

 

As a test, I get the same error on this example dataset:

 

data table1;
input name & $32.;
datalines;
44fds
3.3.fdfsd
22.22fdfs
12dsd12
;

 

data table2;
set table1;
name2 = prxchange('~^[0-9\./]+~', -1, name);
run; /*This gives the error*/

 

The regular expression is correct, in fact if you run this

 

data table2;
set table1;
test = PRXMATCH('~^[0-9\./]+~',name);
run;

 

it runs correctly and the variable test is set to 1, so the regex matched.

 

Thanks in advance,

Luke

Accepted Solution
Luke3
Obsidian | Level 7

@andreas_lds wrote:

@Luke3 wrote:

It doesn't work if the string doesn't start with a number. It wasn't clear from the example, but the string is not guaranteed to start with a number, and it can also contain special characters, so I'm not sure the anyalpha-1 solution would work. In the meantime I found this way:


Then, please, post data that contains all possible combinations of digits and letters that could exist and the expected result.


More heterogeneous data:

-----
----string
helloworld
323.43astring
23hello(world*23.34.12)
1223/34anotherstring12.34
1234
12-43
13.34/34

The regular expression to extract is ^[0-9\./]+ (numbers, dots and slashes at the beginning). Expected result:

empty or skip
empty or skip
empty or skip
323.43
23
1223/34
1234
12
13.34/34

The solution I came up with is:

data table2(DROP = pattern start length);
set table1;
pattern = PRXPARSE('~^[0-9\./]+~');
call prxsubstr(pattern, name, start,length);
IF length>0 THEN name2 = SUBSTR(name, start , length);
run;

The solution proposed by @Ksharp should work if we add a check on the first character of the string:

data table2;
set table1;
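/* ANYDIGIT returns 1 when the first character is a digit. */
/* Note: the class [\d\.] has no slash; add \/ to it to match ^[0-9\./]+ */
IF ANYDIGIT(name) = 1 THEN name2 = prxchange('s/^([\d\.]+).*/\1/',1,name);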
run;

 

Don't know if it's possible to do it with a single call to prxchange 
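A single PRXCHANGE call does seem possible. The sketch below is one way to do it, not a tested answer from the thread: it assumes table1 holds the heterogeneous data above, and it swaps the `+` quantifier for `*` so that the capture group may be empty. The pattern then always matches the whole value, and the replacement keeps only the captured prefix, which is blank when the string does not start with a digit, dot or slash.

```sas
data table2;
   set table1;
   length name2 $32;
   /* Group 1: leading digits, dots and slashes (possibly none).   */
   /* .* consumes the rest, so the whole value is replaced by $1.  */
   name2 = prxchange('s/^([0-9.\/]*).*/$1/', 1, name);
run;
```

Because the prefix is optional, no separate check on the first character is needed.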


12 Replies
Luke3
Obsidian | Level 7

I'm trying to extract the substring identified by the regular expression: numbers and dots starting from the beginning.

 

Variable name2 should contain:

44
3.3.
22.22
12

maguiremq
SAS Super FREQ

Do you have to use a regular expression? Are there other patterns that you need? `COMPRESS` can take care of this easily.

 

data table1;
input name & $32.;
datalines;
44fds
3.3.fdfsd
22.22fdfs
12dsd12
;

data table2;
	set table1;
		name2 = compress(name,,'kdp');
run;


 

Again, this only works for your example data. Other patterns may cause issues.

 

Edit: also noticed that you're not substituting anything in your `PRXCHANGE` call. You have to prefix your regex with `s` (as in `s/pattern/replacement/`) and provide a replacement, if I am remembering correctly.

 

Edit 2:

 

data table2;
	set table1;
		name2 = compress(name,,'kdp');
		name3 = prxchange('s/[A-Za-z]+//', -1, name); /* strip every run of letters */
run;

I'm not great with regexes and only use them when I have to.


 

Luke3
Obsidian | Level 7

I need numbers and dots starting from the beginning of the string, not all numbers and dots. So in line 4 it should be only 12, not 1212. Thanks.

andreas_lds
Jade | Level 19

For the data posted, this works, too:

data table1;
input name & $32.;
datalines;
44fds
3.3.fdfsd
22.22fdfs
12dsd12
;


data want;
   set table1;
   length numb $ 10;
   numb = substr(name, 1, anyalpha(name) -1);
run;

Ksharp
Super User

data table1;
input name & $32.;
datalines;
44fds
3.3.fdfsd
22.22fdfs
12dsd12
;

data table2;
	set table1;
test = prxchange('s/^([\d\.]+).*/\1/',1,name);
run;

Luke3
Obsidian | Level 7

It doesn't work if the string doesn't start with a number. It wasn't clear from the example, but the string is not guaranteed to start with a number, and it can also contain special characters, so I'm not sure the anyalpha-1 solution would work. In the meantime I found this way:

data table2(DROP = pattern start length);
set table1;
pattern = PRXPARSE('~^[0-9\./]+~');
call prxsubstr(pattern, name, start,length);
IF length>0 THEN name2 = SUBSTR(name, start , length);
run;
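
For comparison, the same extraction could probably also be written with a capture group and PRXPOSN instead of CALL PRXSUBSTR plus SUBSTR. This is only a sketch under the same assumptions (table1 as input, the pattern from the post); the variable name rx and the explicit length of name2 are illustrative choices, not part of the original code.

```sas
data table2(drop=rx);
   set table1;
   length name2 $32;
   /* The parentheses define capture buffer 1: the leading run of */
   /* digits, dots and slashes.                                   */
   rx = prxparse('/^([0-9.\/]+)/');
   /* PRXMATCH returns 0 when there is no match, so name2 stays   */
   /* blank for values that do not start with the pattern.        */
   if prxmatch(rx, name) then name2 = prxposn(rx, 1, name);
run;
```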

andreas_lds
Jade | Level 19

@Luke3 wrote:

It doesn't work if the string doesn't start with a number. It wasn't clear from the example, but the string is not guaranteed to start with a number, and it can also contain special characters, so I'm not sure the anyalpha-1 solution would work. In the meantime I found this way:


Then, please, post data that contains all possible combinations of digits and letters that could exist and the expected result.

LaneLi
SAS Employee

Here is another proposal.

 

data table1;
input name & $32.;
datalines;
44fds
3.3.fdfsd
22.22fdfs
12dsd12
;

data table2;
set table1;
name2 = prxchange('s/((^[0-9]+\.[0-9]+)|(^[0-9]+)).*/$1/', -1, name);
put name2=;
run;

s_lassen
Meteorite | Level 14

The problem seems to be that you call the PRXCHANGE routine, and although the PRX expression is syntactically correct, it is not a change expression. 

 

The syntax of a change expression is

s<delimiter><expression to look for><delimiter><expression to replace with><delimiter><options>

In your example the delimiter is "~", and the expression to look for is "^[0-9\./]+", but there is no leading "s" and nothing comes after the expression to look for.

If, for instance, you wanted to replace the found expression with an "X", your PRX expression should look like this:

's~^[0-9\./]+~X~'
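
As a rough illustration of such a change expression in a DATA step, the sketch below replaces the leading digits, dots and slashes with "X". It assumes table1 from the question; the data set name demo and the variable name masked are made up for this example. Note the leading s that marks the expression as a substitution.

```sas
data demo;
   set table1;
   /* s~search~replacement~ : "~" is the delimiter, "X" the replacement. */
   /* The third argument 1 means: replace at most one match per value.   */
   masked = prxchange('s~^[0-9\./]+~X~', 1, name);
run;
```

The search part is the same pattern that already works with PRXMATCH and PRXPARSE; only the s prefix and the replacement text are new.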

 

What is the string you are searching for, and what do you want to replace it with?

 

Luke3
Obsidian | Level 7

@s_lassen wrote:

 

If, for instance, you wanted to replace the found expression with an "X", your PRX expression should look like this:

's~^[0-9\./]+~X~'

 

What is the string you are searching for, and what do you want to replace it with?

 


I want to extract ^[0-9\./]+ from the string, so it's like saying I want to replace the whole string with that match.
