☑ This topic is solved.
Luke3
Obsidian | Level 7

Hi,

 

The PRXCHANGE function returns the error "The routine PRXCHANGE  was called using a regular expression that contains no replacement text" and I can't find the reason. What I want to do is extract the substring matched by the regular expression.

 

As a test, I get the same error on this example dataset:

 

data table1;
input name & $32.;
datalines;
44fds
3.3.fdfsd
22.22fdfs
12dsd12
;

 

data table2;
set table1;
name2 = prxchange('~^[0-9\./]+~', -1, name);
run; /*This gives the error*/

 

The regular expression itself is valid. In fact, if you run this

 

data table2;
set table1;
test = PRXMATCH('~^[0-9\./]+~',name);
run;

 

it runs without errors and the variable test is set to 1, so the pattern was found.

 

Thanks in advance,

Luke

1 ACCEPTED SOLUTION

Luke3
Obsidian | Level 7

@andreas_lds wrote:

@Luke3 wrote:

It doesn't work if the string doesn't start with a number. From the example it wasn't clear, but it's not guaranteed that the string starts with a number, and it can also contain special characters, so I'm not sure the anyalpha-1 solution would work. In the meantime I found this way:


Then, please, post data that contains all possible combinations of digits and letters that could exist and the expected result.


More heterogeneous data:

-----
----string
helloworld
323.43astring
23hello(world*23.34.12)
1223/34anotherstring12.34
1234
12-43
13.34/34

The regular expression to extract is ^[0-9\./]+ (numbers, dots and slashes at the beginning). Expected result:

empty or skip
empty or skip
empty or skip
323.43
23
1223/34
1234
12
13.34/34

The solution I came up with is:

data table2(DROP = pattern start length);
set table1;
pattern = PRXPARSE('~^[0-9\./]+~');
call prxsubstr(pattern, name, start,length);
IF length>0 THEN name2 = SUBSTR(name, start , length);
run;

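The solution proposed by @Ksharp should work if we add a check on the first character of the string and include the slash in the character class:

data table2;
set table1;
/* ANYDIGIT returns the position of the first digit; 1 means the value starts with a digit */
IF ANYDIGIT(name) = 1 THEN name2 = prxchange('s/^([\d.\/]+).*/\1/',1,name);
run;

I don't know whether it's possible to do it with a single call to PRXCHANGE.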


12 REPLIES
Luke3
Obsidian | Level 7

I'm trying to extract the substring identified by the regular expression: numbers and dots starting from the beginning.

 

Variable name2 should contain:

44
3.3.
22.22
12

maguiremq
SAS Super FREQ

Do you have to use a regular expression? Are there other patterns that you need? `COMPRESS` can take care of this easily.

 

data table1;
input name & $32.;
datalines;
44fds
3.3.fdfsd
22.22fdfs
12dsd12
;

data table2;
	set table1;
		name2 = compress(name,,'kdp');
run;


 

Again, this only works for your example data. Other patterns may cause issues.

 

Edit: I also noticed that you're not substituting anything in your `PRXCHANGE` call. You have to prefix your regex with `s` (as in 's/pattern/replacement/') and provide a replacement, if I am remembering correctly.

 

Edit 2:

 

data table2;
	set table1;
		name2 = compress(name,,'kdp');
		name3 = prxchange('s/[A-Za-z]+//', -1, name);
run;

I'm not great with regexes and only use them when I have to.


 

Luke3
Obsidian | Level 7

I need numbers and dots starting from the beginning of the string, not all numbers and dots. So in line 4 it should be only 12, not 1212. Thanks.

andreas_lds
Jade | Level 19

For the data posted, this works, too:

data table1;
input name & $32.;
datalines;
44fds
3.3.fdfsd
22.22fdfs
12dsd12
;


data want;
   set table1;
   length numb $ 10;
   numb = substr(name, 1, anyalpha(name) -1);
run;

Ksharp
Super User

data table1;
input name & $32.;
datalines;
44fds
3.3.fdfsd
22.22fdfs
12dsd12
;

data table2;
	set table1;
test = prxchange('s/^([\d\.]+).*/\1/',1,name);
run;

Luke3
Obsidian | Level 7

It doesn't work if the string doesn't start with a number. From the example it wasn't clear, but it's not guaranteed that the string starts with a number, and it can also contain special characters, so I'm not sure the anyalpha-1 solution would work. In the meantime I found this way:

data table2(DROP = pattern start length);
set table1;
pattern = PRXPARSE('~^[0-9\./]+~');
call prxsubstr(pattern, name, start,length);
IF length>0 THEN name2 = SUBSTR(name, start , length);
run;
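
If a regular expression is not strictly required, the same leading run can probably be found with VERIFY, which returns the position of the first character that is not in a given list (or 0 if every character is in the list). This is only a sketch, assuming the allowed leading characters are exactly digits, dots and slashes; pos is a hypothetical helper variable that does not appear in the thread.

```sas
data table2(drop=pos);
   set table1;
   length name2 $32;
   /* position of the first character that is NOT a digit, dot or slash */
   pos = verify(name, '0123456789./');
   /* pos = 0: every character of name qualifies, keep the whole value */
   if pos = 0 then name2 = name;
   /* pos > 1: keep the characters before the first non-matching one */
   else if pos > 1 then name2 = substr(name, 1, pos - 1);
   /* pos = 1: no leading match, name2 stays blank */
run;
```

Unlike the anyalpha-1 approach, this does not depend on the first unwanted character being a letter, so a value such as 12-43 should give 12.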

andreas_lds
Jade | Level 19

@Luke3 wrote:

It doesn't work if the string doesn't start with a number. From the example it wasn't clear, but it's not guaranteed that the string starts with a number, and it can also contain special characters, so I'm not sure the anyalpha-1 solution would work. In the meantime I found this way:


Then, please, post data that contains all possible combinations of digits and letters that could exist and the expected result.

Luke3
Obsidian | Level 7

@andreas_lds wrote:

@Luke3 wrote:

It doesn't work if the string doesn't start with a number. From the example it wasn't clear, but it's not guaranteed that the string starts with a number, and it can also contain special characters, so I'm not sure the anyalpha-1 solution would work. In the meantime I found this way:


Then, please, post data that contains all possible combinations of digits and letters that could exist and the expected result.


More heterogeneous data:

-----
----string
helloworld
323.43astring
23hello(world*23.34.12)
1223/34anotherstring12.34
1234
12-43
13.34/34

The regular expression to extract is ^[0-9\./]+ (numbers, dots and slashes at the beginning). Expected result:

empty or skip
empty or skip
empty or skip
323.43
23
1223/34
1234
12
13.34/34

The solution I came up with is:

data table2(DROP = pattern start length);
set table1;
pattern = PRXPARSE('~^[0-9\./]+~');
call prxsubstr(pattern, name, start,length);
IF length>0 THEN name2 = SUBSTR(name, start , length);
run;

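The solution proposed by @Ksharp should work if we add a check on the first character of the string and include the slash in the character class:

data table2;
set table1;
/* ANYDIGIT returns the position of the first digit; 1 means the value starts with a digit */
IF ANYDIGIT(name) = 1 THEN name2 = prxchange('s/^([\d.\/]+).*/\1/',1,name);
run;

I don't know whether it's possible to do it with a single call to PRXCHANGE.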

LaneLi
SAS Employee

Here is another proposal.

 

data table1;
input name & $32.;
datalines;
44fds
3.3.fdfsd
22.22fdfs
12dsd12
;

data table2;
set table1;
name2 = prxchange('s/((^[0-9]+\.[0-9]+)|(^[0-9]+)).*/$1/', -1, name);
put name2=;
run;

s_lassen
Meteorite | Level 14

The problem seems to be that you call the PRXCHANGE routine, and although the PRX expression is syntactically correct, it is not a change expression. 

 

The syntax of a change expression is

s<delimiter><expression to look for><delimiter><expression to replace with><delimiter><options>

In your example the delimiter is "~" and the expression to look for is "^[0-9\./]+", but there is no leading "s" and nothing comes after the expression to look for.

If, for instance, you wanted to replace the found expression with an "X", your PRX expression should look like this:

's~^[0-9\./]+~X~'
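
A minimal sketch of such a change expression in a complete step, assuming the table1 data set from the question and a hypothetical output variable name_x. Note the leading s, which marks the expression as a substitution. That SAS accepts ~ as the delimiter in a substitution is an assumption carried over from the PRXMATCH call in the question; if it does not, / delimiters with the slash escaped should behave the same way.

```sas
data demo;
   set table1;
   /* s ~ pattern ~ replacement ~                                  */
   /* leading digits, dots and slashes are replaced by a single X  */
   name_x = prxchange('s~^[0-9./]+~X~', 1, name);
run;
```

With the sample data, 44fds should become Xfds, while a value with no leading match is returned unchanged.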

 

What is the string you are searching for, and what do you want to replace it with?

 

Luke3
Obsidian | Level 7

@s_lassen wrote:

 

If, for instance, you wanted to replace the found expression with an "X", your PRX expression should look like this:

's~^[0-9\./]+~X~'

 

What is the string you are searching for, and what do you want to replace it with?

 


I want to extract the part matching ^[0-9\./]+ from the string. In other words, I want to replace the whole string with that match.
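
The idea of replacing the whole string with its matching prefix can be sketched as a single PRXCHANGE call. This is a sketch, not a tested solution from the thread: it assumes table1 holds the heterogeneous data listed in the accepted solution, uses * instead of + so that the pattern always matches (values without a leading match then become blank), and uses / delimiters with the slash escaped inside the character class.

```sas
data table2;
   set table1;
   length name2 $32;
   /* Capture the leading run of digits, dots and slashes (possibly empty), */
   /* then let .* consume the rest of the value, so that the whole string   */
   /* is replaced by the captured prefix $1.                                */
   name2 = prxchange('s/^([0-9.\/]*).*/$1/', 1, name);
run;
```

With this pattern, 12-43 should give 12, 13.34/34 should be kept whole, and ----string should give a blank name2, so no separate check on the first character is needed.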
