I'm trying to use RegEx to extract/transform a string column. I want to eliminate anything enclosed by pipe characters. I tested the regular expression below in an online editor, regexr.com.
When I code this in SAS, I get no result in string2.
data input_data;
Length string $100. desiredresult $100.;
Infile datalines delimiter='#';
input string $ desiredResult $;
datalines;
match all of this|not this|match|no|yes|n|y#match all of this match yes y
;
run;
data input_data;
SET WORK.input_data;
string2 = PRXCHANGE('[^| ]*(?=(?:[^|]*\|[^|]*\|)*[^|]*$)',-1,string);
;
run;
I'm not sure what I'm doing wrong here. It seems simple enough. Any ideas?
Let's start with the subject line: PRXCHANGE to extract text outside of <> chars Regex not working [closed]
None of your code or example text shows any < or >, so I am not sure how your subject line relates to the code or example pictures you posted.
It isn't even clear to me what the desired resulting value of the variable string2 is, or what the actual rule is.
Here you go:
want=prxchange('s/(\|[^|]*\|)/ /',-1,string);
- Cheers -
Since the REGEX you tried is clearly not valid for SAS:
520  data input_data;
521  SET WORK.input_data;
522  string2 = PRXCHANGE('[^| ]*(?=(?:[^|]*\|[^|]*\|)*[^|]*$)',-1,string);
523  ;
524  run;

ERROR: Invalid characters "*(?=(?:[^|]*\|[^|]*\|)*[^|]*$)" after end delimiter "]" of regular expression "[^| ]*(?=(?:[^|]*\|[^|]*\|)*[^|]*$)".
ERROR: The regular expression passed to the function PRXCHANGE contains a syntax error.
NOTE: Argument 1 to function PRXCHANGE('[^| ]*(?=(?:'[12 of 35 characters shown],-1,'match all of'[12 of 100 characters shown]) at line 522 column 11 is invalid.
string=match all of this|not this|match|no|yes|n|y desiredresult=match all of this match yes y string2= _ERROR_=1 _N_=1
NOTE: There were 1 observations read from the data set WORK.INPUT_DATA.
NOTE: The data set WORK.INPUT_DATA has 1 observations and 3 variables.
NOTE: DATA statement used (Total process time):
      real time           0.01 seconds
      cpu time            0.01 seconds
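The mention of the end delimiter "]" in the log suggests that, because the pattern was not wrapped in the s/pattern/replacement/ form, PRXCHANGE took the leading [ as the opening delimiter and everything after the matching ] as invalid. A minimal sketch of the delimited form follows, using the sample value from the question. Replacing each pipe-enclosed piece with a single space is an assumption about the intended rule, not something the original post states.

```sas
data _null_;
  length string string2 $100;
  string = 'match all of this|not this|match|no|yes|n|y';
  /* s/pattern/replacement/ : the slashes are the delimiters PRXCHANGE needs. */
  /* Pattern: a pipe, any run of non-pipe characters, then a closing pipe.    */
  /* Replacement: one space (assumed rule). -1 means replace every match.     */
  string2 = prxchange('s/\|[^|]*\|/ /', -1, string);
  put string2=;
run;
```

The PUT statement writes string2 to the log so the value can be compared by eye with the desired result.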
Perhaps you should EXPLAIN what the rule is for transforming
match all of this|not this|match|no|yes|n|y
into
match all of this match yes y
You seem to have replaced | with spaces.
But why did you remove "not this"?
data input_data;
Length string $100. desiredresult $100.;
Infile datalines delimiter='#';
input string $ desiredResult $;
datalines;
match all of this|not this|match|no|yes|n|y#match all of this match yes y
;
run;
data input_data;
SET WORK.input_data;
string2 = PRXCHANGE('s/\|.*?\|/ /',-1,string);
;
run;
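One way to confirm that the substitution reproduces the expected value is to compare it with desiredresult in a follow-up step. This is only a sketch, assuming the input_data data set created above; the data set name check and the variables check_value and matches are hypothetical names introduced for illustration.

```sas
data check;
  set work.input_data;
  length check_value $100;
  /* Same lazy pattern as above: a pipe, as few characters as possible, a pipe. */
  check_value = prxchange('s/\|.*?\|/ /', -1, string);
  /* Character comparisons in SAS ignore trailing blanks. */
  /* matches is 1 when the two values agree, 0 otherwise. */
  matches = (check_value = desiredresult);
  put check_value= matches=;
run;
```

If matches is 0 for any row, the rule behind desiredresult is probably not simply "replace each pipe-enclosed piece with a space".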