I need a solution to remove all characters to the left of a specific string. See the example below:
I have the following transaction codes and need a way to isolate the portion of the string beginning with "4122":
Input Transaction Codes:
601412265320
714122892335
507741225009964
Desired Output Codes:
412265320
4122892335
41225009964
The solution can use the following assumptions:
The SCAN Function doesn't seem to work when I tried a multi-character delimiter (It treated 4, 1, & 2 as separate delimiters). If I can use "4122" as delimiter and pull the portion of the Transaction Code to the right of "4122" I can just concatenate "4122" back on the string later on. However, if there is a solution that directly generates the desired output code including "4122" that would be preferred.
Thanks,
NW
Regular expressions provide an easy method to do this:
data have;
length code $40;
input code;
datalines;
601412265320
714122892335
507741225009964
;
data want;
length newCode $40;
set have;
newCode = prxchange("s/.*?(4122.*)/$1/", 1, code);
run;
proc print; run;
Use INDEX() to find the string
Use the result from INDEX along with the SUBSTR() function to selection the desired portion.
@NW_SAS wrote:
I need a solution to remove all characters to the left of a specific string. See the example below:
I have the following transaction codes and need a way to isolate the portion of the string beginning with "4122":
Input Transaction Codes:
601412265320
714122892335
507741225009964
Desired Output Codes:
412265320
4122892335
41225009964
The solution can use the following assumptions:
- I am using Version 7.12 of SAS Enterprise Guide
- The Transaction Codes are character variables
- Every Transaction Code will include the string "4122"
- The string "4122" will only appear once in each Transaction Code
- The number of characters on either side of the "4122" string can vary.
The SCAN Function doesn't seem to work when I tried a multi-character delimiter (It treated 4, 1, & 2 as separate delimiters). If I can use "4122" as delimiter and pull the portion of the Transaction Code to the right of "4122" I can just concatenate "4122" back on the string later on. However, if there is a solution that directly generates the desired output code including "4122" that would be preferred.
Thanks,
NW
Regular expressions provide an easy method to do this:
data have;
length code $40;
input code;
datalines;
601412265320
714122892335
507741225009964
;
data want;
length newCode $40;
set have;
newCode = prxchange("s/.*?(4122.*)/$1/", 1, code);
run;
proc print; run;
Another logic;
data have;
length code $40;
input code;
datalines;
601412265320
714122892335
507741225009964
;
data new (drop=i);
set have;
Array a{15} _temporary_;
do i=1 to 12;
a{i}=substr(code,i,4);
if a{i}=4122 then do;
new_var=substr(code,i);
End;
End;
run;
proc print data=new;
run;
@yashk check the log, you're mixing types up here - ie SUBSTR will return a character value, not a numeric and your array is set to be numeric as well.
Yes. you are correct, thank you for the check.
This worked perfectly. I don't have much experience with regular expressions though. Will that same PERL regular expression you use in the prxchange function work if the "4122" identifier were a different length? Or would the syntax need to be changed?
You can replace 4122 by any other sequence, as long as it doesn't involve special characters. Special characters would have to be escaped, eg "41.22" would appear in the pattern as "41\.22".
Great Thanks!
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.