BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
NW_SAS
Fluorite | Level 6

I need a solution to remove all characters to the left of a specific string. See the example below:

 

I have the following transaction codes and need a way to isolate the portion of the string beginning with "4122":

 

Input Transaction Codes:

601412265320

714122892335

507741225009964

 

Desired Output Codes:

412265320

4122892335

41225009964

 

The solution can use the following assumptions:

  • I am using Version 7.12 of SAS Enterprise Guide
  • The Transaction Codes are character variables
  • Every Transaction Code will include the string "4122"
  • The string "4122" will only appear once in each Transaction Code
  • The number of characters on either side of the "4122" string can vary. 

The SCAN Function doesn't seem to work when I tried a multi-character delimiter (It treated 4, 1, & 2 as separate delimiters). If I can use "4122" as delimiter and pull the portion of the Transaction Code to the right of "4122" I can just concatenate "4122" back on the string later on. However, if there is a solution that directly generates the desired output code including "4122" that would be preferred.

 

Thanks,

NW

1 ACCEPTED SOLUTION

Accepted Solutions
PGStats
Opal | Level 21

Regular expressions provide an easy method to do this:

 

data have;
length code $40;
input code;
datalines;
601412265320
714122892335
507741225009964
;

data want;
length newCode $40;
set have;
newCode = prxchange("s/.*?(4122.*)/$1/", 1, code);
run;

proc print; run;

 

 

PG

View solution in original post

8 REPLIES 8
Reeza
Super User

Use INDEX() to find the string

Use the result from INDEX along with the SUBSTR() function to selection the desired portion.

 


@NW_SAS wrote:

I need a solution to remove all characters to the left of a specific string. See the example below:

 

I have the following transaction codes and need a way to isolate the portion of the string beginning with "4122":

 

Input Transaction Codes:

601412265320

714122892335

507741225009964

 

Desired Output Codes:

412265320

4122892335

41225009964

 

The solution can use the following assumptions:

  • I am using Version 7.12 of SAS Enterprise Guide
  • The Transaction Codes are character variables
  • Every Transaction Code will include the string "4122"
  • The string "4122" will only appear once in each Transaction Code
  • The number of characters on either side of the "4122" string can vary. 

The SCAN Function doesn't seem to work when I tried a multi-character delimiter (It treated 4, 1, & 2 as separate delimiters). If I can use "4122" as delimiter and pull the portion of the Transaction Code to the right of "4122" I can just concatenate "4122" back on the string later on. However, if there is a solution that directly generates the desired output code including "4122" that would be preferred.

 

Thanks,

NW


 

PGStats
Opal | Level 21

Regular expressions provide an easy method to do this:

 

data have;
length code $40;
input code;
datalines;
601412265320
714122892335
507741225009964
;

data want;
length newCode $40;
set have;
newCode = prxchange("s/.*?(4122.*)/$1/", 1, code);
run;

proc print; run;

 

 

PG
yashk
Fluorite | Level 6

Another logic;

 

 

data have;
length code $40;
input code;
datalines;
601412265320
714122892335
507741225009964
;

data new (drop=i);
set have;
Array a{15} _temporary_;
do i=1 to 12;
a{i}=substr(code,i,4);
if a{i}=4122 then do;
new_var=substr(code,i);
End;
End;
run;

proc print data=new;
run;

Reeza
Super User

@yashk check the log, you're mixing types up here - ie SUBSTR will return a character value, not a numeric and your array is set to be numeric as well. 

 

 

yashk
Fluorite | Level 6

Yes. you are correct, thank you for the check.

 

Capture.PNG

NW_SAS
Fluorite | Level 6

This worked perfectly. I don't have much experience with regular expressions though. Will that same PERL regular expression you use in the prxchange function work if the "4122" identifier were a different length? Or would the syntax need to be changed?

PGStats
Opal | Level 21

You can replace 4122 by any other sequence, as long as it doesn't involve special characters. Special characters would have to be escaped, eg "41.22" would appear in the pattern as "41\.22".

PG
NW_SAS
Fluorite | Level 6

Great Thanks!

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 8 replies
  • 7940 views
  • 7 likes
  • 4 in conversation