BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
NW_SAS
Fluorite | Level 6

I need a solution to remove all characters to the left of a specific string. See the example below:

 

I have the following transaction codes and need a way to isolate the portion of the string beginning with "4122":

 

Input Transaction Codes:

601412265320

714122892335

507741225009964

 

Desired Output Codes:

412265320

4122892335

41225009964

 

The solution can use the following assumptions:

  • I am using Version 7.12 of SAS Enterprise Guide
  • The Transaction Codes are character variables
  • Every Transaction Code will include the string "4122"
  • The string "4122" will only appear once in each Transaction Code
  • The number of characters on either side of the "4122" string can vary. 

The SCAN Function doesn't seem to work when I tried a multi-character delimiter (It treated 4, 1, & 2 as separate delimiters). If I can use "4122" as delimiter and pull the portion of the Transaction Code to the right of "4122" I can just concatenate "4122" back on the string later on. However, if there is a solution that directly generates the desired output code including "4122" that would be preferred.

 

Thanks,

NW

1 ACCEPTED SOLUTION

Accepted Solutions
PGStats
Opal | Level 21

Regular expressions provide an easy method to do this:

 

data have;
length code $40;
input code;
datalines;
601412265320
714122892335
507741225009964
;

data want;
length newCode $40;
set have;
newCode = prxchange("s/.*?(4122.*)/$1/", 1, code);
run;

proc print; run;

 

 

PG

View solution in original post

8 REPLIES 8
Reeza
Super User

Use INDEX() to find the string

Use the result from INDEX along with the SUBSTR() function to selection the desired portion.

 


@NW_SAS wrote:

I need a solution to remove all characters to the left of a specific string. See the example below:

 

I have the following transaction codes and need a way to isolate the portion of the string beginning with "4122":

 

Input Transaction Codes:

601412265320

714122892335

507741225009964

 

Desired Output Codes:

412265320

4122892335

41225009964

 

The solution can use the following assumptions:

  • I am using Version 7.12 of SAS Enterprise Guide
  • The Transaction Codes are character variables
  • Every Transaction Code will include the string "4122"
  • The string "4122" will only appear once in each Transaction Code
  • The number of characters on either side of the "4122" string can vary. 

The SCAN Function doesn't seem to work when I tried a multi-character delimiter (It treated 4, 1, & 2 as separate delimiters). If I can use "4122" as delimiter and pull the portion of the Transaction Code to the right of "4122" I can just concatenate "4122" back on the string later on. However, if there is a solution that directly generates the desired output code including "4122" that would be preferred.

 

Thanks,

NW


 

PGStats
Opal | Level 21

Regular expressions provide an easy method to do this:

 

data have;
length code $40;
input code;
datalines;
601412265320
714122892335
507741225009964
;

data want;
length newCode $40;
set have;
newCode = prxchange("s/.*?(4122.*)/$1/", 1, code);
run;

proc print; run;

 

 

PG
yashk
Fluorite | Level 6

Another logic;

 

 

data have;
length code $40;
input code;
datalines;
601412265320
714122892335
507741225009964
;

data new (drop=i);
set have;
Array a{15} _temporary_;
do i=1 to 12;
a{i}=substr(code,i,4);
if a{i}=4122 then do;
new_var=substr(code,i);
End;
End;
run;

proc print data=new;
run;

Reeza
Super User

@yashk check the log, you're mixing types up here - ie SUBSTR will return a character value, not a numeric and your array is set to be numeric as well. 

 

 

yashk
Fluorite | Level 6

Yes. you are correct, thank you for the check.

 

Capture.PNG

NW_SAS
Fluorite | Level 6

This worked perfectly. I don't have much experience with regular expressions though. Will that same PERL regular expression you use in the prxchange function work if the "4122" identifier were a different length? Or would the syntax need to be changed?

PGStats
Opal | Level 21

You can replace 4122 by any other sequence, as long as it doesn't involve special characters. Special characters would have to be escaped, eg "41.22" would appear in the pattern as "41\.22".

PG
NW_SAS
Fluorite | Level 6

Great Thanks!

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 8 replies
  • 8318 views
  • 7 likes
  • 4 in conversation