DATA Step, Macro, Functions and more

Help in substring function

Accepted Solution Solved
Reply
Contributor
Posts: 70
Accepted Solution

Help in substring function

[ Edited ]

data want;
set have (firstobs=1 obs=10);
id1=SUBSTRN(id,13,10);
RUN;

 

given data:

 

ABC00000012365948952321
ABC00000000000948952321
ABC00000000025369214558

 

Output want: 

12365948952321

 948952321

25369214558

 

 

I have set of ID's that consist of 23 digits.
First 3 letters starts with alphabets and the rest are numbers.
I want to pull only the last numbers that starts after zero.
Since it's not constant position not sure how to use the function?


Accepted Solutions
Solution
‎08-02-2017 03:33 PM
PROC Star
Posts: 325

Re: Help in substring function

[ Edited ]
Posted in reply to Kalai2008

Something like this. What we are doing is making three groups. Anything mentioned in parentheis is group.1 st group is alphabets 2nd group is anything with 0 i.e. in midddle third group is from 0-9. We are replacing everything with last group that is third group mentioned as $3

data abc;
input var $50.;
datalines;
ABC00000012365948952321
ABC00000000000948952321
ABC00000000025369214558
ABC000000000253692145580
;


data bde;
set abc;
var1 =prxchange('s/([A-Z]+)(0+)([0-9]+)/$3/i', -1, var);
run;

View solution in original post


All Replies
Super User
Posts: 19,770

Re: Help in substring function

Posted in reply to Kalai2008

Use COMPRESS to remove any characters that are letters.

Use INPUT to read the remaining text as a number.

 

 

PROC Star
Posts: 283

Re: Help in substring function

[ Edited ]
Posted in reply to Kalai2008

data have;
input var $25.;
datalines;
ABC00000012365948952321
ABC00000000000948952321
ABC00000000025369214558
;

 

data want;
set have;
new_var=compress(var,'abcdefghijklmnopqrstuvwxyz0');
run;

 

Regards,

Naveen Srinivasan

Super User
Posts: 19,770

Re: Help in substring function

Posted in reply to novinosrin

@novinosrin It's not shown in the OP's example, but what if there was a 0 in the middle of the number? I think it's highly likely that's a possibility. 

 

ABC00000012365948052321
ABC00000000000948952321
ABC00000000025369214008

Contributor
Posts: 70

Re: Help in substring function

Thank you, I tried the compress function, but as you said its eliminating the 0 .

 

example:

 

ABC000000000025225250

ABC000000000025205250

 

 its eliminating the last zero and middle zero  as well.

Super User
Posts: 19,770

Re: Help in substring function

Posted in reply to Kalai2008

Kalai2008 wrote:

Thank you, I tried the compress function, but as you said its eliminating the 0 .

 

example:

 

ABC000000000025225250

ABC000000000025205250

 

 its eliminating the last zero and middle zero  as well.


POST YOUR CODE. You can specify digits or characters using the 3rd or 4th parameter of the COMPRESS function, you don't have to list them at all. 

Contributor
Posts: 70

Re: Help in substring function

data chk;
set test1.CHK2(firstobs=1 obs=10);
acct=compress(acct_id,'ABC0');
RUN;

 

 

DATA:

ABC000000000014256320

ABC000000000140252301

 

Output Results :

1425632

1425231

 

Solution
‎08-02-2017 03:33 PM
PROC Star
Posts: 325

Re: Help in substring function

[ Edited ]
Posted in reply to Kalai2008

Something like this. What we are doing is making three groups. Anything mentioned in parentheis is group.1 st group is alphabets 2nd group is anything with 0 i.e. in midddle third group is from 0-9. We are replacing everything with last group that is third group mentioned as $3

data abc;
input var $50.;
datalines;
ABC00000012365948952321
ABC00000000000948952321
ABC00000000025369214558
ABC000000000253692145580
;


data bde;
set abc;
var1 =prxchange('s/([A-Z]+)(0+)([0-9]+)/$3/i', -1, var);
run;

Contributor
Posts: 70

Re: Help in substring function

[ Edited ]

Wow..It worked..Thanks a lot! Saved my time! Good & clear  Explanation.

Super User
Posts: 19,770

Re: Help in substring function

Posted in reply to Kalai2008

PRX is a great solution, but which one of these can you understand and modify if necessary?

 

data want;
set abc;

*Keep only digits in the string;
x=compress(var, , 'kd');

*convert to numeric;
y=input(x, best32.);

*use substring to extract all text after 3/4th character and convert to number;
z=input(substr(var, 4), best32.);

*PRX formula;
w=prxchange('s/([A-Z]+)(0+)([0-9]+)/$3/i', -1, var);

format  y z  best32.;
format x w $32.;

run;
Contributor
Posts: 70

Re: Help in substring function

Thank you Reeza. Let me try everything
Contributor
Posts: 70

Re: Help in substring function

Somehow the code didn't picked this id,
It excluded the last zero.
ABCD0000000011423744160
ABCD0000000011129735160..

Anyways Thank you Reeza, will try to modify as per your solution.
Super User
Posts: 19,770

Re: Help in substring function

Posted in reply to Kalai2008

Kalai2008 wrote:
Somehow the code didn't picked this id,
It excluded the last zero.
ABCD0000000011423744160
ABCD0000000011129735160..

Anyways Thank you Reeza, will try to modify as per your solution.

Note that I don't understand PRX, so the function used above is as the original poster posted.  It's a perfectly valid solution, just not easy to modify especially as a beginner. 

Contributor
Posts: 70

Re: Help in substring function

Posted in reply to Kalai2008

I am so sorry, the code worked . I did a mistake in the input function ( char to num). Thank you!

Super User
Posts: 10,020

Re: Help in substring function

Posted in reply to Kalai2008
data abc;
input var $50.;
datalines;
ABC00000012365948952321
ABC00000000000948952321
ABC00000000025369214558
ABC000000000253692145580
;


data bde;
set abc;
var1 =prxchange('s/^[A-Z0]+//i', -1, var);
run;
☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 15 replies
  • 338 views
  • 10 likes
  • 6 in conversation