BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Kalai2008
Pyrite | Level 9

data want;
set have (firstobs=1 obs=10);
id1=SUBSTRN(id,13,10);
RUN;

 

given data:

 

ABC00000012365948952321
ABC00000000000948952321
ABC00000000025369214558

 

Output want: 

12365948952321

 948952321

25369214558

 

 

I have set of ID's that consist of 23 digits.
First 3 letters starts with alphabets and the rest are numbers.
I want to pull only the last numbers that starts after zero.
Since it's not constant position not sure how to use the function?

1 ACCEPTED SOLUTION

Accepted Solutions
kiranv_
Rhodochrosite | Level 12

Something like this. What we are doing is making three groups. Anything mentioned in parentheis is group.1 st group is alphabets 2nd group is anything with 0 i.e. in midddle third group is from 0-9. We are replacing everything with last group that is third group mentioned as $3

data abc;
input var $50.;
datalines;
ABC00000012365948952321
ABC00000000000948952321
ABC00000000025369214558
ABC000000000253692145580
;


data bde;
set abc;
var1 =prxchange('s/([A-Z]+)(0+)([0-9]+)/$3/i', -1, var);
run;

View solution in original post

15 REPLIES 15
Reeza
Super User

Use COMPRESS to remove any characters that are letters.

Use INPUT to read the remaining text as a number.

 

 

novinosrin
Tourmaline | Level 20

data have;
input var $25.;
datalines;
ABC00000012365948952321
ABC00000000000948952321
ABC00000000025369214558
;

 

data want;
set have;
new_var=compress(var,'abcdefghijklmnopqrstuvwxyz0');
run;

 

Regards,

Naveen Srinivasan

Reeza
Super User

@novinosrin It's not shown in the OP's example, but what if there was a 0 in the middle of the number? I think it's highly likely that's a possibility. 

 

ABC00000012365948052321
ABC00000000000948952321
ABC00000000025369214008

Kalai2008
Pyrite | Level 9

Thank you, I tried the compress function, but as you said its eliminating the 0 .

 

example:

 

ABC000000000025225250

ABC000000000025205250

 

 its eliminating the last zero and middle zero  as well.

Reeza
Super User

@Kalai2008 wrote:

Thank you, I tried the compress function, but as you said its eliminating the 0 .

 

example:

 

ABC000000000025225250

ABC000000000025205250

 

 its eliminating the last zero and middle zero  as well.


POST YOUR CODE. You can specify digits or characters using the 3rd or 4th parameter of the COMPRESS function, you don't have to list them at all. 

Kalai2008
Pyrite | Level 9

data chk;
set test1.CHK2(firstobs=1 obs=10);
acct=compress(acct_id,'ABC0');
RUN;

 

 

DATA:

ABC000000000014256320

ABC000000000140252301

 

Output Results :

1425632

1425231

 

kiranv_
Rhodochrosite | Level 12

Something like this. What we are doing is making three groups. Anything mentioned in parentheis is group.1 st group is alphabets 2nd group is anything with 0 i.e. in midddle third group is from 0-9. We are replacing everything with last group that is third group mentioned as $3

data abc;
input var $50.;
datalines;
ABC00000012365948952321
ABC00000000000948952321
ABC00000000025369214558
ABC000000000253692145580
;


data bde;
set abc;
var1 =prxchange('s/([A-Z]+)(0+)([0-9]+)/$3/i', -1, var);
run;

Kalai2008
Pyrite | Level 9

Wow..It worked..Thanks a lot! Saved my time! Good & clear  Explanation.

Reeza
Super User

PRX is a great solution, but which one of these can you understand and modify if necessary?

 

data want;
set abc;

*Keep only digits in the string;
x=compress(var, , 'kd');

*convert to numeric;
y=input(x, best32.);

*use substring to extract all text after 3/4th character and convert to number;
z=input(substr(var, 4), best32.);

*PRX formula;
w=prxchange('s/([A-Z]+)(0+)([0-9]+)/$3/i', -1, var);

format  y z  best32.;
format x w $32.;

run;
Kalai2008
Pyrite | Level 9
Thank you Reeza. Let me try everything
Kalai2008
Pyrite | Level 9
Somehow the code didn't picked this id,
It excluded the last zero.
ABCD0000000011423744160
ABCD0000000011129735160..

Anyways Thank you Reeza, will try to modify as per your solution.
Reeza
Super User

@Kalai2008 wrote:
Somehow the code didn't picked this id,
It excluded the last zero.
ABCD0000000011423744160
ABCD0000000011129735160..

Anyways Thank you Reeza, will try to modify as per your solution.

Note that I don't understand PRX, so the function used above is as the original poster posted.  It's a perfectly valid solution, just not easy to modify especially as a beginner. 

Kalai2008
Pyrite | Level 9

I am so sorry, the code worked . I did a mistake in the input function ( char to num). Thank you!

Ksharp
Super User
data abc;
input var $50.;
datalines;
ABC00000012365948952321
ABC00000000000948952321
ABC00000000025369214558
ABC000000000253692145580
;


data bde;
set abc;
var1 =prxchange('s/^[A-Z0]+//i', -1, var);
run;

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 15 replies
  • 2429 views
  • 10 likes
  • 6 in conversation