Hi Experts,
I have got a request to transform alphanumeric field in a way that we are summing up the digits and adding to the numbers as below. Any characters on the left of the number is advanced by 2 position as per English alphabet (A-Z-A) , characters on the right of the number is lowered to one position, then numbers sandwiched in characters are added up in original number, as per below example.
Length of the value field could be max of 20 .
Value | Transformed value |
AB345C | DE357B |
P75EF | S87E |
Z7869Q | C7899P |
1169893 | 1169893 |
Your help is really appreciated
.
See this:
data have;
input instring :$20. want :$20.;
datalines;
AB345C DE357B
P75EF S87E
Z7869Q C7899P
1169893 1169893
;
data want;
set have;
ff = anydigit(instring);
if ff in (1,0)
then outstring = instring;
else do;
do i = 1 to ff - 1;
outstring = cats(outstring,byte(mod(rank(char(instring,i))-64+3,26)+64));
end;
fl = anyalpha(instring,ff+1);
num = input(substr(instring,ff,fl-ff),best.);
do i = ff to fl - 1;
num = num + input(char(instring,i),1.);
end;
outstring=cats(outstring,put(num,best.));
do i = fl to length(instring);
outstring = cats(outstring,byte(mod(rank(char(instring,i))-64-1,26)+64));
end;
end;
drop i ff fl num;
run;
What is the formula for getting
from 345 to 357
from 75 to 87
from 7869 to 7899
?
I guess: If old number is surrounded by letters, then new number = old number + sum of digits of old number.
357=345+(3+5+7)
Next code may result differently according to local "ascii" sequence.
Try next code:
data have;
infile datalines;
input string $20.;
datalines;
AB345C
P75EF
Z7869Q
1169893
; run;
data want;
set have;
length cx $1 new_str $20;
phase=0;
L = length(string); put L=;
number = input(compress(string,' ','kd'),best20.);
new_num = number;
new_str = repeat(' ',19);
do i=1 to L;
cx = substr(string,i,1);
if cx > '9' then do; /* assuming all digits precede alphbetic characters */
if phase=0 then do;
if cx = 'Z' and pahse=1 then substr(new_str,i,1) = 'B'; else
substr(new_str,i,1) = byte(rank(cx)+2);
end; else
if phase=2 then do;
if cx = 'A' then substr(new_str,i,1) = 'Z'; else
substr(new_str,i,1) = byte(rank(cx)-1);
end;
end;
else if ('0' le cx le '9') then do;
phase=1;
do j=i to L;
cx = substr(string,j,1);
if cx > '9' then leave;
new_num = new_num + input(cx,1.);
end;
new_str = cats(new_str,left(new_num)); put new_str=;
phase=2;
i = j;
if cx = 'A' then substr(new_str,i,1) = 'Z'; else
substr(new_str,i,1) = byte(rank(cx)-1);
leave;
end;
end;
new_str = strip(new_str);
DROP i j ;
run;
At row 2, assuming letter sequience is '...P Q R ...' then new string should start with R and not S as you displayed.
See this:
data have;
input instring :$20. want :$20.;
datalines;
AB345C DE357B
P75EF S87E
Z7869Q C7899P
1169893 1169893
;
data want;
set have;
ff = anydigit(instring);
if ff in (1,0)
then outstring = instring;
else do;
do i = 1 to ff - 1;
outstring = cats(outstring,byte(mod(rank(char(instring,i))-64+3,26)+64));
end;
fl = anyalpha(instring,ff+1);
num = input(substr(instring,ff,fl-ff),best.);
do i = ff to fl - 1;
num = num + input(char(instring,i),1.);
end;
outstring=cats(outstring,put(num,best.));
do i = fl to length(instring);
outstring = cats(outstring,byte(mod(rank(char(instring,i))-64-1,26)+64));
end;
end;
drop i ff fl num;
run;
Hello Kurt,
Many thanks for your response, apologizes, I forgot to add the digits on the fourth row. your solution is working perfectly fine except for the last as it was my mistake. I forgot to add digits for the last case. please see below.
Value | Transformed value |
AB345C | DE357B |
P75EF | S87DE |
Z7869Q | C7899P |
1169893 | 1169930 |
It would be great if you can help with this.
After editing my code it results as wanted:
data have;
infile datalines;
input string $20.;
datalines;
AB345C
P75EF
Z7869Q
1169893
; run;
data want;
set have;
length cx $1 new_str $20;
phase=0;
L = length(string); put L=;
number = input(compress(string,' ','kd'),best20.);
new_num = number;
new_str = repeat(' ',19);
do i=1 to L;
cx = substr(string,i,1);
if cx > '9' then do; /* assuming all digits precede alphbetic characters */
if phase=0 then do;
if cx = 'Z' then substr(new_str,i,1) = 'C'; else
substr(new_str,i,1) = byte(rank(cx)+3);
end; else
if phase=2 then do;
if cx = 'A' then substr(new_str,i,1) = 'Z'; else
substr(new_str,i,1) = byte(rank(cx)-1);
end;
end;
else if ('0' le cx le '9') then do;
phase=1;
do j=i to L;
cx = substr(string,j,1);
if cx > '9' then leave;
new_num = new_num + input(cx,1.);
end;
new_str = cats(new_str,left(new_num));
phase=2;
i = j;
if cx = 'A' then substr(new_str,i,1) = 'Z'; else
if cx > '9' then substr(new_str,i,1) = byte(rank(cx)-1);
*leave;
end;
end;
new_str = strip(new_str);
keep string new_str;
run;
Just for completeness: revised code that deals with a pure numeric value:
data have;
input instring :$20. want :$20.;
datalines;
AB345C DE357B
P75EF S87E
Z7869Q C7899P
1169893 1169893
;
data want;
set have;
length outstring $20;
ff = anydigit(instring);
do i = 1 to ff - 1;
outstring = cats(outstring,byte(mod(rank(char(instring,i))-64+3,26)+64));
end;
fl = anyalpha(instring,ff+1);
if fl = 0 then fl = length(instring) + 1;
num = input(substr(instring,max(ff,1),fl-ff),best.);
do i = max(ff,1) to fl - 1;
num = num + input(char(instring,i),1.);
end;
outstring=cats(outstring,put(num,best.));
do i = fl to length(instring);
outstring = cats(outstring,byte(mod(rank(char(instring,i))-64-1,26)+64));
end;
drop i ff fl num;
run;
Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9.
Early bird rate extended! Save $200 when you sign up by March 31.
SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.
Find more tutorials on the SAS Users YouTube channel.