32 numeric Strings manipulation

Occasional Contributor
Posts: 8
Hi SAS experts.

 

So I have a numeric string of length 41 called ID that I want to encrypt by adding the number 263, keeping the result as a string.

 

eg.

1037699710538224500199468064894 should become

1037699710538224500199468065157

 

This has been proving quite tricky to me as the strings are very long. I have profiled the length of this ID field and it varies from 17 to 31.

 

What I have tried in the SAS code below is:

1. Split the ID into two parts:

part1 = substring from position 1 to 13

part2 = substring from position 14 to the last position

2. Cast part2 into a number, add 263 (my seed), then cast the result back to a string.

3. Put part1 and the adjusted part2 together, adding leading zeros if necessary.

 

proc sql;
create table output as select

/* original ID */
ID,

/*1 - test whether casting to numeric or adding the 263 seed changes the length of the second part of the ID (the substring from position 14 to the last character)*/
length(strip(substr(strip(ID),14,length(ID)-14+1))) - length(strip(put(input(strip(substr(strip(ID),14,length(ID)-14+1)),20.)+263,20.))) as diff3,

/*2 - if the length of the second part reduced, append leading zero*/
case
when calculated diff3 > 0 then strip(repeat("0",calculated diff3-1))||strip(put(input(strip(substr(strip(ID),14,length(ID)-14+1)),20.)+263,20.))
else strip(put(input(strip(substr(strip(ID),14,length(ID)-14+1)),20.)+263,20.))
end as second_part3,


/*3 - Now append the two parts together so that we have the full encrypted string*/
strip(substr(ID,1,13))||strip(calculated second_part3) as E_ID
/******************************************************************************************************************************/

from input
;
quit;

 

I am a bit lost because this is not working properly.

The first observation with

    ID=1037699710538224500199468064894 becomes

E_ID=1037699710538224500199468065152

 

What am I doing wrong?

Can anyone think of a better approach to this exercise?

Many thanks

 


Accepted Solutions
Solution
‎10-19-2015 10:02 PM
Occasional Contributor
Posts: 8

Re: 32 numeric Strings manipulation

Hi SAS users. 

Thank you so much for discussing this matter with me.

After struggling for a while, I might have found a solution that works OK.

I used a DATA step, which I don't really like as it can be messy, but it seemed to suit the problem.

 

I am presenting this for those who are interested. Let me know if you find any bugs.

 

Basically, my code executes the following:

1. Takes each digit of the character string and converts it to a number (where digit1 is the 1st digit from the right).

2. Adds the required seed digits 3, 6 and 2 to the 1st, 2nd and 3rd digits, respectively.

3. Starting from the 1st digit, checks whether each digit is now over 9; if so, subtracts 10 and adds 1 to the "carryover" value for the next digit.

4. Keeps doing this until the end of the number; then, in the last DO loop, concatenates the seeded digits.

5. Drops any intermediate values used in this process.

6. I did not find any values with error=1 (which would indicate that the seeded number needs one more digit than the original), as luckily none of the original IDs was large enough. If you do have sufficiently large numbers, such as 999999999...999, the flag will be set, and this is one limitation of my code.

 

 

data output;
format e_ID $32.;
set input;

/* digit{1} is the rightmost digit of ID; the arrays allow for up to 32 digits */
ARRAY digit {32} digit1-digit32;
ARRAY result {32} result1-result32;
ARRAY carryover {32} carryover1-carryover32;

/* Initialise the elements of the arrays */
DO i=1 TO length(ID);
  digit{i}=input(substr(strip(ID),length(ID)-i+1,1),1.);
  result{i}=0;
  carryover{i}=0;
END;

/* Add the seed 263 digit by digit, from right to left, propagating the carry */
DO i=1 TO length(ID);
  if i=1 then do;                 /* units digit: add 3 */
    result{i}=sum(digit{i},3);
    if result{i}>9 then do;
      result{i}=result{i}-10;
      carryover{i+1}=1;
    end;
  end;
  else if i=2 then do;            /* tens digit: add 6 */
    result{i}=sum(digit{i},6,carryover{i});
    if result{i}>9 then do;
      result{i}=result{i}-10;
      carryover{i+1}=1;
    end;
  end;
  else if i=3 then do;            /* hundreds digit: add 2 */
    result{i}=sum(digit{i},2,carryover{i});
    if result{i}>9 then do;
      result{i}=result{i}-10;
      carryover{i+1}=1;
    end;
  end;
  else if 4<=i and i<=length(ID)-1 then do;   /* remaining digits: carry only */
    result{i}=sum(digit{i},carryover{i});
    if result{i}>9 then do;
      result{i}=result{i}-10;
      carryover{i+1}=1;
    end;
  end;
  if i=length(ID) then do;        /* leftmost digit: flag an overflow */
    result{i}=sum(digit{i},carryover{i});
    if result{i}>9 then do;
      error=1;
    end;
  end;
END;

/* Rebuild the string, prepending each digit so the leftmost ends up first */
DO i=1 TO length(ID);
  e_ID=strip(put(result{i},1.))||strip(e_ID);
END;

/* Remove error from this list if you want to keep the overflow flag in the output */
drop digit1--digit32 result1--result32 carryover1--carryover32 error i;
run;
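
For readers who want a shorter variant, the same digit-by-digit idea can be sketched with a single loop and the seed held as a string. This is only a sketch under assumptions: the data set name output_compact and the variables seed, n, pos, add, s and carry are made up for illustration, and it assumes ID is left-aligned, contains only digits and is no longer than 32 characters.

```sas
data output_compact;                 /* hypothetical output data set name */
  set input;
  length e_ID $32 seed $3;
  seed  = "263";                     /* seed from the thread, held as a string */
  n     = length(ID);
  e_ID  = ID;                        /* overwrite digits in place below */
  carry = 0;
  do i = 1 to n;
    pos = n - i + 1;                 /* walk from the rightmost digit */
    /* take the matching seed digit while one is left, otherwise add 0 */
    if i <= length(seed) then
      add = input(substr(seed, length(seed) - i + 1, 1), 1.);
    else add = 0;
    s = input(substr(ID, pos, 1), 1.) + add + carry;
    substr(e_ID, pos, 1) = put(mod(s, 10), 1.);
    carry = (s > 9);
  end;
  error = carry;                     /* 1 if the sum needs one more digit */
  drop seed n i pos add s carry;
run;
```

Here error is kept in the output so that overflowing IDs can be inspected afterwards.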

 

 


All Replies
Respected Advisor
Posts: 4,649

Re: 32 numeric Strings manipulation

Could be done with fcmp:

 

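proc fcmp outlib=work.funcs.myLib;

/* Add two non-negative integers held as digit strings (up to 64 digits each) */
function longSum(n $,m $) $64;
length rn rm rr $64;
/* Reverse so that position 1 is the units digit, then pad the high end with zeros. */
/* STRIP guards against trailing blanks ending up as low-order zeros.              */
rn = reverse(strip(n));
rn = translate(rn,"0"," ");
rm = reverse(strip(m));
rm = translate(rm,"0"," ");

r = 0; /* carry */
do i = 1 to 64;
    s = input(char(rn,i),1.) + input(char(rm,i),1.) + r;
    substr(rr,i,1) = put(mod(s,10), 1.0);
    r = s > 9;
    end;
rr = reverse(rr);
/* Drop leading zeros: FINDC with the K modifier finds the first character that is not "0" */
rr = substr(rr, findc(rr,"0","K"));
return(rr);
endsub;
run;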

options cmplib=work.funcs; 

data test;
n = "1037699710538224500199468064894";
m = "263";
run;

proc sql;
select n, m, longSum(n,m) as r
from test;
quit;
PG
Respected Advisor
Posts: 3,124

Re: 32 numeric Strings manipulation

As you can see from @PGStats's answer, there is no easy way, mainly because of the carrying issue. And when you try to decrypt, the same process has to be done again, in reverse. I have a comment, but without knowing your whole picture, my suggestion may not fit your bill. As you may have figured out already, handling numbers with that many digits is far beyond the capacity of most mainstream systems, so why not eliminate the carry issue by limiting the swap to single digits? For example, how about picking the digits at positions 2, 3 and 6, then converting each into its difference from 10 (0 stays 0, 2 becomes 8, 4 becomes 6)?

This may save you a significant amount of computing time, depending on your data volume.
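
A minimal sketch of that idea, assuming the ID is left-aligned and contains only digits. Positions 2, 3 and 6 are just the example from the suggestion above, and the variable names are made up for illustration. Each chosen digit d is replaced by mod(10 - d, 10), so no carry is ever needed, and running the same step on the result restores the original ID.

```sas
data _null_;
  length id e_id $32;
  id   = "1037699710538224500199468064894";
  e_id = id;
  /* swap only the digits at the chosen positions (illustrative choice) */
  do pos = 2, 3, 6;
    d = input(substr(id, pos, 1), 1.);
    /* complement to 10; mod() keeps 0 mapped to 0 */
    substr(e_id, pos, 1) = put(mod(10 - d, 10), 1.);
  end;
  put id= / e_id=;
run;
```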

 

Just my 2 cents,

Haikuo

SAS Super FREQ
Posts: 683

Re: 32 numeric Strings manipulation

With SAS 9.4 we have the DS2 language and more data types to work with. Among them is the data type DECIMAL, which can be used for this kind of computation. Please note that DS2 cannot write the new data types to a SAS data set. Find below a code sample illustrating this. DS2 data types https://support.sas.com/documentation/cdl/en/ds2ref/68052/HTML/default/viewer.htm#n0v130wmh3hmuzn1t7...

 

data have;
  charID = "1037699710538224500199468064894";
run;

proc ds2;
  data want(overwrite=yes);
    dcl char(31) charID2;
    method run();
      dcl decimal(32) nDec;
      set have;
      nDec = charID;
      nDec = nDec + 263;
      charID2 = nDec;
    end;
  enddata;
run;
quit;

proc print data=want;
run;

Bruno

Super User
Posts: 5,082

Re: 32 numeric Strings manipulation

The DATA step has limits as to the largest integer it can store accurately (typically 15 significant digits).  Your part2 values are too long in some cases.  Given that the syntax for your solution is working (but not the results), you could break the string up into three parts instead of two.  But work from the right-hand side:  part3 = the last 14 digits, part2 = the 14 digits before part3, part1 = the 13 digits before part2.  That's assuming that you want to program this using your original approach, and not switch to the function method already suggested.
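
To see the limit on the sample ID, here is a small sketch (variable names are illustrative). part2 of the first observation, characters 14 to 31, has 18 digits, which is more than a SAS numeric can hold exactly. CONSTANT('EXACTINT') returns the largest integer up to which every integer is stored exactly on the current platform.

```sas
data _null_;
  /* part2 of the sample ID: characters 14-31, an 18-digit value */
  part2   = input("224500199468064894", 20.);
  shifted = part2 + 263;
  /* largest integer below which all integers are represented exactly */
  limit   = constant('exactint');
  put part2= 20. / shifted= 20. / limit= 20.;
run;
```

Since part2 is well above that limit, the sum gets rounded to a nearby representable value, which is consistent with the E_ID in the question ending in 152 rather than 157.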

Super User
Super User
Posts: 7,401

Re: 32 numeric Strings manipulation

That would be my answer as well (note, not the best coding, it is there to show the logic):

data have;
  myid="1037699710538224500199468064894";
  /* Split this 31-digit ID into blocks of 10, 10 and 11 digits (no overlap) */
  block1=input(substr(myid,1,10),best.);
  block2=input(substr(myid,11,10),best.);
  block3=input(substr(myid,21),best.);
  block3=block3 + 263;
  /* Carry into the next block when a block outgrows its width */
  if block3 >= 1e11 then do;
    block3=block3-1e11;
    block2=block2+1;
    if block2 >= 1e10 then do;
      block2=block2-1e10;
      block1=block1+1;
      if block1 >= 1e10 then do;
        block1=block1-1e10;
        block4=1;
      end;
    end;
  end;
  length new_id $40.;
  /* Zn. formats restore the leading zeros within each block */
  if block4=. then new_id=cats(put(block1,z10.),put(block2,z10.),put(block3,z11.));
  else put "Problem found in carry over limit";
run;
Respected Advisor
Posts: 3,124

Re: 32 numeric Strings manipulation

The following code is inspired by @RW9's block approach; the only difference is that it avoids the hard-coded block positions. The block size is 10 digits.

data have;
	charID = "1037699710538224500199468064894";
	output;
	charid = '1999999999999999999999999999999';
	output;
	charid = '9999999999999999999999999999997';
	output;
run;

data want;
	set have;
	length new_id $ 40;
	r_id=reverse(charid);

	do i=0 to int(length(r_id)/10);
		c=reverse(substr(r_id,i*10+1,10));

		if i=0 then
			n=c+263;
		else n=c+flag;
		flag=length(cats(n))>10;

		if i = int(length(r_id)/10) then
			new_id=cats(reverse(substr(reverse(n),1,10)), new_id);
		else new_id=cats(reverse(substr(reverse(put(n,z15.)),1,10)), new_id);
/*		output;*/
	end;
	keep charid new_id;
run;
Super User
Super User
Posts: 7,401

Re: 32 numeric Strings manipulation

Just to note, the digit-by-digit approach in the accepted solution is effectively what I am doing in my code, except that I do it in blocks of 10 characters and hence have a lot less code.
