I'm sorry, but I'm still not sure I understand what you mean. Your algorithm would work like the first suggestion, i.e., it would process the strings part by part, wouldn't it? I can't do that because, as I wrote, I need to handle the whole string at once. (And I'm afraid that the scan and tranwrd functions can't handle input or output longer than 32767 characters.)
To make sure we are on the same page: based on your example, I'd need something like the following code:
data test;
  length A B $32000;
  A = repeat('A', 31000);
  A = catt(A, " aa");
  B = repeat('B', 31000);
  B = catt("bbb", B);
  output;
run;

data test1;
  set test;
  array strings A B;
  length=10;
  left = substrn(strings[1], 1, 31002);
  right = substrn(strings[1], 31003, 62005);
  both = substr(tranwrd(catt(left, right), "aabb", "xxyy"), 30000, 5000);
  l = length(both);
run;
If you suggest using arrays, I've already tried a similar modification:
data test;
  length A B $32000;
  A = repeat('A', 31000);
  A = catt(A, " aa");
  B = repeat('B', 31000);
  B = catt("bbb", B);
  output;
run;

data test2;
  set test;
  array strings A B;
  length=10;
  both = substr(tranwrd(strings, "aabb", "xxyy"), 30000, 5000);
  l = length(both);
run;
You mentioned that you are not sure how the input or output gets longer and shorter. The two input strings are below the limit, the intermediate string during processing is above it, and the result is below the limit again. It's similar to:
tmp1 = catt(A, B); /* Very long string */
tmp2 = tranwrd(tmp1,"aabb","xxyy"); /* Processing */
out = substr(tmp2, 30000, 5000); /* The output has 5000 characters */
Edit: I need to work with the whole string because, for example, I have to find a substring with a call such as
scan(tmp, index, '()');
Here, index may vary dynamically, as may the lengths of both tmp and the output. The same holds for the tranwrd arguments.