Hi,
I am trying to split a character variable (DVTERM) dynamically, so that the first term holds up to 200 characters and each succeeding term holds up to 199. For example, a value that is 1000 characters long would be split into 5 or 6 terms, depending on where the word boundaries fall.
Note: I cannot split words. Each new term has to start with a whole word, not a truncated one.
So basically, the character limits are:
DVT1=200
DVT2=199
DVT3=199
DVT4=199
DVT5=199
...
I have found some sources online that help with this, but because I'm not a savvy programmer, I'm having difficulty adapting the code so that it produces more terms than the 2 in the example.
data have;
  set source.dv;
  describe_deviation=strip(compbl(dvterm));
  length=length(dvterm);
  c=substr(dvterm,201,1);  /* character just past the 200-character limit */
  /* short text: everything fits in the first term */
  if .<length<=200 and missing(c) then dvt1=dvterm;
  else if 200<length<=400 or not missing(c) then do;
    /* the limit falls on a blank, so no word is cut */
    if substr(dvterm,200,1)="" or substr(dvterm,201,1)="" then do;
      dvt1=substr(dvterm,1,200);
      dvt2=strip(substr(dvterm,201));
    end;
    else do;
      /* back up to the end of the last complete word */
      length1=200-length(scan(substr(dvterm,1,200),-1,""));
      dvt1=substr(dvterm,1,length1);
      dvt2=strip(substr(dvterm,length1+1,200));
    end;
  end;
run;
Source: https://www.lexjansen.com/phuse-us/2021/ct/PAP_CT07.pdf
Here is an example data set about random rainbow facts. For the sake of simplicity, please adapt the code so that instead of 200, it's 20. And instead of 199 for the succeeding split off terms, please make it 19.
DATA rainbow_facts;
infile datalines dsd dlm=",";
LENGTH Fact $80.;
INPUT Fact $;
length=length(fact);
DATALINES;
Rainbows,
Rainbows are a mix of light refraction and dispersion.,
Rainbows have inspired countless myths and legends.,
The longest observed rainbow lasted for nine hours in Taiwan.,
Each rainbow color emerges due to different wavelengths of light.,
Rainbows are not physical objects- they cannot be approached or touched.
;
RUN;
Thank you
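For reference, one way to get an arbitrary number of terms is to build each term word by word instead of cutting at fixed positions. The following is only a minimal sketch against the rainbow_facts sample, not the method from the paper linked above. It assumes that at most 10 terms are needed, that terms are stored without leading or trailing blanks, and that no single word is longer than the limit (a longer word would be truncated silently). The names want, word, piece, limit, first_limit and next_limit are made up for illustration.

```sas
data want;
  set rainbow_facts;
  /* Assumption: 10 terms are enough; raise the bound for longer text. */
  /* Creates dvt1-dvt10, each $20 (use $200 for the real data).        */
  array dvt[10] $20;
  length word $80;
  first_limit = 20;   /* 200 in the real data */
  next_limit  = 19;   /* 199 in the real data */
  piece = 1;
  limit = first_limit;
  /* Walk through the blank-delimited words, stopping if the array is full. */
  do i = 1 to countw(fact, ' ') while (piece <= dim(dvt));
    word = scan(fact, i, ' ');
    /* An empty term always takes the next word. */
    if missing(dvt[piece]) then dvt[piece] = word;
    /* Append the word if term + blank + word still fits within the limit. */
    else if length(dvt[piece]) + 1 + length(word) <= limit then
      dvt[piece] = catx(' ', dvt[piece], word);
    /* Otherwise start the next term with this word. */
    else do;
      piece = piece + 1;
      limit = next_limit;
      if piece <= dim(dvt) then dvt[piece] = word;
    end;
  end;
  if piece > dim(dvt) then put 'WARNING: text did not fit in the available terms. ' fact=;
  drop i word piece limit first_limit next_limit;
run;
```

With these limits, the second fact should come out as dvt1="Rainbows are a mix", dvt2="of light refraction" and dvt3="and dispersion.". For the real data, the same logic would read from DVTERM with the limits set to 200 and 199.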