Hi,
Dear member
Can you please let me know How to split a content more than 200 chr. using no macros, i mean with a datasetp.
For eg. if a content given is more than 500chars. in a datastep should split according to it's size(200chars.) and without truncating words.
eg. var = '
But multiple notes over the past three weeks have undermined the shock last November, wiping out 86% of the money in circulation in a cash-driven economy.
The demonetisation exercise has been called a watershed for a country saddled with counterfeiters pushing millions of fake notes into the Indian economy from neighbouring Pakistan, Bangladesh and Nepal. Terrorists and governments hostile to India use the bogus cash to weaken the economy.
The latest attempts to restart the vicious cycle after the notes ban have set off alarm bells in the security establishment.
'
to var, var1 and var2, excess content should split from var to var1 and var2.
Thanks
-santosh
Here are some DATA step statements to accomplish this:
length var1 $ 500 var2 $ 200;
if length(var) >= 200 then do;
do i=201 to 1 by -1 until (var1 > ' ');
if substr(var, i, 1)=' ' then do;
var1 = left(substr(var, i+1));
var = substr(var, 1, i);
end;
end;
end;
if length(var1) >= 200 then do;
do i=201 to 1 by -1 until (var2 > ' ');
if substr(var1, i, 1)=' ' then do;
var2 = left(substr(var1, i+1));
var1 = substr(var1, 1, i);
end;
end;
end;
It's untested, but should be fine as is.
Here are some DATA step statements to accomplish this:
length var1 $ 500 var2 $ 200;
if length(var) >= 200 then do;
do i=201 to 1 by -1 until (var1 > ' ');
if substr(var, i, 1)=' ' then do;
var1 = left(substr(var, i+1));
var = substr(var, 1, i);
end;
end;
end;
if length(var1) >= 200 then do;
do i=201 to 1 by -1 until (var2 > ' ');
if substr(var1, i, 1)=' ' then do;
var2 = left(substr(var1, i+1));
var1 = substr(var1, 1, i);
end;
end;
end;
It's untested, but should be fine as is.
Hi.
The following may work for you.
%let NUM_OF_STR=5;
%let CHR_PER_STR=40;
data _null_;
array V {*} $ &CHR_PER_STR V1-V&NUM_OF_STR;
S='the demonetisation exercise has been called a watershed for a country saddled with counterfeiters pushing';
_I=1; _J=1;
do until (scan(S,_I,' ') eq '');
_W=scan(S,_I,' ');
if lengthn(V[_J])+lengthn(_W) > lengthc(V[_J]) then _J+1;
V[_J]=strip(V[_J])!!' '!!_W;
_I+1;
end;
run;
Nothing fancy, it will split the text word by word and ditribute it by the number of desired variables.
This will also remove consecutive blanks between words if any.
Daniel Santos @ www.cgd.pt
Oh, I'm surely not a genius, but thanks for the compliment.
And no problem with that. If you want to set only CHR_PER_STR and have NUM_OF_STR calculated dynamically, instead of:
%let NUM_OF_STR=5;
%let CHR_PER_STR=40;
use:
%let CHR_PER_STR=200;
data _null_;
set HAVE;
call symputn('NUM_OF_STR',(lengthc(S)+&CHR_PER_STR-1)/&CHR_PER_STR);
run;
This will set NUM_OF_STR according to S's size and the specified CHR_PER_STR.
Daniel Santos @ www.cgd.pt
The two suggestions offered may work (I don't have time to test them) and, if so, you can disregard this one. SAS notes has published a macro that splits lines of text (see: http://support.sas.com/kb/30/649.html ). The following is a slight bastardization of that macro that meets your current need. You should note that your example text line is longer than 500 characters.
data have; informat var $char600.; infile 'c:\art\testdata.txt' lrecl=600; input var &; ; data want (keep=var:); array var(3) $200.; length textin $600; set have (rename=var=_var); textin=_var; i=1; if lengthn( textin ) <= 200 then var(i)=textin; else do while( lengthn( textin ) > 200 ) ; var(i) = reverse( substr( textin, 1, 200 )) ; ndx = index( var(i), ' ' ) ; if ndx then do ; var(i) = reverse( substr( var(i), ndx + 1 )) ; i+1; textin = substr( textin, 200 - ndx + 1 ) ; end ; else do; var(i) = substr(textin,1,200); i+1; textin = substr(textin,200+1); end; if lengthn( textin ) le 200 then var(i)=textin; end ; run;
HTH,
Art, CEO, AnalystFinder.com
Search it firstly at this community. There are already a couple of questions have been answered.
options noquotelenmax;
data have;
id=1;
var = '
But multiple notes over the past three weeks have undermined the shock last November, wiping out 86% of the money in circulation in a cash-driven economy.
The demonetisation exercise has been called a watershed for a country saddled with counterfeiters pushing millions of fake notes into the Indian economy from neighbouring Pakistan, Bangladesh and Nepal. Terrorists and governments hostile to India use the bogus cash to weaken the economy.
The latest attempts to restart the vicious cycle after the notes ban have set off alarm bells in the security establishment.
';
run;
data temp;
set have;
do i=1 to countw(var,' ');
word=scan(var,i,' ');output;
end;
drop i var;
run;
data temp;
set temp;
by id;
if first.id then sum=0;
sum+(length(word)+1);
if sum gt 200 then do;group+1;sum=length(word);end;
drop sum ;
run;
data x;
length x $ 200;
do until(last.group);
set temp;
by id group;
x=catx(' ',x,word);
end;
run;
proc transpose data=x out=want;
by id ;
var x;
run;
proc print;run;
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.