Hi Guys,
I have a file with a long address string called "Address". I have to create a file with a max of 3 address columns, let's call them address 1-2-3. Each column has a max length of 36 characters and I don't want to break up part of a word to move over to the next line.
I need something that breaks the string into columns of the desired length while keeping words whole. In other words, a word should never be split in half to start a new column; if adding a word would push a column past 36 characters, that word should move to the next column, and so forth.
For example, if this were the address line:
This is an example of a really long address string which needs to be split into columns of 36 characters
Into something like:
This is an example of a really long | address string which needs to be | split into columns of 36 characters |
35 | 33 | 35 |
Any suggestions would be much appreciated. I looked at some of the samples posted but they all seem to revolve around delimited long strings.
Thanks!
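Note that this code assumes the string fits into 3 columns of 36 characters (so it is no longer than 3 * 36); a longer string would push c past the end of the array:
data want;
  s="This is an example of a really long address string which needs to be split into columns of 36 characters";
  array cols{3} $36;
  c=1;
  do i=1 to countw(s," ");
    /* append the next word if the column stays within 36 characters */
    if lengthn(catx(" ",cols{c},scan(s,i," "))) <= 36 then cols{c}=catx(" ",cols{c},scan(s,i," "));
    /* otherwise start the next column with that word */
    else do;
      c=c+1;
      cols{c}=scan(s,i," ");
    end;
  end;
run;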
A slightly different (and maybe less elegant) take from me:
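data want;
  set have;
  length address1-address3 $36 word $36;
  array adr {3} address1-address3;
  index = 1;
  /* split on blanks only, so punctuation such as "-" or "." stays inside the word */
  do count = 1 to countw(address,' ');
    word = scan(address,count,' ');
    /* lengthn() returns 0 for an empty column, length() would return 1 */
    if lengthn(adr{index}) > 0 and lengthn(adr{index}) + 1 + lengthn(word) > 36
    then index = index + 1;
    /* more text than 3 x 36 can hold: stop instead of running past the array */
    if index > dim(adr) then leave;
    adr{index} = catx(' ',adr{index},word);
  end;
  drop count index word;
run;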
Didn't want to discard it just because @RW9 beat me to it 😉
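One thing to keep in mind with COUNTW and SCAN on address data: when no delimiter argument is given, they split on a default list of delimiters that (on ASCII systems) includes characters such as - . , and /, not just blanks, so those characters are dropped from the words. A minimal sketch with a made-up address (the value and the variable names are only for illustration):

```sas
data _null_;
  address = "12-14 St. Mary's Court";     /* hypothetical test value */
  n_default = countw(address);            /* default delimiters: 5 words */
  n_blank   = countw(address, ' ');       /* blanks only: 4 words */
  w_default = scan(address, 1);           /* 12 */
  w_blank   = scan(address, 1, ' ');      /* 12-14 */
  put n_default= n_blank= w_default= w_blank=;
run;
```

Passing ' ' explicitly, as the first reply does with " ", keeps house numbers and abbreviations intact when the pieces are joined back together with CATX.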
If you use the 'old text editor', you can do this with a line command:
000001 This is a very long string more than 36 characters. Split to 36 strings. More useless text.
Just type TF36 in the prefix area
TF3601 This is a very long string more than 36 characters. Split to 36 strings. More useless text.
/* T0100520 Hits #24 Optimum splits for long text strings, datastep linear regression and file attributes
HAVE
All randomized subjects who receive at least one dose of study drug will be
considered evaluable for safety. All adverse events will be included in the analysis of safety after
randomization and prior to a subject entering the followup phase. The definition of the followup
phase (above) describes those adverse events that will be collected and analyzed during the followup phase.
The Full Analysis Set and the Safety Subset
will each include all randomised
subjects who receive at least one dose of study
drug.
All subjects who have
signed informed consent before invasive, protocol specified procedures
(including study specific blood draws for laboratory testing and study chemotherapy) and have
received at least one dose of study drug will be included in the safety evaluable set. These subjects
will be analyzed according to the treatment they actually received. Summaries of safety data for the
treatment period will be provided on this safety evaluable set.
WANT
Obs STR
1 All randomized subjects who receive at
2 least one dose of study drug will be
3 considered evaluable for safety. All
4 adverse events will be included in the
5 analysis of safety after randomization
6 and prior to a subject entering the
7 followup phase. The definition of the
8 followup phase (above) describes those
9 adverse events that will be collected
10 and analyzed during the followup phase.
11 The Full Analysis Set and the Safety
12 Subset will each include all randomised
13 subjects who receive at least one dose
14 of study drug. All subjects who have
15 signed informed consent before invasive,
16 protocol specified procedures (including
17 study specific blood draws for
18 laboratory testing and study
19 chemotherapy) and have received at least
20 one dose of study drug will be included
21 in the safety evaluable set. These
22 subjects will be analyzed according
23 to the treatment they actually received.
24 Summaries of safety data for the
25 treatment period will be provided
26 on this safety evaluable set.
WORKING CODE
proc template;
...
flow=on;
width=40; * this is where we set the length;
just=l;
FULL SOLUTION
/* T00388X TECHNIQUE FOR WRAPPING A LONG STRING, IE A 32,000 CHAR STRING, INTO MULTIPLE 40 BYTE STRINGS WITH NICE SPLITS */
options noquotelenmax;
data rpt;
length lyn $32000;
lyn=compbl('
All randomized subjects who receive at least one dose of study drug will be
considered evaluable for safety. All adverse events will be included in the analysis of safety after
randomization and prior to a subject entering the followup phase. The definition of the followup
phase (above) describes those adverse events that will be collected and analyzed during the followup phase.
The Full Analysis Set and the Safety Subset
will each include all randomised
subjects who receive at least one dose of study
drug.
All subjects who have
signed informed consent before invasive, protocol specified procedures
(including study specific blood draws for laboratory testing and study chemotherapy) and have
received at least one dose of study drug will be included in the safety evaluable set. These subjects
will be analyzed according to the treatment they actually received. Summaries of safety data for the
treatment period will be provided on this safety evaluable set.');
output;
run;
libname odslib v9 "%sysfunc(pathname(work))";
ods path odslib.templates sashelp.tmplmst work.templates(update);
/* put the template in work.templates */
proc template;
define table rolchr;
classlevels=on;
order_data=on;
col_space_max=1;
col_space_min=1;
define column rol;
generic=on;
blank_dups=on;
flow=on;
width=40; * this is where we set the length;
just=l;
header=' ';
end;
end;
run;
options nodate nonumber ps=5000 ls=140;
title;footnote;
ods listing file="d:/txt/rolchr.txt";
data _null_;
retain cnt -1;
set rpt end=dne;
file print ods=(template='rolchr' columns=(rol=lyn (generic=on)));
put _ods_;
run;
quit;
ods path close;
ods listing close;
ods listing;
data str;
infile "d:/txt/rolchr.txt" truncover; /* truncover: lines shorter than 40 bytes are read as-is */
input str $40.;
run;
proc print data=str;
run;
ods path reset;
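For comparison, the same kind of wrap can be sketched in a plain data step, without the round trip through a listing file. This is not the template technique above but the greedy word-by-word approach from the earlier replies, writing one row per line instead of one column per line. The macro variable WIDTH, the data set name WRAPPED and the shortened sample text are assumptions for illustration:

```sas
%let width = 40;   /* hypothetical macro variable: maximum line length */

data wrapped;
  length lyn $32000 str word $&width;
  /* shortened sample text; in practice LYN would come from a SET statement */
  lyn = compbl("All randomized subjects who receive at least one dose of
                study drug will be considered evaluable for safety.");
  do i = 1 to countw(lyn, ' ');
    word = scan(lyn, i, ' ');
    /* the current line is full: write it out and start a new one */
    if lengthn(str) > 0 and lengthn(str) + 1 + lengthn(word) > &width then do;
      output;
      str = word;
    end;
    else str = catx(' ', str, word);
  end;
  /* write the last, partially filled line */
  if lengthn(str) > 0 then output;
  keep str;
run;

proc print data=wrapped;
run;
```

Because this fills each line greedily, the breaks will not always match what FLOW=ON produces, and a single word longer than WIDTH is truncated rather than split.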
After the TF36 line command, the old text editor reflows the line to:
000001 This is a very long string more than
000002 36 characters. Split to 36 strings.
000003 More useless text.