<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Split a variable to multiple variables in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/Split-a-variable-to-multiple-variables/m-p/897292#M354560</link>
    <description>&lt;P&gt;Two of the responses you received build each split word-by-word, from left to right.&amp;nbsp; The other, using regular expressions, I suspect, counts columns under the hood from the left up to the split length, making sure to not truncate the rightmost word.&amp;nbsp;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;The code below takes a different tack.&amp;nbsp; It starts each split by looking at the &lt;EM&gt;&lt;STRONG&gt;rightmost&lt;/STRONG&gt;&lt;/EM&gt; blank candidate column and stepping&amp;nbsp;&lt;EM&gt;&lt;STRONG&gt;backward&lt;/STRONG&gt;&lt;/EM&gt; until a blank is located.&amp;nbsp; Then the entire resulting split is copied, and the next leftmost column and rightmost blank candidate are established, and the same approach is used for the next split.&amp;nbsp; I use the term "&lt;EM&gt;&lt;STRONG&gt;rightmost blank candidate&lt;/STRONG&gt;&lt;/EM&gt;" because for a split of, say 100 bytes length, the rightmost valid blank candidate in the source would be byte number 101, just after the maximum split length.&amp;nbsp; It's only a candidate, however, until a word separator is encountered.&amp;nbsp;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;The thought here is that searching backward for a blank this way likely takes far fewer comparisons than assembling each split from the left, especially for wide splits.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data have;
comment='Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.';
run;


%let L_source=1000;
%let L_split=100;
%let nsplits=%eval( (&amp;amp;L_source + &amp;amp;L_split-1)/&amp;amp;L_split );

data want (drop=_: i);
  set have ;

  array split {&amp;amp;nsplits} $&amp;amp;L_split;

  _left_col=1;
  do i=1 to &amp;amp;nsplits until (substr(comment,_right_col)=' ' or _left_col&amp;gt;length(comment));
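    /* Start at the column just past the split width (the rightmost blank candidate) and step back to a blank; */
    /* stop at the last character only when the remainder fits in one split.                                   */
    do _right_col=min(length(comment),_left_col+&amp;amp;L_split) by -1 until (char(comment,_right_col)=' ' or (_right_col=length(comment) and _right_col-_left_col&amp;lt;&amp;amp;L_split));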
    end;
    split{i}=substr(comment,_left_col,_right_col+1-_left_col);
    _left_col=min(_right_col+1,length(comment)+1);
  end;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Notes on the code:&lt;/P&gt;
&lt;OL&gt;
&lt;LI&gt;The "do i=1 to &amp;amp;nsplits ..." keeps going until either (a) only trailing blanks remain, or (b) the left column for a new split is beyond the length of the source variable.&amp;nbsp; This can happen if you have a source variable storage length that is not an exact multiple of the split length.&lt;BR /&gt;&lt;BR /&gt;&lt;/LI&gt;
&lt;LI&gt;The second DO loop finds the appropriate value for the rightmost candidate for a blank column (or in the case of the last split, it will be the last character of the source variable, which might not be blank).&lt;BR /&gt;&lt;BR /&gt;&lt;/LI&gt;
&lt;LI&gt;The next two lines simply copy the current split and update the left column for use in determining the next split.&lt;/LI&gt;
&lt;/OL&gt;</description>
    <pubDate>Thu, 05 Oct 2023 03:33:52 GMT</pubDate>
    <dc:creator>mkeintz</dc:creator>
    <dc:date>2023-10-05T03:33:52Z</dc:date>
    <item>
      <title>Split a variable to multiple variables</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Split-a-variable-to-multiple-variables/m-p/896925#M354430</link>
      <description>&lt;P&gt;Hi&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I am trying to split a string,&lt;/P&gt;
&lt;P&gt;comment='&lt;SPAN&gt;Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum' .&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;using a simple code line,&amp;nbsp;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;if length(comment) &amp;gt;0 then a=strip(substr(comment,1,200));

if length(comment) &amp;gt;200 then b=strip(substr(comment,201,200));&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;&lt;SPAN&gt;and so on. Can this basic code be improved so that it does not chop words? For example, the word 'tempor' should stay in the current variable if the 200-character limit permits; otherwise the entire word should move to the next variable, starting from its first letter, rather than being chopped into 'tem' and 'por' across two different variables.&amp;nbsp;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;Please advise.&amp;nbsp;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Thanks&lt;/P&gt;</description>
      <pubDate>Tue, 03 Oct 2023 12:30:13 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Split-a-variable-to-multiple-variables/m-p/896925#M354430</guid>
      <dc:creator>bharath86</dc:creator>
      <dc:date>2023-10-03T12:30:13Z</dc:date>
    </item>
    <item>
      <title>Re: Split a variable to multiple variables</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Split-a-variable-to-multiple-variables/m-p/896932#M354436</link>
      <description>Please look at the thread: &lt;A href="https://communities.sas.com/t5/SAS-Programming/splitting-text-which-is-more-than-200char-to-multiple-variables/td-p/333393" target="_blank"&gt;https://communities.sas.com/t5/SAS-Programming/splitting-text-which-is-more-than-200char-to-multiple-variables/td-p/333393&lt;/A&gt;</description>
      <pubDate>Tue, 03 Oct 2023 13:04:28 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Split-a-variable-to-multiple-variables/m-p/896932#M354436</guid>
      <dc:creator>JosvanderVelden</dc:creator>
      <dc:date>2023-10-03T13:04:28Z</dc:date>
    </item>
    <item>
      <title>Re: Split a variable to multiple variables</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Split-a-variable-to-multiple-variables/m-p/896940#M354443</link>
      <description>&lt;P&gt;Try it like this:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data have;
comment='Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.';
run;
proc print;
run;


%macro cutMeIntoParts(
 have
,variable
,want=want
,L=100
,keep=part
);

data &amp;amp;want.;
  set &amp;amp;have.;

  length part $ &amp;amp;L.;

  do i=1 by 1;
    word = scan(&amp;amp;variable., i, " ");

    if word = " " then 
      do;
        output;
        leave;
      end;
    /*
    put word= / part =;
    */

    if length(catx(" ",part, word))&amp;gt;&amp;amp;L. then
      do;
        output;
        part = word;
      end;
    else 
      part = catx(" ",part, word);
   
  end;

  keep &amp;amp;keep.;
run;

%mend cutMeIntoParts;

%cutMeIntoParts(have,comment,L=50)


proc print data=want(keep=part);
run;
&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;Bart&lt;/P&gt;</description>
      <pubDate>Tue, 03 Oct 2023 13:28:35 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Split-a-variable-to-multiple-variables/m-p/896940#M354443</guid>
      <dc:creator>yabwon</dc:creator>
      <dc:date>2023-10-03T13:28:35Z</dc:date>
    </item>
    <item>
      <title>Re: Split a variable to multiple variables</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Split-a-variable-to-multiple-variables/m-p/897090#M354501</link>
      <description>&lt;P&gt;Based on the sample code in the documentation for the &lt;A href="https://go.documentation.sas.com/doc/en/pgmsascdc/9.4_3.5/lefunctionsref/n1obc9u7z3225mn1npwnassehff0.htm" target="_self"&gt;CALL PRXNEXT Routine&lt;/A&gt;, the code below should work.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;%let l_source=1000;
%let l_part  =200;
%let n_parts =%eval(&amp;amp;l_source/&amp;amp;l_part +1);

data demo(drop=_:);
  length text $&amp;amp;l_source;
  array text_parts_ {&amp;amp;n_parts} $&amp;amp;l_part;
  _prxid = prxparse("/\b(.{1,%eval(&amp;amp;l_part-1)}\b)/");
  text='Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum';
  _start = 1;
  _stop = length(text);

  /* Use PRXNEXT to find the first instance of the pattern, */
  /* then use DO WHILE to find all further instances.       */
  /* PRXNEXT changes the _start parameter so that searching  */
  /* begins again after the last match.                     */
  call prxnext(_prxid, _start, _stop, text, _pos, _len);

  do while (_pos &amp;gt; 0);
    _i=sum(_i,1);
    text_parts_[_i] = substr(text, _pos, _len);
    call prxnext(_prxid, _start, _stop, text, _pos, _len);
  end;
run;

proc print data=demo;
  var text_parts_:;
run;
&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="Patrick_0-1696386886582.png" style="width: 708px;"&gt;&lt;img src="https://communities.sas.com/t5/image/serverpage/image-id/88569i70DACFE7F431D519/image-dimensions/708x80?v=v2" width="708" height="80" role="button" title="Patrick_0-1696386886582.png" alt="Patrick_0-1696386886582.png" /&gt;&lt;/span&gt;&lt;/P&gt;
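&lt;P&gt;For anyone unfamiliar with the pattern: the leading \b anchors each match at a word boundary, .{1,N} greedily takes up to N characters, and the trailing \b makes the regex engine backtrack until the match ends on a word boundary, so a word is not cut in the middle. Below is a minimal sketch of the same idea on a short string; the 19-character limit, the sample text and the variable names are illustrative assumptions, not part of the code above.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data _null_;
  text  = 'Lorem ipsum dolor sit amet, consectetur';
  /* up to 19 characters, starting and ending on a word boundary */
  prxid = prxparse('/\b(.{1,19}\b)/');
  start = 1;
  stop  = length(text);

  /* PRXNEXT advances start past each match it returns */
  call prxnext(prxid, start, stop, text, pos, len);
  do while (pos &amp;gt; 0);
    piece = substr(text, pos, len);
    put pos= len= piece=;   /* one log line per piece */
    call prxnext(prxid, start, stop, text, pos, len);
  end;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;Each piece is written to the log with its start position and length, which makes it easy to confirm that no piece is longer than the limit.&lt;/P&gt;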
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Wed, 04 Oct 2023 02:35:16 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Split-a-variable-to-multiple-variables/m-p/897090#M354501</guid>
      <dc:creator>Patrick</dc:creator>
      <dc:date>2023-10-04T02:35:16Z</dc:date>
    </item>
    <item>
      <title>Re: Split a variable to multiple variables</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Split-a-variable-to-multiple-variables/m-p/897096#M354502</link>
      <description>&lt;P&gt;Just for fun.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data have;
id+1;
length comment $ 20000;
comment='Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum' ;
run;
data temp;
 set have;
 length token temp want $ 400 want2 $ 200;
 if length(comment)&amp;gt;200 then do;
 do i=1 to countw(comment,' ');
  token=scan(comment,i,' ');
  temp=want;
  want=catx(' ',want,token);
    if length(want)&amp;gt;200 then do;
      want=temp;want2=want;output;want=token;
    end;
  end;
  if not missing(want) then do;want2=want; output;end;
  end;
  else do;want2=comment; output;end;
  keep id want2;
run;
proc transpose data=temp out=want;
by id;
var want2;
run;&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Wed, 04 Oct 2023 03:20:22 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Split-a-variable-to-multiple-variables/m-p/897096#M354502</guid>
      <dc:creator>Ksharp</dc:creator>
      <dc:date>2023-10-04T03:20:22Z</dc:date>
    </item>
    <item>
      <title>Re: Split a variable to multiple variables</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Split-a-variable-to-multiple-variables/m-p/897292#M354560</link>
      <description>&lt;P&gt;Two of the responses you received build each split word-by-word, from left to right.&amp;nbsp; The other, using regular expressions, I suspect, counts columns under the hood from the left up to the split length, making sure to not truncate the rightmost word.&amp;nbsp;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;The code below takes a different tack.&amp;nbsp; It starts each split by looking at the &lt;EM&gt;&lt;STRONG&gt;rightmost&lt;/STRONG&gt;&lt;/EM&gt; blank candidate column and stepping&amp;nbsp;&lt;EM&gt;&lt;STRONG&gt;backward&lt;/STRONG&gt;&lt;/EM&gt; until a blank is located.&amp;nbsp; Then the entire resulting split is copied, and the next leftmost column and rightmost blank candidate are established, and the same approach is used for the next split.&amp;nbsp; I use the term "&lt;EM&gt;&lt;STRONG&gt;rightmost blank candidate&lt;/STRONG&gt;&lt;/EM&gt;" because for a split of, say 100 bytes length, the rightmost valid blank candidate in the source would be byte number 101, just after the maximum split length.&amp;nbsp; It's only a candidate, however, until a word separator is encountered.&amp;nbsp;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;The thought here is that searching backward for a blank this way likely takes far fewer comparisons than assembling each split from the left, especially for wide splits.&lt;/P&gt;
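&lt;P&gt;To make the "rightmost blank candidate" idea concrete, here is a minimal sketch of a single backward search. The short string, the width of 10 and the variable names (text, width, left, right) are illustrative assumptions and are not used in the full program that follows.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data _null_;
  text  = 'Lorem ipsum dolor sit amet';
  width = 10;                        /* maximum split length (illustrative) */
  left  = 1;                         /* first column of the current split   */

  /* rightmost blank candidate: the column just past the split width */
  right = min(length(text), left + width);

  /* step backward until a blank is found */
  do while (right &amp;gt; left and char(text, right) ne ' ');
    right = right - 1;
  end;

  first_split = substr(text, left, right - left + 1);
  put right= first_split=;           /* log: right=6 first_split=Lorem */
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;Column 11 is the candidate here, and the search needs only five steps back to reach the blank in column 6, regardless of how many words precede it.&lt;/P&gt;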
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data have;
comment='Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.';
run;


%let L_source=1000;
%let L_split=100;
%let nsplits=%eval( (&amp;amp;L_source + &amp;amp;L_split-1)/&amp;amp;L_split );

data want (drop=_: i);
  set have ;

  array split {&amp;amp;nsplits} $&amp;amp;L_split;

  _left_col=1;
  do i=1 to &amp;amp;nsplits until (substr(comment,_right_col)=' ' or _left_col&amp;gt;length(comment));
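    /* Start at the column just past the split width (the rightmost blank candidate) and step back to a blank; */
    /* stop at the last character only when the remainder fits in one split.                                   */
    do _right_col=min(length(comment),_left_col+&amp;amp;L_split) by -1 until (char(comment,_right_col)=' ' or (_right_col=length(comment) and _right_col-_left_col&amp;lt;&amp;amp;L_split));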
    end;
    split{i}=substr(comment,_left_col,_right_col+1-_left_col);
    _left_col=min(_right_col+1,length(comment)+1);
  end;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Notes on the code:&lt;/P&gt;
&lt;OL&gt;
&lt;LI&gt;The "do i=1 to &amp;amp;nsplits ..." keeps going until either (a) only trailing blanks remain, or (b) the left column for a new split is beyond the length of the source variable.&amp;nbsp; This can happen if you have a source variable storage length that is not an exact multiple of the split length.&lt;BR /&gt;&lt;BR /&gt;&lt;/LI&gt;
&lt;LI&gt;The second DO loop finds the appropriate value for the rightmost candidate for a blank column (or in the case of the last split, it will be the last character of the source variable, which might not be blank).&lt;BR /&gt;&lt;BR /&gt;&lt;/LI&gt;
&lt;LI&gt;The next two lines simply copy the current split and update the left column for use in determining the next split.&lt;/LI&gt;
&lt;/OL&gt;</description>
      <pubDate>Thu, 05 Oct 2023 03:33:52 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Split-a-variable-to-multiple-variables/m-p/897292#M354560</guid>
      <dc:creator>mkeintz</dc:creator>
      <dc:date>2023-10-05T03:33:52Z</dc:date>
    </item>
    <item>
      <title>Re: Split a variable to multiple variables</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Split-a-variable-to-multiple-variables/m-p/897313#M354570</link>
      <description>&lt;P&gt;One more (just for fun) approach with "funcy filename" and "fast _infile_ read" inspired by:&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;A href="https://communities.sas.com/t5/SAS-Programming/How-to-delimit-large-dataset-28-Million-rows-into-700-variables/m-p/486946#M287206" target="_blank"&gt;https://communities.sas.com/t5/SAS-Programming/How-to-delimit-large-dataset-28-Million-rows-into-700-variables/m-p/486946#M287206&lt;/A&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Data:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data have;
length comment $ 32767;
comment='Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.';
output;
comment="Split a variable to multiple variables. 
Hi, I am trying to split a string, comment='....'. using a simple code line.
So on, on a basic code, can we upgrade this so it doesn't allow word chopping, for example, the word 'tempor' gets pushed to a new var or be kept in the same var if the length of 200 permits otherwise the entire word gets pushed to new var from first letter without chopping the word as 'tem' 'por' to two different variables. Please advise. Thanks";
output;
run;
proc print;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;Code:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;%macro cutMeIntoParts2(
 have
,variable
,want=want
,L = 100
,WN = 200  /* max. expected number of words */
,keep=part
);
filename f "!SASROOT/*.cfg";
data &amp;amp;want.;
  set &amp;amp;have.; 
  infile f truncover dlm = " ";
  input @1 @;
  _infile_ = &amp;amp;variable.;
  input @1 (m1-m&amp;amp;WN.) (: $128.) @@;

  mmX=" ";
  length part $ &amp;amp;L.;

  array mm $ m1--m&amp;amp;WN. mmX;
  do over mm;
    L = lengthn(mm);
    
    if P + L &amp;gt; &amp;amp;L.-1 then 
      do; 
        output; 
        part = " "; 
      end;
    if L = 0 then 
      do; 
        output; 
        leave; 
      end;

    part = catx(" ", part, mm);
    P = lengthn(part);
  end;
  keep &amp;amp;keep.; 
run;
filename f clear;


%mend cutMeIntoParts2;

%cutMeIntoParts2(have,comment,L=50)


proc print data=want(keep=part);
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Bart&lt;/P&gt;</description>
      <pubDate>Thu, 05 Oct 2023 06:58:59 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Split-a-variable-to-multiple-variables/m-p/897313#M354570</guid>
      <dc:creator>yabwon</dc:creator>
      <dc:date>2023-10-05T06:58:59Z</dc:date>
    </item>
  </channel>
</rss>

