<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Collapsing string values within by-groups in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/Collapsing-string-values-within-by-groups/m-p/890932#M352024</link>
    <description>&lt;P&gt;I think I am correctly interpreting your rule by saying:&lt;/P&gt;
&lt;OL&gt;
&lt;LI&gt;If a character in a string ever contains a 1, the final character should be a 1&lt;/LI&gt;
&lt;LI&gt;Else if that character ever contains a 2, the final character should be a 2&lt;/LI&gt;
&lt;LI&gt;Otherwise it is a zero.&lt;/LI&gt;
&lt;/OL&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;If that is correct, and if your data are already grouped by ID/CLASS, then:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data have;
input ID  CLASS STRING :$36.;
datalines;
10002 973 000000000000011111000000000000111110
10002 934 010122121110000000000000000000000000
10002 934 020222111100000000000000000000000000
;

data want (drop=i);
  set have;
  by id class notsorted;
  array matrix{0:2,36} _temporary_;
  if first.class=1 then call missing(of matrix{*});

  /* If this is not a single obs, then update the 3-row matrix */
  if not (first.class=1 and last.class=1) then do i=1 to 36 ;
    matrix(input(char(string,i),1.),i)+1;
  end;

  if last.class;
  /* If this is not a single obs, then interpret the 3-row matrix*/
  if first.id=0 then do i=1 to length(string);
    if      matrix(1,i)&amp;gt;0 then substr(string,i,1)='1'; 
    else if matrix(2,i)&amp;gt;0 then substr(string,i,1)='2'; 
    else                       substr(string,i,1)='0';
  end;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
    <pubDate>Fri, 25 Aug 2023 01:57:48 GMT</pubDate>
    <dc:creator>mkeintz</dc:creator>
    <dc:date>2023-08-25T01:57:48Z</dc:date>
    <item>
      <title>Collapsing string values within by-groups</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Collapsing-string-values-within-by-groups/m-p/890922#M352020</link>
      <description>&lt;P&gt;I have a long dataset with a 36-char STRING variable by ID and CLASS level. Some ID's CLASS values have multiple STRING values and I need to collapse the STRING by unique ID-CLASS group. Each character in the string represents a month and values should be collapsed as follows: if month X is 0 across all STRING within an ID-CLASS group, then month X should be 0 in final string; if month X is a 1 in &lt;EM&gt;any&lt;/EM&gt; STRING&amp;nbsp;within an ID-CLASS group, then month X should be 1; and if there are only 2s or 2s and 0s in month X in all STRING within an ID-CLASS group, then month X should be 2.&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;So for ID #1002, the final data should contain one row for CLASS #973, with the same info as before, and only one row for CLASS #934 where the STRING would be 010122111110000000000000000000000000.&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;TABLE&gt;
&lt;THEAD&gt;
&lt;TR&gt;
&lt;TD&gt;
&lt;P&gt;&lt;STRONG&gt;ID&lt;/STRONG&gt;&lt;/P&gt;
&lt;/TD&gt;
&lt;TD&gt;
&lt;P&gt;&lt;STRONG&gt;CLASS&lt;/STRONG&gt;&lt;/P&gt;
&lt;/TD&gt;
&lt;TD&gt;
&lt;P&gt;&lt;STRONG&gt;STRING&lt;/STRONG&gt;&lt;/P&gt;
&lt;/TD&gt;
&lt;/TR&gt;
&lt;/THEAD&gt;
&lt;TBODY&gt;
&lt;TR&gt;
&lt;TD&gt;
&lt;P&gt;10002&lt;/P&gt;
&lt;/TD&gt;
&lt;TD&gt;
&lt;P&gt;973&lt;/P&gt;
&lt;/TD&gt;
&lt;TD&gt;
&lt;P&gt;000000000000011111000000000000111110&lt;/P&gt;
&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TD&gt;
&lt;P&gt;10002&lt;/P&gt;
&lt;/TD&gt;
&lt;TD&gt;
&lt;P&gt;934&lt;/P&gt;
&lt;/TD&gt;
&lt;TD&gt;
&lt;P&gt;010122121110000000000000000000000000&lt;/P&gt;
&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TD&gt;
&lt;P&gt;10002&lt;/P&gt;
&lt;/TD&gt;
&lt;TD&gt;
&lt;P&gt;934&lt;/P&gt;
&lt;/TD&gt;
&lt;TD&gt;
&lt;P&gt;020222111100000000000000000000000000&lt;/P&gt;
&lt;/TD&gt;
&lt;/TR&gt;
&lt;/TBODY&gt;
&lt;/TABLE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I tried creating 36 month-level indicators based on the string and using them below. It's working in some cases, but not in others. I think when an earlier string contains a 2 and a later one has a 1, it updates to 1 correctly. But when a 1 appears in an earlier string and a 2 in a later string, it changes the 1 to 2.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;data dsn;&lt;/P&gt;
&lt;P&gt;length newstring $36;&lt;/P&gt;
&lt;P&gt;set dsn;&lt;/P&gt;
&lt;P&gt;by id class;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;array mstr (36) STR01-STR36;&lt;/P&gt;
&lt;P&gt;array nstr (36) NSTR01-NSTR36;&lt;/P&gt;
&lt;P&gt;do i=1 to 36;&lt;/P&gt;
&lt;P&gt;if first.unitid then nstr[i] = 0;&lt;/P&gt;
&lt;P&gt;end;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;retain nstr01--nstr36;&lt;/P&gt;
&lt;P&gt;do i=1 to 36 ;&lt;/P&gt;
&lt;P&gt;if mstr[i] &amp;gt; 0 then nstr[i] = min(of mstr[i]);&lt;/P&gt;
&lt;P&gt;end;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;newstring = cats(of nstr:);&lt;/P&gt;
&lt;P&gt;run;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I'd appreciate any tips or suggestions. Thanks.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Thu, 24 Aug 2023 21:22:56 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Collapsing-string-values-within-by-groups/m-p/890922#M352020</guid>
      <dc:creator>mh04</dc:creator>
      <dc:date>2023-08-24T21:22:56Z</dc:date>
    </item>
    <item>
      <title>Re: Collapsing string values within by-groups</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Collapsing-string-values-within-by-groups/m-p/890928#M352021</link>
      <description>&lt;P&gt;In one respect I would say that your rules mean that the values picked for the 0,1, and 2 were chosen poorly as there is no natural order of the shown value compared to your apparent concern for the value.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;This approach works for the shown values by 1) treating 1 as the most important value by changing it to a numeric 9, then using the max summary statistic after separating out the columns of the value. Then reassigning the 9 back to a 1 when rebuilding the string. Though I would submit that the 36 character string is an awkward bulky and unnatural way to handle 36 values (or 3 or 15 or 125) values. I would set a coding scheme that makes sense in a more general term and likely deal with 36 variables.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Note use of custom informat to create numeric values from the character and the a custom format to allow simpler put of values as desired into a single string.&lt;/P&gt;
&lt;PRE&gt;data have;
input ID  CLASS STRING :$36.;
datalines;
10002 973 000000000000011111000000000000111110
10002 934 010122121110000000000000000000000000
10002 934 020222111100000000000000000000000000
;

proc format;
invalue stupidorder 
1=9
0=0
2=2;
value stupidorder
9='1'
0='0'
2='2'
;
run;

data working;
   set have;
   array ns (36) ;
   do i=1 to 36;
      ns[i]= input(substr(string,i,1),stupidorder.);
   end;
run;  


proc summary data=working nway;
   /* this ASSUMES all of the ID and Class value
      combinations are grouped in the data
      if that is not the case SORT the data first.
   */
   by id class notsorted;
   var ns: ;
   output out=summary (drop=_:) max=;
run;

data want;
   set summary;
   array ns (*) ns:;
   length string $ 36;
   do i=1 to dim(ns);
      string=cats(string,put(ns[i],stupidorder. -L));
   end;
   drop i ns: ;
run;

&lt;/PRE&gt;
&lt;P&gt;For future questions please provide in the form of a data step such as shown. Or at least paste stuff into a text box opened on the forum with the &amp;lt;/&amp;gt; icon above the main message window. Whatever created that "table" of values is truly obnoxious to work with and any larger sample of data I would have either skipped this question entirely or reduced it just a couple of values.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Paste code into a text box as well. That will preserve formatting, if you use such, and makes reading code much easier as well as separating code out from discussion text.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;An approach using RETAIN (or LAG functions) would likely only work for at most two rows of data per ID/Class combination. When you have rules like "any of the rows has 1" keeping track of having a 1 for multiple rows at the same time as trying ti keep the 0 or 2 straight gets to be pretty obnoxious.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Note: if 0 is a "no occurence" you might have used a missing, 0(not important) and 1 (important) coding scheme (and in 36 numeric variables) as there are quite a lot of things that can be told with numeric statistics from 0/1 coded variables.&lt;/P&gt;</description>
      <pubDate>Thu, 24 Aug 2023 23:44:30 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Collapsing-string-values-within-by-groups/m-p/890928#M352021</guid>
      <dc:creator>ballardw</dc:creator>
      <dc:date>2023-08-24T23:44:30Z</dc:date>
    </item>
    <item>
      <title>Re: Collapsing string values within by-groups</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Collapsing-string-values-within-by-groups/m-p/890931#M352023</link>
      <description>&lt;P&gt;And what if a position in the string&amp;nbsp; has a zero, a one, AND a two?&lt;/P&gt;</description>
      <pubDate>Fri, 25 Aug 2023 01:57:23 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Collapsing-string-values-within-by-groups/m-p/890931#M352023</guid>
      <dc:creator>mkeintz</dc:creator>
      <dc:date>2023-08-25T01:57:23Z</dc:date>
    </item>
    <item>
      <title>Re: Collapsing string values within by-groups</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Collapsing-string-values-within-by-groups/m-p/890932#M352024</link>
      <description>&lt;P&gt;I think I am correctly interpreting your rule by saying:&lt;/P&gt;
&lt;OL&gt;
&lt;LI&gt;If a character in a string ever contains a 1, the final character should be a 1&lt;/LI&gt;
&lt;LI&gt;Else if that character ever contains a 2, the final character should be a 2&lt;/LI&gt;
&lt;LI&gt;Otherwise it is a zero.&lt;/LI&gt;
&lt;/OL&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;If that is correct, and if your data are already grouped by ID/CLASS, then:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data have;
input ID  CLASS STRING :$36.;
datalines;
10002 973 000000000000011111000000000000111110
10002 934 010122121110000000000000000000000000
10002 934 020222111100000000000000000000000000
;

data want (drop=i);
  set have;
  by id class notsorted;
  array matrix{0:2,36} _temporary_;
  if first.class=1 then call missing(of matrix{*});

  /* If this is not a single obs, then update the 3-row matrix */
  if not (first.class=1 and last.class=1) then do i=1 to 36 ;
    matrix(input(char(string,i),1.),i)+1;
  end;

  if last.class;
  /* If this is not a single obs, then interpret the 3-row matrix*/
  if first.id=0 then do i=1 to length(string);
    if      matrix(1,i)&amp;gt;0 then substr(string,i,1)='1'; 
    else if matrix(2,i)&amp;gt;0 then substr(string,i,1)='2'; 
    else                       substr(string,i,1)='0';
  end;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Fri, 25 Aug 2023 01:57:48 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Collapsing-string-values-within-by-groups/m-p/890932#M352024</guid>
      <dc:creator>mkeintz</dc:creator>
      <dc:date>2023-08-25T01:57:48Z</dc:date>
    </item>
    <item>
      <title>Re: Collapsing string values within by-groups</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Collapsing-string-values-within-by-groups/m-p/890935#M352025</link>
      <description>&lt;P&gt;It is going to be much easier if you transpose that data into one observation per month.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data tall;
  set have ;
  do month=1 to length(string);
    char=char(string,month);
    if char ne '0' then output;
  end;
  drop string;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;If you are worried about performance make it a view instead of physical copy of the data.&lt;/P&gt;
&lt;P&gt;Then collapse based on your rules.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;proc sql ;
 create table collapsed as
 select id, class 
      , month
      , case when (max(char='1')) then '1'
             when (max(char='2')) then '2'
             else '0'
        end as char
 from tall
 group by id, class, month
 order by id, class, month
 ;
quit;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;If you need to you can rebuild the string.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data want;
  do until(last.class);
    set collapsed ;
    by id class ;
    length string $36 ;
    if first.class then string=repeat('0',35);
    substr(string,month,1)=char;
  end;
  drop month char;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;Results&lt;/P&gt;
&lt;PRE&gt;Obs     id      class                   string

 1     10002     934     010122111110000000000000000000000000
 2     10002     973     000000000000011111000000000000111110
&lt;/PRE&gt;</description>
      <pubDate>Fri, 25 Aug 2023 02:13:25 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Collapsing-string-values-within-by-groups/m-p/890935#M352025</guid>
      <dc:creator>Tom</dc:creator>
      <dc:date>2023-08-25T02:13:25Z</dc:date>
    </item>
    <item>
      <title>Re: Collapsing string values within by-groups</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Collapsing-string-values-within-by-groups/m-p/891003#M352060</link>
      <description>&lt;P&gt;Hello&amp;nbsp;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/338970"&gt;@mh04&lt;/a&gt;,&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;The &lt;A href="https://documentation.sas.com/doc/en/pgmsascdc/9.4_3.5/lefunctionsref/n08a8qvo1xuzd1n189xdm0nay8h5.htm" target="_blank" rel="noopener"&gt;BOR function&lt;/A&gt; lends itself to implement your collapsing rule. Sadly, it cannot deal with 72-bit strings, so some disassembling and reassembling is needed, which makes the code longer:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data want(drop=b: c i);
length c $72;
array b[0:2];
do until(last.class);
  set have;
  by id class notsorted;
  c=put(translate(string,'000110'x,'012'),$hex72.);
  do i=0 to 2;
    b[i]=bor(max(0,b[i]),input(substr(c,24*i+1),binary24.));
  end;
end;
c=put(b1,binary24.)||put(b2,binary24.)||put(b3,binary24.);
string=translate(input(c,$hex72.),'0121','00011011'x);
run;&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Fri, 25 Aug 2023 14:19:54 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Collapsing-string-values-within-by-groups/m-p/891003#M352060</guid>
      <dc:creator>FreelanceReinh</dc:creator>
      <dc:date>2023-08-25T14:19:54Z</dc:date>
    </item>
    <item>
      <title>Re: Collapsing string values within by-groups</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Collapsing-string-values-within-by-groups/m-p/891311#M352113</link>
      <description>Thank you and sorry about my formatting!</description>
      <pubDate>Mon, 28 Aug 2023 12:53:03 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Collapsing-string-values-within-by-groups/m-p/891311#M352113</guid>
      <dc:creator>mh04</dc:creator>
      <dc:date>2023-08-28T12:53:03Z</dc:date>
    </item>
    <item>
      <title>Re: Collapsing string values within by-groups</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Collapsing-string-values-within-by-groups/m-p/891312#M352114</link>
      <description>Yes, your interpretation is correct. Thank you!</description>
      <pubDate>Mon, 28 Aug 2023 12:53:59 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Collapsing-string-values-within-by-groups/m-p/891312#M352114</guid>
      <dc:creator>mh04</dc:creator>
      <dc:date>2023-08-28T12:53:59Z</dc:date>
    </item>
    <item>
      <title>Re: Collapsing string values within by-groups</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Collapsing-string-values-within-by-groups/m-p/891313#M352115</link>
      <description>Thank you!</description>
      <pubDate>Mon, 28 Aug 2023 12:54:12 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Collapsing-string-values-within-by-groups/m-p/891313#M352115</guid>
      <dc:creator>mh04</dc:creator>
      <dc:date>2023-08-28T12:54:12Z</dc:date>
    </item>
    <item>
      <title>Re: Collapsing string values within by-groups</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Collapsing-string-values-within-by-groups/m-p/891315#M352117</link>
      <description>Thank you! This is a bit over my head but I look forward to testing it and learning new things.</description>
      <pubDate>Mon, 28 Aug 2023 12:59:09 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Collapsing-string-values-within-by-groups/m-p/891315#M352117</guid>
      <dc:creator>mh04</dc:creator>
      <dc:date>2023-08-28T12:59:09Z</dc:date>
    </item>
  </channel>
</rss>

