<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Compile records based on rules in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/Compile-records-based-on-rules/m-p/841460#M332736</link>
    <description>&lt;P&gt;So first get rid of the observations with all missing codes.&amp;nbsp; That is easier with a data step.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data not_missing;
  set have;
  if n(of Type_CD:);
run;
&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;Then eliminate those with conflicting STATE codes.&amp;nbsp; That is easier in PROC SQL.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;proc sql;
create table no_dups as
select * from not_missing
group by dummyid
having count(distinct state)=1
;
quit;
&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;Then it looks like you just want to do an UPDATE to collapse the non-missing codes.&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data want;
  update no_dups(obs=0) no_dups;
  by dummyid ;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;&lt;SPAN&gt;Note that the last non-missing value will be retained by this method.&amp;nbsp; So if the order matters you should have an ordering variable to make sure the de-duped data records are processed in the right order.&amp;nbsp; Perhaps there is DATE or other variable you can use to order?&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Implementing the any value of 1 overrides the other values rule will be harder.&amp;nbsp; You could make a set of observations that just have the DUMMYID and a 1 for the particular TYPE code variable and then use that to update the previous results.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data any_one;
  set no_dups(rename=(TYPE_CD1-TYPE_CD6=x1-x6));
  array type_cd [6];
  array x[6];
  do index=1 to 6;
    if x[index]=1 then do;
      type_cd[index]=1;
      output;
      type_cd[index]=.;
    end;
  end;
  keep dummyid type_cd: ;
run;

data want;
  update want any_one;
  by dummyid;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
    <pubDate>Sat, 29 Oct 2022 16:29:59 GMT</pubDate>
    <dc:creator>Tom</dc:creator>
    <dc:date>2022-10-29T16:29:59Z</dc:date>
    <item>
      <title>Compile records based on rules</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Compile-records-based-on-rules/m-p/841412#M332706</link>
      <description>&lt;P&gt;Hi all,&lt;/P&gt;
&lt;P&gt;I wanted to compile records into the target data set "want" based on some simple rules for the data I have "have" as below. The simple rules include&lt;/P&gt;
&lt;P&gt;(1) When multiple records per person have the same state info and the complementary Type_CDs, compile them into one record, for example, person1, person2, person 3, and person4.&lt;/P&gt;
&lt;P&gt;(2)&amp;nbsp;When multiple records per person have the different state info and their Type_CDs are different, the records cannot be compiled, for example, person5 and person6. However, if one record has no Type_CDs available at all, then the record with any available Type_CDs should be picked, and Person8 is this case.&lt;/P&gt;
&lt;P&gt;(3) Any other Type_CD is subset of Type_CD&amp;nbsp; 1. Because of this, any record with Type_CD=1 can overwrite other records per person if the records have the same state info. for example, Person7.&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data have;
infile datalines truncover dsd;
input DummyID $ State $ Type_CD1-Type_CD6
;
datalines;
Person1,NJ,,,,15,1,1
Person1,NJ,15,15,15,,,
Person2,NY,7,7,7,7,7,
Person2,NY,,,,,,7
Person3,NY,,,,,,7
Person3,NY,,,,7,7,
Person4,NY,,,,7,7,7
Person4,NY,7,7,7,,,
Person5,NY,7,7,7,7,7,7
Person5,CA,,,,,,4
Person6,NY,7,7,7,7,7,
Person6,NJ,,,,,,15
Person7,AR,1,1,1,1,1,1
Person7,AR,,,7,7,7,7
Person8,NY,,,,1,1,1
Person8,NJ,,,,,,
;
run;

data want;
infile datalines truncover dsd;
input DummyID $ State $ Type_CD1-Type_CD6
;
datalines;
Person1,NJ,15,15,15,15,1,1
Person2,NY,7,7,7,7,7,7
Person3,NY,,,,7,7,7
Person4,NY,7,7,7,7,7,7
Person7,AR,1,1,1,1,1,1
Person8,NY,,,,1,1,1
;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Any suggestions are greatly appreciated.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Thank you!&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Sat, 29 Oct 2022 15:36:30 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Compile-records-based-on-rules/m-p/841412#M332706</guid>
      <dc:creator>lichee</dc:creator>
      <dc:date>2022-10-29T15:36:30Z</dc:date>
    </item>
    <item>
      <title>Re: Compile records based on rules</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Compile-records-based-on-rules/m-p/841460#M332736</link>
      <description>&lt;P&gt;So first get rid of the observations with all missing codes.&amp;nbsp; That is easier with a data step.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data not_missing;
  set have;
  if n(of Type_CD:);
run;
&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;Then eliminate those with conflicting STATE codes.&amp;nbsp; That is easier in PROC SQL.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;proc sql;
create table no_dups as
select * from not_missing
group by dummyid
having count(distinct state)=1
;
quit;
&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;Then it looks like you just want to do an UPDATE to collapse the non-missing codes.&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data want;
  update no_dups(obs=0) no_dups;
  by dummyid ;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;&lt;SPAN&gt;Note that the last non-missing value will be retained by this method.&amp;nbsp; So if the order matters you should have an ordering variable to make sure the de-duped data records are processed in the right order.&amp;nbsp; Perhaps there is DATE or other variable you can use to order?&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Implementing the any value of 1 overrides the other values rule will be harder.&amp;nbsp; You could make a set of observations that just have the DUMMYID and a 1 for the particular TYPE code variable and then use that to update the previous results.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data any_one;
  set no_dups(rename=(TYPE_CD1-TYPE_CD6=x1-x6));
  array type_cd [6];
  array x[6];
  do index=1 to 6;
    if x[index]=1 then do;
      type_cd[index]=1;
      output;
      type_cd[index]=.;
    end;
  end;
  keep dummyid type_cd: ;
run;

data want;
  update want any_one;
  by dummyid;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Sat, 29 Oct 2022 16:29:59 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Compile-records-based-on-rules/m-p/841460#M332736</guid>
      <dc:creator>Tom</dc:creator>
      <dc:date>2022-10-29T16:29:59Z</dc:date>
    </item>
    <item>
      <title>Re: Compile records based on rules</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Compile-records-based-on-rules/m-p/841475#M332739</link>
      <description>&lt;P&gt;Here is a method using PROC SUMMARY that works for your example.&amp;nbsp; But I worry it might not do what you want with more complex data.&amp;nbsp; In particular it might not do the duplicate state issue right if there are more than two observations per ID.&amp;nbsp; Also if the any of the type codes are zero then using the minimum to test if the code is 1 will not work.&amp;nbsp; And finally (again) it is not clear what criteria you want to use to pick between conflicting types codes other than 1.&amp;nbsp; This method takes the MAX of the other codes.&amp;nbsp;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;proc summary nway 
  data=have(where=(n(type_cd1,type_cd2,type_cd3,type_cd4,type_cd5,type_cd6))
;
  class dummyid;
  var type: ;
  output out=summary idgroup(out[2] (state)=) min=min1-min6 max=max1-max6 ;
run;

data want;
  set summary;

* Eliminate duplicate states ;
  if state_1=state_2 or state_2=' ';

* Copy the state ;
  state=state_1;

* Select the calcualted TYPE code ;
* Select 1 if present otherwise use the maximum ;
  array low min1-min6;
  array high max1-max6;
  array type Type_CD1-Type_CD6;
  do index=1 to 6;
    if low[index]=1 then type[index]=1;
    else type[index]=high[index];
  end;
  keep dummyid state type:;
run;
&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Sat, 29 Oct 2022 17:03:06 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Compile-records-based-on-rules/m-p/841475#M332739</guid>
      <dc:creator>Tom</dc:creator>
      <dc:date>2022-10-29T17:03:06Z</dc:date>
    </item>
  </channel>
</rss>

