<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Efficient data cleaning and recoding in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/Efficient-data-cleaning-and-recoding/m-p/360559#M84917</link>
    <description>&lt;P&gt;I have the following 7 variables in my dataset:&lt;/P&gt;&lt;P&gt;type_1 - type_7&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;The observations for each one of these is supposed to be either "checked" or "unchecked"in the datset. The first thing I want to do is check that there is only 1 observation for the 7 variables that is "checked" and the rest are "unchecked." What is the most efficient way to do this?&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;The second thing I want to do is recode the 7 variables into 1 variable where the value is the label for variable that is "checked."&amp;nbsp; What is the most efficient way to do this?&lt;/P&gt;</description>
    <pubDate>Mon, 22 May 2017 21:15:42 GMT</pubDate>
    <dc:creator>Melk</dc:creator>
    <dc:date>2017-05-22T21:15:42Z</dc:date>
    <item>
      <title>Efficient data cleaning and recoding</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Efficient-data-cleaning-and-recoding/m-p/360559#M84917</link>
      <description>&lt;P&gt;I have the following 7 variables in my dataset:&lt;/P&gt;&lt;P&gt;type_1 - type_7&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;The observations for each one of these is supposed to be either "checked" or "unchecked"in the datset. The first thing I want to do is check that there is only 1 observation for the 7 variables that is "checked" and the rest are "unchecked." What is the most efficient way to do this?&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;The second thing I want to do is recode the 7 variables into 1 variable where the value is the label for variable that is "checked."&amp;nbsp; What is the most efficient way to do this?&lt;/P&gt;</description>
      <pubDate>Mon, 22 May 2017 21:15:42 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Efficient-data-cleaning-and-recoding/m-p/360559#M84917</guid>
      <dc:creator>Melk</dc:creator>
      <dc:date>2017-05-22T21:15:42Z</dc:date>
    </item>
    <item>
      <title>Re: Efficient data cleaning and recoding</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Efficient-data-cleaning-and-recoding/m-p/360560#M84918</link>
      <description>&lt;P&gt;Please help us help you.&amp;nbsp; How about showing what the data looks like before your process.&amp;nbsp; And what you want it to look like afterwards.&lt;/P&gt;</description>
      <pubDate>Mon, 22 May 2017 21:19:42 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Efficient-data-cleaning-and-recoding/m-p/360560#M84918</guid>
      <dc:creator>mkeintz</dc:creator>
      <dc:date>2017-05-22T21:19:42Z</dc:date>
    </item>
    <item>
      <title>Re: Efficient data cleaning and recoding</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Efficient-data-cleaning-and-recoding/m-p/360561#M84919</link>
      <description>&lt;P&gt;Are the current values "checked" and 'unchecked" or, if not, are the variables character or numeric, and what values represent checked and unchecked. Also, what do you want to do if more than one is checked?&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Art, CEO, AnalystFinder.com&lt;/P&gt;</description>
      <pubDate>Mon, 22 May 2017 21:19:46 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Efficient-data-cleaning-and-recoding/m-p/360561#M84919</guid>
      <dc:creator>art297</dc:creator>
      <dc:date>2017-05-22T21:19:46Z</dc:date>
    </item>
    <item>
      <title>Re: Efficient data cleaning and recoding</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Efficient-data-cleaning-and-recoding/m-p/360564#M84921</link>
      <description>&lt;P&gt;Yes sorry! Here is a look at the data:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;value type_1 0='Unchecked' 1='Checked';&lt;/P&gt;&lt;P&gt;label type_1 'Type of surgery is xxx';&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;EM&gt;same for type_2 - type_7 but different surgeries xxx.&lt;/EM&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;type_1&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; type_2&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; type_3 &amp;nbsp;&amp;nbsp;&amp;nbsp; ....&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; type_7&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;unchecked&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp; checked&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; unchecked&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; unchecked&lt;/P&gt;&lt;P&gt;unchecked&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;unchecked &amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; unchecked&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp; &amp;nbsp;&amp;nbsp; checked&lt;/P&gt;&lt;P&gt;&lt;EM&gt;...&lt;/EM&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;Would like it to look like:&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;type&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;xxx (from type_2)&lt;/P&gt;&lt;P&gt;xxx (from type_7)&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;EM&gt;if more than 1 is checked, I just want to be able to identify that and check the discrepancy with the investigator. &lt;/EM&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thanks for your help!&lt;/P&gt;</description>
      <pubDate>Mon, 22 May 2017 21:33:01 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Efficient-data-cleaning-and-recoding/m-p/360564#M84921</guid>
      <dc:creator>Melk</dc:creator>
      <dc:date>2017-05-22T21:33:01Z</dc:date>
    </item>
    <item>
      <title>Re: Efficient data cleaning and recoding</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Efficient-data-cleaning-and-recoding/m-p/360568#M84922</link>
      <description>&lt;PRE&gt;data have;
  informat type_1 - type_7 $9.;
  label type_1='type of surgery is aaa';
  label type_2='type of surgery is bbb';
  label type_3='type of surgery is ccc';
  label type_4='type of surgery is ddd';
  label type_5='type of surgery is eee';
  label type_6='type of surgery is fff';
  label type_7='type of surgery is ggg';
  input type_1 - type_7;
  cards;
checked  unchecked  unchecked  unchecked  unchecked  unchecked  unchecked
unchecked  unchecked  unchecked  checked  unchecked  unchecked  unchecked
checked  unchecked  checked  unchecked  unchecked  unchecked  unchecked
;

data want multiples;
  set have;
  array vars(*) $ type_1-type_7;
  if count(catt(of vars(*)), 'un') lt 6 then output multiples;
  else do;
    type=substr(vlabel(vars(whichc('checked',of vars(*)))),20);
    output want;
  end;
run;
&lt;/PRE&gt;
&lt;P&gt;Art, CEO, AnalystFinder.com&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Mon, 22 May 2017 21:49:05 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Efficient-data-cleaning-and-recoding/m-p/360568#M84922</guid>
      <dc:creator>art297</dc:creator>
      <dc:date>2017-05-22T21:49:05Z</dc:date>
    </item>
    <item>
      <title>Re: Efficient data cleaning and recoding</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Efficient-data-cleaning-and-recoding/m-p/360575#M84924</link>
      <description>&lt;P&gt;OK. I hadn't correctly read you're specs. How about:&lt;/P&gt;
&lt;PRE&gt;data have;
  informat type_1 - type_7 $9.;
  label type_1='type of surgery is aaa';
  label type_2='type of surgery is bbb';
  label type_3='type of surgery is ccc';
  label type_4='type of surgery is ddd';
  label type_5='type of surgery is eee';
  label type_6='type of surgery is fff';
  label type_7='type of surgery is ggg';
  input type_1 - type_7;
  cards;
1 0 0 0 0 0 0
0 0 0 0 0 1 1
0 0 1 0 0 0 0
0 0 0 0 0 0 0
;

data want multiples missing;
  set have;
  array vars(*) type_1-type_7;
  recnum=_n_;
  if sum(of vars(*)) eq 0 then output missing;
  else if sum(of vars(*)) gt 1 then output multiples;
  else do;
    type=substr(vlabel(vars(whichn(1,of vars(*)))),20);
    output want;
  end;
run;
&lt;/PRE&gt;
&lt;P&gt;Art, CEO, AnalystFinder.com&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Mon, 22 May 2017 22:10:14 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Efficient-data-cleaning-and-recoding/m-p/360575#M84924</guid>
      <dc:creator>art297</dc:creator>
      <dc:date>2017-05-22T22:10:14Z</dc:date>
    </item>
    <item>
      <title>Re: Efficient data cleaning and recoding</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Efficient-data-cleaning-and-recoding/m-p/360576#M84925</link>
      <description>&lt;P&gt;Minor difference on &lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/13711"&gt;@art297&lt;/a&gt;'s approach basically to turn the checked/unchecked into numerics AND report as read that there may be a problem. Then clean it before&lt;/P&gt;
&lt;PRE&gt;proc format library=work;
invalue Checked
'checked'  = 1
'unchecked'= 0
other=.;
run;

data have;
  informat type_1 - type_7 Checked.;
  label type_1='type of surgery is aaa';
  label type_2='type of surgery is bbb';
  label type_3='type of surgery is ccc';
  label type_4='type of surgery is ddd';
  label type_5='type of surgery is eee';
  label type_6='type of surgery is fff';
  label type_7='type of surgery is ggg';
  input type_1 - type_7;
  TypeSum = sum(of Type_:);
  if typesum ne 1 then put "Problem with number checked for Type in record: " _n_;
 cards;
checked  unchecked  unchecked  unchecked  unchecked  unchecked  unchecked
unchecked  unchecked  unchecked  checked  unchecked  unchecked  unchecked
checked  unchecked  checked  unchecked  unchecked  unchecked  unchecked
;
run;&lt;/PRE&gt;</description>
      <pubDate>Mon, 22 May 2017 22:11:53 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Efficient-data-cleaning-and-recoding/m-p/360576#M84925</guid>
      <dc:creator>ballardw</dc:creator>
      <dc:date>2017-05-22T22:11:53Z</dc:date>
    </item>
    <item>
      <title>Re: Efficient data cleaning and recoding</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Efficient-data-cleaning-and-recoding/m-p/360578#M84926</link>
      <description>&lt;P&gt;BTW: Interestinglly, the exact same code works if the variables are numeric 1s and 0s, rather than character "1"s and "0"s:&lt;/P&gt;
&lt;PRE&gt;data have;
  label type_1='type of surgery is aaa';
  label type_2='type of surgery is bbb';
  label type_3='type of surgery is ccc';
  label type_4='type of surgery is ddd';
  label type_5='type of surgery is eee';
  label type_6='type of surgery is fff';
  label type_7='type of surgery is ggg';
  input type_1 - type_7;
  cards;
1 0 0 0 0 0 0
0 0 0 0 0 1 1
0 0 1 0 0 0 0
0 0 0 0 0 0 0
;

data want multiples missing;
  set have;
  array vars(*) type_1-type_7;
  recnum=_n_;
  if sum(of vars(*)) eq 0 then output missing;
  else if sum(of vars(*)) gt 1 then output multiples;
  else do;
    type=substr(vlabel(vars(whichn(1,of vars(*)))),20);
    output want;
  end;
run;
&lt;/PRE&gt;
&lt;P&gt;Art, CEO, AnalystFinder.com&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Mon, 22 May 2017 22:27:27 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Efficient-data-cleaning-and-recoding/m-p/360578#M84926</guid>
      <dc:creator>art297</dc:creator>
      <dc:date>2017-05-22T22:27:27Z</dc:date>
    </item>
    <item>
      <title>Re: Efficient data cleaning and recoding</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Efficient-data-cleaning-and-recoding/m-p/360581#M84927</link>
      <description>&lt;P&gt;Thank you! Worked beautifully.&lt;/P&gt;</description>
      <pubDate>Mon, 22 May 2017 23:02:58 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Efficient-data-cleaning-and-recoding/m-p/360581#M84927</guid>
      <dc:creator>Melk</dc:creator>
      <dc:date>2017-05-22T23:02:58Z</dc:date>
    </item>
  </channel>
</rss>

