<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Advice on data cleaning in SAS Data Management</title>
    <link>https://communities.sas.com/t5/SAS-Data-Management/Advice-on-data-cleaning/m-p/307634#M8799</link>
    <description>&lt;P&gt;If I thought keeping track was absolutely necessary I would ensure that my original data has a unique identifier for each record.&lt;/P&gt;
&lt;P&gt;Then after I had checked/cleaned data I would have a data set of the identifiers checked.&lt;/P&gt;
&lt;P&gt;The "next time" I cleaned the data I would subset the data to those records whose identifiers were not in the data set of those already checked. Then update the identifier set with the newly checked records. Repeat as needed.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;But there are a number of other issues involved I don't go into without getting paid...&lt;/P&gt;</description>
    <pubDate>Thu, 27 Oct 2016 13:50:13 GMT</pubDate>
    <dc:creator>ballardw</dc:creator>
    <dc:date>2016-10-27T13:50:13Z</dc:date>
    <item>
      <title>Advice on data cleaning</title>
      <link>https://communities.sas.com/t5/SAS-Data-Management/Advice-on-data-cleaning/m-p/307596#M8797</link>
      <description>&lt;P&gt;Hi everyone.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Could I please ask your advice on how to write a SAS program to clean my survey data so that it can be rerun as a routine data check, because data collection is still ongoing? That is, any answer that has already been checked and confirmed as logical, plausible, or truly unavailable should not show up again the next time I run the data cleaning program.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thank you very much.&lt;/P&gt;</description>
      <pubDate>Thu, 27 Oct 2016 09:22:14 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Management/Advice-on-data-cleaning/m-p/307596#M8797</guid>
      <dc:creator>Miracle</dc:creator>
      <dc:date>2016-10-27T09:22:14Z</dc:date>
    </item>
    <item>
      <title>Re: Advice on data cleaning</title>
      <link>https://communities.sas.com/t5/SAS-Data-Management/Advice-on-data-cleaning/m-p/307598#M8798</link>
      <description>&lt;P&gt;This isn't really a Q&amp;amp;A scenario. If I were asked to do this, I would probably look at something like this:&lt;/P&gt;
&lt;P&gt;Say you have data:&lt;/P&gt;
&lt;P&gt;SUBJ &amp;nbsp; Q1 &amp;nbsp; Q2 &amp;nbsp; Q3 &amp;nbsp; Q4...&lt;/P&gt;
&lt;P&gt;Now, you need to keep a record for each observation, and the above structure isn't good for that. So step one is to create a normalised dataset:&lt;/P&gt;
&lt;P&gt;SUBJ &amp;nbsp; &amp;nbsp;QNUM &amp;nbsp; &amp;nbsp;RESULT&lt;/P&gt;
&lt;P&gt;... &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; 1 &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;...&lt;/P&gt;
&lt;P&gt;... &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; 2 &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;...&lt;/P&gt;
&lt;P&gt;...&lt;/P&gt;
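&lt;P&gt;One way to get from the wide layout to the normalised one is PROC TRANSPOSE. The following is only a minimal sketch: the dataset names (survey_wide, survey_long), the sample values, and the assumption that the question columns are numeric and named q1 to q4 are all made up for illustration.&lt;/P&gt;

```sas
/* Hypothetical wide survey data: one row per subject, one column per question */
data survey_wide;
  input subj $ q1 q2 q3 q4;
  datalines;
S001 1 2 . 4
S002 3 . 5 1
;
run;

/* BY-group processing in PROC TRANSPOSE needs the data sorted by subject */
proc sort data=survey_wide;
  by subj;
run;

/* One output row per subject and question; _NAME_ holds the source column name */
proc transpose data=survey_wide out=survey_long(rename=(col1=result));
  by subj;
  var q1-q4;
run;

/* Derive a numeric question number from the column name (q1 gives 1, and so on) */
data survey_long;
  set survey_long;
  qnum = input(substr(_name_, 2), 8.);
  drop _name_;
run;
```

&lt;P&gt;Missing answers are kept as rows with a missing RESULT, which is what you want if "missing" is one of the things being checked.&lt;/P&gt;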
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Why does this change matter so much? Because it lets you simply add additional data to each observation. Say you want a flag for locked, a date for last checked, and a coded term for any outstanding query:&lt;/P&gt;
&lt;P&gt;SUBJ &amp;nbsp; &amp;nbsp;QNUM &amp;nbsp; &amp;nbsp;RESULT &amp;nbsp; LOCKED &amp;nbsp; LAST_DATE &amp;nbsp; TERM&lt;/P&gt;
&lt;P&gt;... &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; 1 &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;... &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;N &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; 12DEC2015 &amp;nbsp; &amp;nbsp;Result_Missing&lt;/P&gt;
&lt;P&gt;... &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; 2 &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;... &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Y &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;14JAN2016&lt;/P&gt;
&lt;P&gt;...&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;The main thing will be knowing when to update things. Say you have cleaned a data item and consider it locked: what happens if the next data transfer comes in and that item has changed...&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Personally, I would run your suite of checks on the whole data at each timepoint, and just compare the output to a list of outstanding items. Pretty simple, but manual.&lt;/P&gt;
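&lt;P&gt;As a rough sketch of that comparison, assuming the long layout above: run a check on everything, then merge the findings against a log of items already reviewed. The names survey_long, review_log, findings, outstanding and result_checked, the sample rows, and the single "missing result" check are all hypothetical. The rule that a changed value reopens a locked item is one possible policy, not the only one.&lt;/P&gt;

```sas
/* Hypothetical long-format data, one row per subject and question */
data survey_long;
  input subj $ qnum result;
  datalines;
S001 1 1
S001 2 .
S002 1 3
S002 2 .
;
run;

/* Hypothetical log of reviewed items, with the value seen at review time */
data review_log;
  input subj $ qnum result_checked locked $ last_date :date9.;
  format last_date date9.;
  datalines;
S001 2 . Y 14JAN2016
;
run;

/* Example check run on the whole data: flag missing results */
data findings;
  set survey_long;
  length term $20;
  if missing(result) then do;
    term = 'Result_Missing';
    output;
  end;
run;

proc sort data=findings;   by subj qnum; run;
proc sort data=review_log; by subj qnum; run;

/* Keep findings that were never reviewed, are not locked,
   or whose value differs from the value seen at review time */
data outstanding;
  merge findings(in=in_find) review_log(in=in_log);
  by subj qnum;
  if in_find and ((not in_log) or (locked ne 'Y') or (result ne result_checked));
run;
```

&lt;P&gt;Storing the value seen at review time is what lets a later transfer reopen an item that was locked but has since changed.&lt;/P&gt;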
&lt;P&gt;...&lt;/P&gt;</description>
      <pubDate>Thu, 27 Oct 2016 09:29:47 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Management/Advice-on-data-cleaning/m-p/307598#M8798</guid>
      <dc:creator>RW9</dc:creator>
      <dc:date>2016-10-27T09:29:47Z</dc:date>
    </item>
    <item>
      <title>Re: Advice on data cleaning</title>
      <link>https://communities.sas.com/t5/SAS-Data-Management/Advice-on-data-cleaning/m-p/307634#M8799</link>
      <description>&lt;P&gt;If I thought keeping track was absolutely necessary I would ensure that my original data has a unique identifier for each record.&lt;/P&gt;
&lt;P&gt;Then after I had checked/cleaned data I would have a data set of the identifiers checked.&lt;/P&gt;
&lt;P&gt;The "next time" I cleaned the data I would subset the data to those records whose identifiers were not in the data set of those already checked. Then update the identifier set with the newly checked records. Repeat as needed.&lt;/P&gt;
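&lt;P&gt;A minimal sketch of that loop, assuming a numeric identifier called rec_id. The dataset names (survey, checked_ids, to_review) and the sample rows are made up for illustration, and in practice checked_ids would live in a permanent library so it survives between runs.&lt;/P&gt;

```sas
/* Hypothetical survey data with a unique record identifier */
data survey;
  input rec_id q1 q2;
  datalines;
101 1 2
102 . 4
103 3 .
;
run;

/* Hypothetical set of identifiers checked in earlier rounds */
data checked_ids;
  input rec_id;
  datalines;
101
;
run;

/* Records still to review: identifiers not yet in the checked set */
proc sql;
  create table to_review as
  select *
  from survey
  where rec_id not in (select rec_id from checked_ids);
quit;

/* ... review and clean to_review here ... */

/* Afterwards, add the newly checked identifiers to the checked set */
proc append base=checked_ids data=to_review(keep=rec_id);
run;
```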
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;But there are a number of other issues involved I don't go into without getting paid...&lt;/P&gt;</description>
      <pubDate>Thu, 27 Oct 2016 13:50:13 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Management/Advice-on-data-cleaning/m-p/307634#M8799</guid>
      <dc:creator>ballardw</dc:creator>
      <dc:date>2016-10-27T13:50:13Z</dc:date>
    </item>
  </channel>
</rss>

