<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Data cleaning based on rules: time period duration and time spans in between tied to certain dat in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/Data-cleaning-based-on-rules-time-period-duration-and-time-spans/m-p/467703#M119420</link>
    <description>&lt;P&gt;I think there may be some concern about the entire logic.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;For instance here:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;if sum(end1-beg1)&amp;lt;124 or sum(end2-beg2)&amp;lt;124 or sum(end3-beg3)&amp;lt;124 or sum(end4-beg4)&amp;lt;124 or sum(end5-beg5)&amp;lt;124
or sum(end6-beg6)&amp;lt;124 or sum(end7-beg7)&amp;lt;124 or sum(end8-beg8)&amp;lt;124 or sum(end9-beg9)&amp;lt;124 or sum(end10-beg10)&amp;lt;124
or sum(end11-beg11)&amp;lt;124 or sum(end12-beg12)&amp;lt;124 or sum(end13-beg13)&amp;lt;124 then remove=1;
if sum(end1-beg1)&amp;gt;=124 or sum(end2-beg2)&amp;gt;=124 or sum(end3-beg3)&amp;gt;=124 or sum(end4-beg4)&amp;gt;=124 or sum(end5-beg5)&amp;gt;=124
or sum(end6-beg6)&amp;gt;=124 or sum(end7-beg7)&amp;gt;=124 or sum(end8-beg8)&amp;gt;=124 or sum(end9-beg9)&amp;gt;=124 or sum(end10-beg10)&amp;gt;=124
or sum(end11-beg11)&amp;gt;=124 or sum(end12-beg12)&amp;gt;=124 or sum(end13-beg13)&amp;gt;=124 then remove=0;
if remove=0 then output; &lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;If sum(end1-beg1)&amp;lt;124&amp;nbsp; and sum(end11-beg11)&amp;gt;=124&amp;nbsp;are both true then what should remove actually be?&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;And what the heck is SUM doing in there at all? It is only adding characters. If the values are dates then If (end1-beg1)&amp;lt;124 is sufficient.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;It is also time to learn to use arrays.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Also you should be aware that every record where &lt;SPAN class="token function"&gt;sum&lt;/SPAN&gt;&lt;SPAN class="token punctuation"&gt;(&lt;/SPAN&gt;end1&lt;SPAN class="token operator"&gt;-&lt;/SPAN&gt;beg1&lt;SPAN class="token punctuation"&gt;)&lt;/SPAN&gt;&lt;SPAN class="token operator"&gt;&amp;lt;31 is also less than 124 and every record &lt;SPAN class="token function"&gt;sum&lt;/SPAN&gt;&lt;SPAN class="token punctuation"&gt;(&lt;/SPAN&gt;end1-beg1&lt;SPAN class="token punctuation"&gt;)&amp;gt;=124 is also &amp;gt;=31. So assignment to a single "remove" variable is very likely not behaving as intended. So as a minimum you may want to consider a 31 day variable and a 124 day variable separately.&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;SPAN class="token operator"&gt;&lt;SPAN class="token punctuation"&gt;A better description of what the inclusion exclusion rule is may help. I'm not going to look a the entire data example to try to figure out rules. &lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/P&gt;</description>
    <pubDate>Wed, 06 Jun 2018 22:01:20 GMT</pubDate>
    <dc:creator>ballardw</dc:creator>
    <dc:date>2018-06-06T22:01:20Z</dc:date>
    <item>
      <title>Data cleaning based on rules: time period duration and time spans in between tied to certain dates</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Data-cleaning-based-on-rules-time-period-duration-and-time-spans/m-p/467227#M119273</link>
      <description>&lt;P&gt;I'm working with insurance claim data where patients treatment information can be affected by insurance enrollments. Losing insurance but somehow managing to get it back is common, needless to say. If i use data as is, I will have problems such as: patients that appear no treatment received were in actuality happened due to loss of insurance thus not in my claim data. Many other rules can be thought of but I'd like to stick to below few major rules and clean the data accordingly. Please help write SAS code reflecting these rules to a final clean data where my patients treatment information could be sought with minimal influence of insurance status. Perfect doesn't exist. But would like to weed out those observations looking obviously unreliable. &lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/13711"&gt;@art297&lt;/a&gt; &lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/16961"&gt;@ChrisNZ&lt;/a&gt; &lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/462"&gt;@PGStats&lt;/a&gt; &lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/18408"&gt;@Ksharp&lt;/a&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Eligibility rules:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;- Patients must have insurance =&amp;gt; 4 months on a continuous basis where times in between enrollments are up to 30 days but not greater (John had 4 months of continuous insurance enrollment but lost it at the end of 4th month but got his insurance back after 29 days. And similar scenarios were repeated. I would keep John's observation in my data even if he lost his insurance (continued longer then 4 months many times) but always managed to get insurance back within less than a month every time he lose his insurance).&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;- Insurance start date must be tied to diagnosis date. Aka, patients must have insurance started within one month since date of diagnosis&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;- With the exception when:&lt;/P&gt;
&lt;P&gt;Above conditions is not met because insurance was ended due to death&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Data attached to this post.&lt;/P&gt;
&lt;P&gt;where variables are:&amp;nbsp;&lt;/P&gt;
&lt;P&gt;'begn:' date for beginning of insurance;&lt;/P&gt;
&lt;P&gt;'endn:' date for end of insurance&lt;/P&gt;
&lt;P&gt;alive: (1)-alive, (0)-death&lt;/P&gt;
&lt;P&gt;date of death and date of diagnosis are self explanatory.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;proc import datafile="...\insurance.csv" out=insurance
dbms=csv replace;
getnames=yes;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;My attempt which failed to tie the insurance dates to 'date of diagnosis' and 'date of death' and whether time in between two insurances greater or less than a month.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;/*whether patients insurance lasted only a month or less*/
data insurance1; set insurance; 
if sum(end1-beg1)&amp;lt;31 and sum(end2-beg2)&amp;lt;31 and sum(end3-beg3)&amp;lt;31 and sum(end4-beg4)&amp;lt;31 and sum(end5-beg5)&amp;lt;31
and sum(end6-beg6)&amp;lt;31 and sum(end7-beg7)&amp;lt;31 and sum(end8-beg8)&amp;lt;31 and sum(end9-beg9)&amp;lt;31 and sum(end10-beg10)&amp;lt;31
and sum(end11-beg11)&amp;lt;31 and sum(end12-beg12)&amp;lt;31 and sum(end13-beg13)&amp;lt;31 then remove=1;
if sum(end1-beg1)&amp;gt;=31 and sum(end2-beg2)&amp;gt;=31 and sum(end3-beg3)&amp;gt;=31 and sum(end4-beg4)&amp;gt;=31 and sum(end5-beg5)&amp;gt;=31
and sum(end6-beg6)&amp;gt;=31 and sum(end7-beg7)&amp;gt;=31 and sum(end8-beg8)&amp;gt;=31 and sum(end9-beg9)&amp;gt;=31 and sum(end10-beg10)&amp;gt;=31
and sum(end11-beg11)&amp;gt;=31 and sum(end12-beg12)&amp;gt;=31 and sum(end13-beg13)&amp;gt;=31 then remove=0;

/*whether patients insurance continued =&amp;gt; 4 months**/
if sum(end1-beg1)&amp;lt;124 or sum(end2-beg2)&amp;lt;124 or sum(end3-beg3)&amp;lt;124 or sum(end4-beg4)&amp;lt;124 or sum(end5-beg5)&amp;lt;124
or sum(end6-beg6)&amp;lt;124 or sum(end7-beg7)&amp;lt;124 or sum(end8-beg8)&amp;lt;124 or sum(end9-beg9)&amp;lt;124 or sum(end10-beg10)&amp;lt;124
or sum(end11-beg11)&amp;lt;124 or sum(end12-beg12)&amp;lt;124 or sum(end13-beg13)&amp;lt;124 then remove=1;
if sum(end1-beg1)&amp;gt;=124 or sum(end2-beg2)&amp;gt;=124 or sum(end3-beg3)&amp;gt;=124 or sum(end4-beg4)&amp;gt;=124 or sum(end5-beg5)&amp;gt;=124
or sum(end6-beg6)&amp;gt;=124 or sum(end7-beg7)&amp;gt;=124 or sum(end8-beg8)&amp;gt;=124 or sum(end9-beg9)&amp;gt;=124 or sum(end10-beg10)&amp;gt;=124
or sum(end11-beg11)&amp;gt;=124 or sum(end12-beg12)&amp;gt;=124 or sum(end13-beg13)&amp;gt;=124 then remove=0;
if remove=0 then output; &lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Mon, 04 Jun 2018 14:58:05 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Data-cleaning-based-on-rules-time-period-duration-and-time-spans/m-p/467227#M119273</guid>
      <dc:creator>Cruise</dc:creator>
      <dc:date>2018-06-04T14:58:05Z</dc:date>
    </item>
    <item>
      <title>Re: Data cleaning based on rules: time period duration and time spans in between tied to certain dat</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Data-cleaning-based-on-rules-time-period-duration-and-time-spans/m-p/467558#M119353</link>
      <description>What do you want to happen   if sum(end1-beg1)&amp;lt;31  and sum(end2-beg2)&amp;gt;=31   ?

You don't test for this scenario at the moment.</description>
      <pubDate>Tue, 05 Jun 2018 02:36:23 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Data-cleaning-based-on-rules-time-period-duration-and-time-spans/m-p/467558#M119353</guid>
      <dc:creator>ChrisNZ</dc:creator>
      <dc:date>2018-06-05T02:36:23Z</dc:date>
    </item>
    <item>
      <title>Re: Data cleaning based on rules: time period duration and time spans in between tied to certain dat</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Data-cleaning-based-on-rules-time-period-duration-and-time-spans/m-p/467662#M119404</link>
      <description>Good catch indeed. I will keep the case but will check whether time period between insurance is not longer than a month. Can you please help me have my rules reflected in the code? I'll appreciate that so much.&lt;BR /&gt;</description>
      <pubDate>Tue, 05 Jun 2018 13:55:15 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Data-cleaning-based-on-rules-time-period-duration-and-time-spans/m-p/467662#M119404</guid>
      <dc:creator>Cruise</dc:creator>
      <dc:date>2018-06-05T13:55:15Z</dc:date>
    </item>
    <item>
      <title>Re: Data cleaning based on rules: time period duration and time spans in between tied to certain dat</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Data-cleaning-based-on-rules-time-period-duration-and-time-spans/m-p/467703#M119420</link>
      <description>&lt;P&gt;I think there may be some concern about the entire logic.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;For instance here:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;if sum(end1-beg1)&amp;lt;124 or sum(end2-beg2)&amp;lt;124 or sum(end3-beg3)&amp;lt;124 or sum(end4-beg4)&amp;lt;124 or sum(end5-beg5)&amp;lt;124
or sum(end6-beg6)&amp;lt;124 or sum(end7-beg7)&amp;lt;124 or sum(end8-beg8)&amp;lt;124 or sum(end9-beg9)&amp;lt;124 or sum(end10-beg10)&amp;lt;124
or sum(end11-beg11)&amp;lt;124 or sum(end12-beg12)&amp;lt;124 or sum(end13-beg13)&amp;lt;124 then remove=1;
if sum(end1-beg1)&amp;gt;=124 or sum(end2-beg2)&amp;gt;=124 or sum(end3-beg3)&amp;gt;=124 or sum(end4-beg4)&amp;gt;=124 or sum(end5-beg5)&amp;gt;=124
or sum(end6-beg6)&amp;gt;=124 or sum(end7-beg7)&amp;gt;=124 or sum(end8-beg8)&amp;gt;=124 or sum(end9-beg9)&amp;gt;=124 or sum(end10-beg10)&amp;gt;=124
or sum(end11-beg11)&amp;gt;=124 or sum(end12-beg12)&amp;gt;=124 or sum(end13-beg13)&amp;gt;=124 then remove=0;
if remove=0 then output; &lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;If sum(end1-beg1)&amp;lt;124&amp;nbsp; and sum(end11-beg11)&amp;gt;=124&amp;nbsp;are both true then what should remove actually be?&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;And what the heck is SUM doing in there at all? It is only adding characters. If the values are dates then If (end1-beg1)&amp;lt;124 is sufficient.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;It is also time to learn to use arrays.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Also you should be aware that every record where &lt;SPAN class="token function"&gt;sum&lt;/SPAN&gt;&lt;SPAN class="token punctuation"&gt;(&lt;/SPAN&gt;end1&lt;SPAN class="token operator"&gt;-&lt;/SPAN&gt;beg1&lt;SPAN class="token punctuation"&gt;)&lt;/SPAN&gt;&lt;SPAN class="token operator"&gt;&amp;lt;31 is also less than 124 and every record &lt;SPAN class="token function"&gt;sum&lt;/SPAN&gt;&lt;SPAN class="token punctuation"&gt;(&lt;/SPAN&gt;end1-beg1&lt;SPAN class="token punctuation"&gt;)&amp;gt;=124 is also &amp;gt;=31. So assignment to a single "remove" variable is very likely not behaving as intended. So as a minimum you may want to consider a 31 day variable and a 124 day variable separately.&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;SPAN class="token operator"&gt;&lt;SPAN class="token punctuation"&gt;A better description of what the inclusion exclusion rule is may help. I'm not going to look a the entire data example to try to figure out rules. &lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Wed, 06 Jun 2018 22:01:20 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Data-cleaning-based-on-rules-time-period-duration-and-time-spans/m-p/467703#M119420</guid>
      <dc:creator>ballardw</dc:creator>
      <dc:date>2018-06-06T22:01:20Z</dc:date>
    </item>
  </channel>
</rss>

