<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Filling In Values After Merging Two Datasets in SAS Data Science</title>
    <link>https://communities.sas.com/t5/SAS-Data-Science/Filling-In-Values-After-Merging-Two-Datasets/m-p/640615#M8232</link>
    <description>&lt;P&gt;What type of data source do you have: is it an Excel worksheet or a SAS data set?&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Please post a sample of your data with variable names, just a few lines and a few months.&lt;/P&gt;</description>
    <pubDate>Fri, 17 Apr 2020 06:39:23 GMT</pubDate>
    <dc:creator>Shmuel</dc:creator>
    <dc:date>2020-04-17T06:39:23Z</dc:date>
    <item>
      <title>Filling In Values After Merging Two Datasets</title>
      <link>https://communities.sas.com/t5/SAS-Data-Science/Filling-In-Values-After-Merging-Two-Datasets/m-p/640614#M8231</link>
      <description>&lt;P&gt;Dear All,&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I am working on a large confidential healthcare data set and am trying to estimate costs from it. The data set contains cancer patients with survival information and their monthly costs. To estimate the cancer-related cost, I will subtract each patient's own prediagnosis monthly costs from their postdiagnosis monthly costs. I have postdiagnosis costs for each patient until death (so the number of months covered after diagnosis differs between patients), but I have 12 months of prediagnosis costs for every patient. I am planning to do the subtraction using the following logic:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Let us think about only one patient:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;1. If the patient survives 12 months or fewer, I will subtract the cost in each prediagnosis month from the cost in the matching postdiagnosis month:&lt;/P&gt;&lt;P&gt;cancer cost(i) = postdiagnosis cost(i) - prediagnosis cost(i), i = 1,...,S, where S = number of months survived.&lt;/P&gt;&lt;P&gt;2. If the patient survives more than 12 months, I will subtract the cost in each prediagnosis month from the cost in each postdiagnosis month according to mod 12:&lt;/P&gt;&lt;P&gt;cancer cost(i) = postdiagnosis cost(i) - prediagnosis cost(j), i = 1,...,S, j = mod(i, 12), where S = number of months survived.&lt;/P&gt;&lt;P&gt;Suppose the patient lives 26 months:&lt;/P&gt;&lt;P&gt;For the cancer cost in the 15th month I will have:&lt;/P&gt;&lt;P&gt;cancer cost(15) = postdiagnosis cost(15) - prediagnosis cost(3)&lt;/P&gt;&lt;P&gt;For the cancer cost in the 25th month I will have:&lt;/P&gt;&lt;P&gt;cancer cost(25) = postdiagnosis cost(25) - prediagnosis cost(1)&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I merged the two data sets (prediagnosis and postdiagnosis cost), but I get missing values because I have only 12 months of prediagnosis cost while the postdiagnosis cost can cover more than 12 months.
I want to fill in the missing prediagnosis costs by repeating the 12 available values according to the survival month mod 12.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;SPAN style="font-family: inherit;"&gt;Can you help me write a macro or code for this problem?&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;Thank you.&lt;/P&gt;</description>
      <pubDate>Fri, 17 Apr 2020 06:21:56 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Science/Filling-In-Values-After-Merging-Two-Datasets/m-p/640614#M8231</guid>
      <dc:creator>Kemal_G</dc:creator>
      <dc:date>2020-04-17T06:21:56Z</dc:date>
    </item>
    <item>
      <title>Re: Filling In Values After Merging Two Datasets</title>
      <link>https://communities.sas.com/t5/SAS-Data-Science/Filling-In-Values-After-Merging-Two-Datasets/m-p/640615#M8232</link>
      <description>&lt;P&gt;What type of data source do you have: is it an Excel worksheet or a SAS data set?&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Please post a sample of your data with variable names, just a few lines and a few months.&lt;/P&gt;</description>
      <pubDate>Fri, 17 Apr 2020 06:39:23 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Science/Filling-In-Values-After-Merging-Two-Datasets/m-p/640615#M8232</guid>
      <dc:creator>Shmuel</dc:creator>
      <dc:date>2020-04-17T06:39:23Z</dc:date>
    </item>
    <item>
      <title>Re: Filling In Values After Merging Two Datasets</title>
      <link>https://communities.sas.com/t5/SAS-Data-Science/Filling-In-Values-After-Merging-Two-Datasets/m-p/640780#M8236</link>
      <description>&lt;P&gt;This may give you a place to start. I am using only 4 pre/post values so the logic is easy to see.&lt;/P&gt;
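&lt;PRE&gt;data example;
   input Pre_1 - Pre_4 Post_1-Post_4;
   /* arrays over the pre and post variables, plus new cost_1-cost_4 */
   array pre Pre_: ;
   array post Post_:;
   array cost_(4);
   /* loop from 1 to the count of non-missing post values */
   do i = 1 to ( n(of post(*)) );
      cost_[i] = post[i] - pre[i];
   end;
datalines;
 10 20 30 40 20 35 46 70
 10 20 30 40 40 20 . .
 ;
 run;&lt;/PRE&gt;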
&lt;P&gt;The call N(of post(*)) returns the number of non-missing elements in the Post array. The assumption is that the post months are sequential, from month 1 to n with no gaps.&lt;/P&gt;
&lt;P&gt;You can use basically any expression that returns a positive integer within the array bounds as a simple array index.&lt;/P&gt;
&lt;P&gt;So post[Mod(i,4)] would use the loop counter mod 4 as the index. Problem: mod(4,4) is 0, which is not a valid index, so you would need to check the result before actually using it with the array.&lt;/P&gt;
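&lt;P&gt;As a rough sketch of one way to avoid the zero index (an untested suggestion on my part, not something run against your data): shift the counter down by one before the MOD call and back up by one after it. The expression mod(i-1, 4) + 1 gives 1, 2, 3, 4, 1, 2, ... and never 0. The data set name example_wrap, the helper variable j and the six post values are made up for illustration.&lt;/P&gt;
&lt;PRE&gt;data example_wrap;
   input Pre_1-Pre_4 Post_1-Post_6;
   array pre Pre_: ;
   array post Post_: ;
   array cost_(6);
   do i = 1 to n(of post(*));
      /* wrap the pre index: 1,2,3,4,1,2,... instead of 1,2,3,0,1,2,... */
      j = mod(i - 1, dim(pre)) + 1;
      cost_[i] = post[i] - pre[j];
   end;
   drop i j;
datalines;
10 20 30 40 20 35 46 70 15 25
10 20 30 40 40 20 . . . .
;
run;&lt;/PRE&gt;
&lt;P&gt;For the first data line, months 5 and 6 reuse Pre_1 and Pre_2, so cost_5 = 15 - 10 and cost_6 = 25 - 20. With 12 pre months the index would be mod(i-1, 12) + 1, which equals i for the first 12 months, so the same statement should cover both of the cases in the question.&lt;/P&gt;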
&lt;P&gt;I use [] for array indexes so that they are easy to tell apart from the parentheses of function calls. Array references can use either () or [].&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Because your rule changes at the 12-month boundary, you would probably test n(of post(*)) first and branch accordingly.&lt;/P&gt;</description>
      <pubDate>Fri, 17 Apr 2020 16:57:57 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Science/Filling-In-Values-After-Merging-Two-Datasets/m-p/640780#M8236</guid>
      <dc:creator>ballardw</dc:creator>
      <dc:date>2020-04-17T16:57:57Z</dc:date>
    </item>
  </channel>
</rss>

