<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Calculate weighted average across rows with conditions in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/Calculate-weighted-average-across-rows-with-conditions/m-p/512841#M138154</link>
    <description>Good catch, I think the proc sql needs to have group by: Id and code. Because I was trying to deduplicate the data first by Id and code then estimate the weighted average per patient.&lt;BR /&gt;After dedup do: Min-diff*min-wgt and sum this for each patient and divide by sum of min-wgt for each patient.&lt;BR /&gt;Does that make sense?&lt;BR /&gt;Any alternative approach to estimate the error associated with the picks of the first high evidence diagnoses regardless of dod-diff? Please see the image for the best and worst scenario where dod-diff varies depending on when high evidence diagnosis happened in relation to dod-diff.&lt;BR /&gt;</description>
    <pubDate>Wed, 14 Nov 2018 05:33:42 GMT</pubDate>
    <dc:creator>Cruise</dc:creator>
    <dc:date>2018-11-14T05:33:42Z</dc:date>
    <item>
      <title>Calculate weighted average across rows with conditions</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Calculate-weighted-average-across-rows-with-conditions/m-p/512796#M138139</link>
      <description>&lt;P&gt;I have two different datasets which represents patients who have been diagnosed with colon or rectum cancer (colorectal cancer).&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;The first data (attached to this post) has following four variables (date of diagnosis known thus interval variable DOD_DIFF was calculated):&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;OL&gt;
&lt;LI&gt;Date: ‘date of visit’&lt;/LI&gt;
&lt;LI&gt;DOD_DIFF: the difference between the ‘date of visit’ and the date of diagnosis confirmed, ranges 1 thru 30&lt;/LI&gt;
&lt;LI&gt;CODE: 12 distinct diagnosis code&lt;/LI&gt;
&lt;LI&gt;WGT_DX: arbitrary weight to indicate how close the diagnosis is to a true diagnosis, a colorectal cancer. For example, 153 takes value 1 because it is an exact code for a colon cancer. 154 takes value 1 because it is an exact code for a rectum cancer. 211 takes value 6 because it’s a code for a benign cancer thus a weaker evidence of a cancer.&lt;/LI&gt;
&lt;/OL&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;The second data:&lt;/P&gt;
&lt;P&gt;Same variables but DOD_DIFF is not known due to missing in date of diagnosis. &amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;My goal is to answer a question: if you pick the first diagnosis associated with the highest weight ‘WGT_DX’ how much error will you produce based on DOD_DIFF variable?&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Therefore, I want to estimate this error using the first data with known information by picking the first diagnosis associated with the highest weight ‘WGT_DX’, before applying this approach to the data with unknown date of diagnosis.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;To serve this purpose, I’d like to calculate weighted average score for each patient and assess the distribution of WEIGHTED_AVERAGE_SCORE for the entire population afterwards.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;The logic used to calculate a weighted average error for each patient is following:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;MIN(DOD_DIFF) AS MIN_DIFF&lt;/P&gt;
&lt;P&gt;MIN(WGT_DX) AS MIN_WGT,&lt;/P&gt;
&lt;P&gt;MIN_DIFF*MIN_WGT AS DIFF_WEIGHT,&lt;/P&gt;
&lt;P&gt;SUM(DIFF_WEIGHT)/SUM(MIN_WGT) as WEIGHTED_AVERAGE_SCORE PER PATIENT&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I really hope this makes sense. Please let me know if not!&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;The SAS codes used is below which led me to an error:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;DATA HAVE(KEEP=WGT_DX ID CODE DOD_DIFF SER_DATE_DX); SET HAVE; /*N_ROWS=142,598, N_DISTINCT_PATIENTS=6,375*/
IF CODE IN ('153') THEN WGT_DX=1; ELSE    
IF CODE IN ('154') THEN WGT_DX=1; ELSE   
IF CODE IN ('159') THEN WGT_DX=2; ELSE  
IF CODE IN ('197') THEN WGT_DX=3; ELSE  
IF CODE IN ('199') THEN WGT_DX=3; ELSE  
IF CODE IN ('209') THEN WGT_DX=5; ELSE  
IF CODE IN ('211') THEN WGT_DX=6; ELSE  
IF CODE IN ('230') THEN WGT_DX=2; ELSE  
IF CODE IN ('235') THEN WGT_DX=3; ELSE  
IF CODE IN ('239') THEN WGT_DX=3; ELSE  
IF CODE IN ('V10') THEN WGT_DX=4; ELSE  
IF CODE IN ('V76') THEN WGT_DX=4; 
RUN;

PROC SORT DATA=HAVE;
BY ID CODE SER_DATE_DX;
FORMAT SER_DATE_DX DATE9.; 
RUN; 

PROC SQL;
create table ERROR_FIRST_DIAGNOSIS as
select CODE, ID, MIN(DOD_DIFF) AS MIN_DIFF,
                 MIN(WGT_DX) AS MIN_WGT,
				 MIN_DIFF*MIN_WGT AS DIFF_WEIGHT,
   SUM(DIFF_WEIGHT)/SUM(MIN_WGT) as WEIGHTED_AVER
from HAVE
GROUP BY ID;
QUIT; 

PROC UNIVARIATE DATA=ERROR_FIRST_DIAGNOSIS;
VAR WEIGHTED_AVER;
HISTOGRAM;
RUN;  
&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;817&amp;nbsp; PROC SQL;&lt;/P&gt;
&lt;P&gt;818&amp;nbsp; create table P.ERROR_FIRST_DIAGNOSIS as&lt;/P&gt;
&lt;P&gt;819&amp;nbsp; select CODE, ID, MIN(DOD_DIFF) AS MIN_DIFF,&lt;/P&gt;
&lt;P&gt;820&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; MIN(WGT_DX) AS MIN_WGT,&lt;/P&gt;
&lt;P&gt;821&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; MIN_DIFF*MIN_WGT AS DIFF_WEIGHT,&lt;/P&gt;
&lt;P&gt;822&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; SUM(DIFF_WEIGHT)/SUM(MIN_WGT) as WEIGHTED_AVER&lt;/P&gt;
&lt;P&gt;823&amp;nbsp; from P.HAVE&lt;/P&gt;
&lt;P&gt;824&amp;nbsp; GROUP BY ID;&lt;/P&gt;
&lt;P&gt;ERROR: The SUM summary function requires a numeric argument.&lt;/P&gt;
&lt;P&gt;ERROR: The SUM summary function requires a numeric argument.&lt;/P&gt;
&lt;P&gt;ERROR: The following columns were not found in the contributing tables: DIFF_WEIGHT, MIN_DIFF,&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; MIN_WGT.&lt;/P&gt;
&lt;P&gt;825&amp;nbsp; QUIT;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="picking first diagnosis blindly can cause serious error or none. but the range or error need to be estimated using the date with known information" style="width: 600px;"&gt;&lt;img src="https://communities.sas.com/t5/image/serverpage/image-id/24876i164B369593597AA2/image-size/large?v=v2&amp;amp;px=999" role="button" title="BEST_WORST SCENARIO.png" alt="picking first diagnosis blindly can cause serious error or none. but the range or error need to be estimated using the date with known information" /&gt;&lt;span class="lia-inline-image-caption" onclick="event.preventDefault();"&gt;picking first diagnosis blindly can cause serious error or none. but the range or error need to be estimated using the date with known information&lt;/span&gt;&lt;/span&gt;&lt;/P&gt;</description>
      <pubDate>Wed, 14 Nov 2018 01:05:59 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Calculate-weighted-average-across-rows-with-conditions/m-p/512796#M138139</guid>
      <dc:creator>Cruise</dc:creator>
      <dc:date>2018-11-14T01:05:59Z</dc:date>
    </item>
    <item>
      <title>Re: Calculate weighted average across rows with conditions</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Calculate-weighted-average-across-rows-with-conditions/m-p/512798#M138140</link>
      <description>&lt;P&gt;This post was inspired by my previous post that you guys had kindly responded.&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/138205"&gt;@novinosrin&lt;/a&gt;&amp;nbsp;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/13879"&gt;@Reeza&lt;/a&gt;&amp;nbsp;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/31461"&gt;@mkeintz&lt;/a&gt;&lt;/P&gt;</description>
      <pubDate>Wed, 14 Nov 2018 01:02:03 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Calculate-weighted-average-across-rows-with-conditions/m-p/512798#M138140</guid>
      <dc:creator>Cruise</dc:creator>
      <dc:date>2018-11-14T01:02:03Z</dc:date>
    </item>
    <item>
      <title>Re: Calculate weighted average across rows with conditions</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Calculate-weighted-average-across-rows-with-conditions/m-p/512802#M138141</link>
      <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/132289"&gt;@Cruise&lt;/a&gt;&amp;nbsp; Good evening, I am afraid I am about to run to catch red line CTA to get back home(Chicago). I will certainly look into it tomorrow soon as I get back to my lab, but I am pretty sure somebody else will respond tonight. Have a good one!&lt;/P&gt;</description>
      <pubDate>Wed, 14 Nov 2018 01:15:30 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Calculate-weighted-average-across-rows-with-conditions/m-p/512802#M138141</guid>
      <dc:creator>novinosrin</dc:creator>
      <dc:date>2018-11-14T01:15:30Z</dc:date>
    </item>
    <item>
      <title>Re: Calculate weighted average across rows with conditions</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Calculate-weighted-average-across-rows-with-conditions/m-p/512833#M138148</link>
      <description>&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;BLOCKQUOTE&gt;&lt;HR /&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/132289"&gt;@Cruise&lt;/a&gt;&amp;nbsp;wrote:&lt;BR /&gt;
&lt;P&gt;To serve this purpose, I’d like to calculate weighted average score for each patient and assess the distribution of WEIGHTED_AVERAGE_SCORE for the entire population afterwards.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;The logic used to calculate a weighted average error for each patient is following:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;MIN(DOD_DIFF) AS MIN_DIFF&lt;/P&gt;
&lt;P&gt;MIN(WGT_DX) AS MIN_WGT,&lt;/P&gt;
&lt;P&gt;MIN_DIFF*MIN_WGT AS DIFF_WEIGHT,&lt;/P&gt;
&lt;P&gt;SUM(DIFF_WEIGHT)/SUM(MIN_WGT) as WEIGHTED_AVERAGE_SCORE PER PATIENT&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I really hope this makes sense. Please let me know if not!&lt;/P&gt;
&lt;HR /&gt;&lt;/BLOCKQUOTE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I'm a little confused.&amp;nbsp; Is this correct?&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&amp;nbsp; MIN(DOD_DIFF) AS MIN_DIFF&amp;nbsp;&amp;nbsp; is a constant within a given ID&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&amp;nbsp; MIN(WGT_DX) AS MIN_WGT&amp;nbsp;&amp;nbsp; is also a constant within a given ID&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&amp;nbsp; MIN_DIFF*MIN_WGT AS DIFF_WEIGHT&amp;nbsp;&amp;nbsp; is therefore also a constant within a given ID&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;So wouldn't&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; SUM(DIFF_WEIGHT)/SUM(MIN_WGT)&amp;nbsp;be same ratio as DIFF_WEIGHT/MIN_WGT?&lt;BR /&gt;which by definition would end up being&lt;BR /&gt;&amp;nbsp;&amp;nbsp; (MIN_DIFF*MIN_WGT)/MIN_WGT&amp;nbsp;&amp;nbsp;&amp;nbsp;= MIN_DIFF?&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Wed, 14 Nov 2018 04:37:17 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Calculate-weighted-average-across-rows-with-conditions/m-p/512833#M138148</guid>
      <dc:creator>mkeintz</dc:creator>
      <dc:date>2018-11-14T04:37:17Z</dc:date>
    </item>
    <item>
      <title>Re: Calculate weighted average across rows with conditions</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Calculate-weighted-average-across-rows-with-conditions/m-p/512841#M138154</link>
      <description>Good catch, I think the proc sql needs to have group by: Id and code. Because I was trying to deduplicate the data first by Id and code then estimate the weighted average per patient.&lt;BR /&gt;After dedup do: Min-diff*min-wgt and sum this for each patient and divide by sum of min-wgt for each patient.&lt;BR /&gt;Does that make sense?&lt;BR /&gt;Any alternative approach to estimate the error associated with the picks of the first high evidence diagnoses regardless of dod-diff? Please see the image for the best and worst scenario where dod-diff varies depending on when high evidence diagnosis happened in relation to dod-diff.&lt;BR /&gt;</description>
      <pubDate>Wed, 14 Nov 2018 05:33:42 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Calculate-weighted-average-across-rows-with-conditions/m-p/512841#M138154</guid>
      <dc:creator>Cruise</dc:creator>
      <dc:date>2018-11-14T05:33:42Z</dc:date>
    </item>
    <item>
      <title>Re: Calculate weighted average across rows with conditions</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Calculate-weighted-average-across-rows-with-conditions/m-p/512978#M138198</link>
      <description>&lt;P&gt;Good morning&amp;nbsp;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/132289"&gt;@Cruise&lt;/a&gt;&amp;nbsp; Not familiar with the subject, so let's take the piecemeal approach and you will have to lead us&lt;/P&gt;
&lt;P&gt;I amended&amp;nbsp; Proc sql section in your code and tested for ID=1(just for one ID)&lt;/P&gt;
&lt;P&gt;&amp;nbsp;Substitute group by id,code;&amp;nbsp; for&amp;nbsp;group by id; if you want to group by both making it composite&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;(where=(id=1))&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;PROC SQL;
create table ERROR_FIRST_DIAGNOSIS as
select *, MIN(DOD_DIFF) AS MIN_DIFF,
                 MIN(WGT_DX) AS MIN_WGT,
				calculated MIN_DIFF* calculated MIN_WGT AS DIFF_WEIGHT
   /*SUM(DIFF_WEIGHT)/SUM(MIN_WGT) as WEIGHTED_AVER*/
from HAVE(where=(id=1))
GROUP BY ID;
QUIT; &lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;Can you confirm the below one is what you intended&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;SAS Output&lt;/P&gt;
&lt;DIV class="branch"&gt;
&lt;TABLE class="systitleandfootercontainer" border="0" summary="Page Layout" width="100%" frame="void" rules="none" cellspacing="1" cellpadding="1"&gt;
&lt;TBODY&gt;
&lt;TR&gt;
&lt;TD class="c systemtitle"&gt;The SAS System&lt;/TD&gt;
&lt;/TR&gt;
&lt;/TBODY&gt;
&lt;/TABLE&gt;
&lt;BR /&gt;
&lt;DIV&gt;
&lt;DIV align="center"&gt;
&lt;TABLE class="table" summary="Procedure Print: Data Set WORK.ERROR_FIRST_DIAGNOSIS" frame="box" rules="all" cellspacing="0" cellpadding="5"&gt;&lt;COLGROUP&gt; &lt;COL /&gt; &lt;COL /&gt; &lt;COL /&gt; &lt;COL /&gt; &lt;COL /&gt; &lt;COL /&gt; &lt;COL /&gt; &lt;COL /&gt;&lt;/COLGROUP&gt;
&lt;THEAD&gt;
&lt;TR&gt;
&lt;TH class="r header" scope="col"&gt;SER_DATE_DX&lt;/TH&gt;
&lt;TH class="r header" scope="col"&gt;DOD_DIFF&lt;/TH&gt;
&lt;TH class="l header" scope="col"&gt;CODE&lt;/TH&gt;
&lt;TH class="r header" scope="col"&gt;ID&lt;/TH&gt;
&lt;TH class="r header" scope="col"&gt;WGT_DX&lt;/TH&gt;
&lt;TH class="r header" scope="col"&gt;MIN_DIFF&lt;/TH&gt;
&lt;TH class="r header" scope="col"&gt;MIN_WGT&lt;/TH&gt;
&lt;TH class="r header" scope="col"&gt;DIFF_WEIGHT&lt;/TH&gt;
&lt;/TR&gt;
&lt;/THEAD&gt;
&lt;TBODY&gt;
&lt;TR&gt;
&lt;TD class="r data"&gt;17006&lt;/TD&gt;
&lt;TD class="r data"&gt;1&lt;/TD&gt;
&lt;TD class="l data"&gt;197&lt;/TD&gt;
&lt;TD class="r data"&gt;1&lt;/TD&gt;
&lt;TD class="r data"&gt;3&lt;/TD&gt;
&lt;TD class="r data"&gt;1&lt;/TD&gt;
&lt;TD class="r data"&gt;1&lt;/TD&gt;
&lt;TD class="r data"&gt;1&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TD class="r data"&gt;17028&lt;/TD&gt;
&lt;TD class="r data"&gt;21&lt;/TD&gt;
&lt;TD class="l data"&gt;197&lt;/TD&gt;
&lt;TD class="r data"&gt;1&lt;/TD&gt;
&lt;TD class="r data"&gt;3&lt;/TD&gt;
&lt;TD class="r data"&gt;1&lt;/TD&gt;
&lt;TD class="r data"&gt;1&lt;/TD&gt;
&lt;TD class="r data"&gt;1&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TD class="r data"&gt;17032&lt;/TD&gt;
&lt;TD class="r data"&gt;25&lt;/TD&gt;
&lt;TD class="l data"&gt;153&lt;/TD&gt;
&lt;TD class="r data"&gt;1&lt;/TD&gt;
&lt;TD class="r data"&gt;1&lt;/TD&gt;
&lt;TD class="r data"&gt;1&lt;/TD&gt;
&lt;TD class="r data"&gt;1&lt;/TD&gt;
&lt;TD class="r data"&gt;1&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TD class="r data"&gt;17031&lt;/TD&gt;
&lt;TD class="r data"&gt;24&lt;/TD&gt;
&lt;TD class="l data"&gt;153&lt;/TD&gt;
&lt;TD class="r data"&gt;1&lt;/TD&gt;
&lt;TD class="r data"&gt;1&lt;/TD&gt;
&lt;TD class="r data"&gt;1&lt;/TD&gt;
&lt;TD class="r data"&gt;1&lt;/TD&gt;
&lt;TD class="r data"&gt;1&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TD class="r data"&gt;17028&lt;/TD&gt;
&lt;TD class="r data"&gt;21&lt;/TD&gt;
&lt;TD class="l data"&gt;153&lt;/TD&gt;
&lt;TD class="r data"&gt;1&lt;/TD&gt;
&lt;TD class="r data"&gt;1&lt;/TD&gt;
&lt;TD class="r data"&gt;1&lt;/TD&gt;
&lt;TD class="r data"&gt;1&lt;/TD&gt;
&lt;TD class="r data"&gt;1&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TD class="r data"&gt;17006&lt;/TD&gt;
&lt;TD class="r data"&gt;1&lt;/TD&gt;
&lt;TD class="l data"&gt;153&lt;/TD&gt;
&lt;TD class="r data"&gt;1&lt;/TD&gt;
&lt;TD class="r data"&gt;1&lt;/TD&gt;
&lt;TD class="r data"&gt;1&lt;/TD&gt;
&lt;TD class="r data"&gt;1&lt;/TD&gt;
&lt;TD class="r data"&gt;1&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TD class="r data"&gt;17013&lt;/TD&gt;
&lt;TD class="r data"&gt;6&lt;/TD&gt;
&lt;TD class="l data"&gt;197&lt;/TD&gt;
&lt;TD class="r data"&gt;1&lt;/TD&gt;
&lt;TD class="r data"&gt;3&lt;/TD&gt;
&lt;TD class="r data"&gt;1&lt;/TD&gt;
&lt;TD class="r data"&gt;1&lt;/TD&gt;
&lt;TD class="r data"&gt;1&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TD class="r data"&gt;17013&lt;/TD&gt;
&lt;TD class="r data"&gt;6&lt;/TD&gt;
&lt;TD class="l data"&gt;153&lt;/TD&gt;
&lt;TD class="r data"&gt;1&lt;/TD&gt;
&lt;TD class="r data"&gt;1&lt;/TD&gt;
&lt;TD class="r data"&gt;1&lt;/TD&gt;
&lt;TD class="r data"&gt;1&lt;/TD&gt;
&lt;TD class="r data"&gt;1&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TD class="r data"&gt;17006&lt;/TD&gt;
&lt;TD class="r data"&gt;1&lt;/TD&gt;
&lt;TD class="l data"&gt;197&lt;/TD&gt;
&lt;TD class="r data"&gt;1&lt;/TD&gt;
&lt;TD class="r data"&gt;3&lt;/TD&gt;
&lt;TD class="r data"&gt;1&lt;/TD&gt;
&lt;TD class="r data"&gt;1&lt;/TD&gt;
&lt;TD class="r data"&gt;1&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TD class="r data"&gt;17028&lt;/TD&gt;
&lt;TD class="r data"&gt;21&lt;/TD&gt;
&lt;TD class="l data"&gt;197&lt;/TD&gt;
&lt;TD class="r data"&gt;1&lt;/TD&gt;
&lt;TD class="r data"&gt;3&lt;/TD&gt;
&lt;TD class="r data"&gt;1&lt;/TD&gt;
&lt;TD class="r data"&gt;1&lt;/TD&gt;
&lt;TD class="r data"&gt;1&lt;/TD&gt;
&lt;/TR&gt;
&lt;/TBODY&gt;
&lt;/TABLE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;/DIV&gt;</description>
      <pubDate>Wed, 14 Nov 2018 16:24:53 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Calculate-weighted-average-across-rows-with-conditions/m-p/512978#M138198</guid>
      <dc:creator>novinosrin</dc:creator>
      <dc:date>2018-11-14T16:24:53Z</dc:date>
    </item>
    <item>
      <title>Re: Calculate weighted average across rows with conditions</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Calculate-weighted-average-across-rows-with-conditions/m-p/512979#M138199</link>
      <description>&lt;P&gt;Next one:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;/*WEIGHTED_AVER*/
PROC SQL;
create table ERROR_FIRST_DIAGNOSIS1 as
select *,SUM(DIFF_WEIGHT)/SUM(MIN_WGT) as WEIGHTED_AVER
from
(select *, MIN(DOD_DIFF) AS MIN_DIFF,
                 MIN(WGT_DX) AS MIN_WGT,
				calculated MIN_DIFF* calculated MIN_WGT AS DIFF_WEIGHT
from HAVE(where=(id=1))
GROUP BY ID)
group by id;
QUIT; &lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;SAS Output&lt;/P&gt;
&lt;DIV class="branch"&gt;
&lt;TABLE class="systitleandfootercontainer" border="0" summary="Page Layout" width="100%" frame="void" rules="none" cellspacing="1" cellpadding="1"&gt;
&lt;TBODY&gt;
&lt;TR&gt;
&lt;TD class="c systemtitle"&gt;The SAS System&lt;/TD&gt;
&lt;/TR&gt;
&lt;/TBODY&gt;
&lt;/TABLE&gt;
&lt;BR /&gt;
&lt;DIV&gt;
&lt;DIV align="center"&gt;
&lt;TABLE class="table" summary="Procedure Print: Data Set WORK.ERROR_FIRST_DIAGNOSIS1" frame="box" rules="all" cellspacing="0" cellpadding="5"&gt;&lt;COLGROUP&gt; &lt;COL /&gt; &lt;COL /&gt; &lt;COL /&gt; &lt;COL /&gt; &lt;COL /&gt; &lt;COL /&gt; &lt;COL /&gt; &lt;COL /&gt; &lt;COL /&gt;&lt;/COLGROUP&gt;
&lt;THEAD&gt;
&lt;TR&gt;
&lt;TH class="r header" scope="col"&gt;SER_DATE_DX&lt;/TH&gt;
&lt;TH class="r header" scope="col"&gt;DOD_DIFF&lt;/TH&gt;
&lt;TH class="l header" scope="col"&gt;CODE&lt;/TH&gt;
&lt;TH class="r header" scope="col"&gt;ID&lt;/TH&gt;
&lt;TH class="r header" scope="col"&gt;WGT_DX&lt;/TH&gt;
&lt;TH class="r header" scope="col"&gt;MIN_DIFF&lt;/TH&gt;
&lt;TH class="r header" scope="col"&gt;MIN_WGT&lt;/TH&gt;
&lt;TH class="r header" scope="col"&gt;DIFF_WEIGHT&lt;/TH&gt;
&lt;TH class="r header" scope="col"&gt;WEIGHTED_AVER&lt;/TH&gt;
&lt;/TR&gt;
&lt;/THEAD&gt;
&lt;TBODY&gt;
&lt;TR&gt;
&lt;TD class="r data"&gt;17006&lt;/TD&gt;
&lt;TD class="r data"&gt;1&lt;/TD&gt;
&lt;TD class="l data"&gt;197&lt;/TD&gt;
&lt;TD class="r data"&gt;1&lt;/TD&gt;
&lt;TD class="r data"&gt;3&lt;/TD&gt;
&lt;TD class="r data"&gt;1&lt;/TD&gt;
&lt;TD class="r data"&gt;1&lt;/TD&gt;
&lt;TD class="r data"&gt;1&lt;/TD&gt;
&lt;TD class="r data"&gt;1&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TD class="r data"&gt;17028&lt;/TD&gt;
&lt;TD class="r data"&gt;21&lt;/TD&gt;
&lt;TD class="l data"&gt;197&lt;/TD&gt;
&lt;TD class="r data"&gt;1&lt;/TD&gt;
&lt;TD class="r data"&gt;3&lt;/TD&gt;
&lt;TD class="r data"&gt;1&lt;/TD&gt;
&lt;TD class="r data"&gt;1&lt;/TD&gt;
&lt;TD class="r data"&gt;1&lt;/TD&gt;
&lt;TD class="r data"&gt;1&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TD class="r data"&gt;17032&lt;/TD&gt;
&lt;TD class="r data"&gt;25&lt;/TD&gt;
&lt;TD class="l data"&gt;153&lt;/TD&gt;
&lt;TD class="r data"&gt;1&lt;/TD&gt;
&lt;TD class="r data"&gt;1&lt;/TD&gt;
&lt;TD class="r data"&gt;1&lt;/TD&gt;
&lt;TD class="r data"&gt;1&lt;/TD&gt;
&lt;TD class="r data"&gt;1&lt;/TD&gt;
&lt;TD class="r data"&gt;1&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TD class="r data"&gt;17031&lt;/TD&gt;
&lt;TD class="r data"&gt;24&lt;/TD&gt;
&lt;TD class="l data"&gt;153&lt;/TD&gt;
&lt;TD class="r data"&gt;1&lt;/TD&gt;
&lt;TD class="r data"&gt;1&lt;/TD&gt;
&lt;TD class="r data"&gt;1&lt;/TD&gt;
&lt;TD class="r data"&gt;1&lt;/TD&gt;
&lt;TD class="r data"&gt;1&lt;/TD&gt;
&lt;TD class="r data"&gt;1&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TD class="r data"&gt;17028&lt;/TD&gt;
&lt;TD class="r data"&gt;21&lt;/TD&gt;
&lt;TD class="l data"&gt;153&lt;/TD&gt;
&lt;TD class="r data"&gt;1&lt;/TD&gt;
&lt;TD class="r data"&gt;1&lt;/TD&gt;
&lt;TD class="r data"&gt;1&lt;/TD&gt;
&lt;TD class="r data"&gt;1&lt;/TD&gt;
&lt;TD class="r data"&gt;1&lt;/TD&gt;
&lt;TD class="r data"&gt;1&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TD class="r data"&gt;17006&lt;/TD&gt;
&lt;TD class="r data"&gt;1&lt;/TD&gt;
&lt;TD class="l data"&gt;153&lt;/TD&gt;
&lt;TD class="r data"&gt;1&lt;/TD&gt;
&lt;TD class="r data"&gt;1&lt;/TD&gt;
&lt;TD class="r data"&gt;1&lt;/TD&gt;
&lt;TD class="r data"&gt;1&lt;/TD&gt;
&lt;TD class="r data"&gt;1&lt;/TD&gt;
&lt;TD class="r data"&gt;1&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TD class="r data"&gt;17013&lt;/TD&gt;
&lt;TD class="r data"&gt;6&lt;/TD&gt;
&lt;TD class="l data"&gt;197&lt;/TD&gt;
&lt;TD class="r data"&gt;1&lt;/TD&gt;
&lt;TD class="r data"&gt;3&lt;/TD&gt;
&lt;TD class="r data"&gt;1&lt;/TD&gt;
&lt;TD class="r data"&gt;1&lt;/TD&gt;
&lt;TD class="r data"&gt;1&lt;/TD&gt;
&lt;TD class="r data"&gt;1&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TD class="r data"&gt;17013&lt;/TD&gt;
&lt;TD class="r data"&gt;6&lt;/TD&gt;
&lt;TD class="l data"&gt;153&lt;/TD&gt;
&lt;TD class="r data"&gt;1&lt;/TD&gt;
&lt;TD class="r data"&gt;1&lt;/TD&gt;
&lt;TD class="r data"&gt;1&lt;/TD&gt;
&lt;TD class="r data"&gt;1&lt;/TD&gt;
&lt;TD class="r data"&gt;1&lt;/TD&gt;
&lt;TD class="r data"&gt;1&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TD class="r data"&gt;17006&lt;/TD&gt;
&lt;TD class="r data"&gt;1&lt;/TD&gt;
&lt;TD class="l data"&gt;197&lt;/TD&gt;
&lt;TD class="r data"&gt;1&lt;/TD&gt;
&lt;TD class="r data"&gt;3&lt;/TD&gt;
&lt;TD class="r data"&gt;1&lt;/TD&gt;
&lt;TD class="r data"&gt;1&lt;/TD&gt;
&lt;TD class="r data"&gt;1&lt;/TD&gt;
&lt;TD class="r data"&gt;1&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TD class="r data"&gt;17028&lt;/TD&gt;
&lt;TD class="r data"&gt;21&lt;/TD&gt;
&lt;TD class="l data"&gt;197&lt;/TD&gt;
&lt;TD class="r data"&gt;1&lt;/TD&gt;
&lt;TD class="r data"&gt;3&lt;/TD&gt;
&lt;TD class="r data"&gt;1&lt;/TD&gt;
&lt;TD class="r data"&gt;1&lt;/TD&gt;
&lt;TD class="r data"&gt;1&lt;/TD&gt;
&lt;TD class="r data"&gt;1&lt;/TD&gt;
&lt;/TR&gt;
&lt;/TBODY&gt;
&lt;/TABLE&gt;
&lt;P&gt;All 1s being 10/10=1&lt;/P&gt;
&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;/DIV&gt;</description>
      <pubDate>Wed, 14 Nov 2018 15:58:36 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Calculate-weighted-average-across-rows-with-conditions/m-p/512979#M138199</guid>
      <dc:creator>novinosrin</dc:creator>
      <dc:date>2018-11-14T15:58:36Z</dc:date>
    </item>
    <item>
      <title>Re: Calculate weighted average across rows with conditions</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Calculate-weighted-average-across-rows-with-conditions/m-p/512985#M138203</link>
      <description>&lt;P&gt;Here's my opinion:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Computing weighted averages in PROC SUMMARY is an awful lot easier than doing it in PROC SQL. In this case, you would have to rearrange the data so that each score is its own record. But it often happens that the data shown is the result of taking a data set where it is already in the desired format and then for presentation purposes making long rows for each patient/observation. But I admit that I haven't gone through the information provided in the original post in detail to see if PROC SUMMARY is really better than PROC SQL for this specific application.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;In addition, this SQL code&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;SUM(DIFF_WEIGHT)/SUM(MIN_WGT) as WEIGHTED_AVERAGE_SCORE&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;in certain situations with missing data will not produce the proper weighted average. If the variable in the numerator is missing and the variable in the denominator is not missing, or&amp;nbsp;&lt;EM&gt;vice versa&lt;/EM&gt;, then you get an incorrect weighted average. PROC SUMMARY does not have this deficiency.&lt;/P&gt;</description>
      <pubDate>Wed, 14 Nov 2018 16:05:44 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Calculate-weighted-average-across-rows-with-conditions/m-p/512985#M138203</guid>
      <dc:creator>PaigeMiller</dc:creator>
      <dc:date>2018-11-14T16:05:44Z</dc:date>
    </item>
    <item>
      <title>Re: Calculate weighted average across rows with conditions</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Calculate-weighted-average-across-rows-with-conditions/m-p/513023#M138208</link>
      <description>&lt;P&gt;Thanks a lot!&amp;nbsp;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/138205"&gt;@novinosrin&lt;/a&gt;,&amp;nbsp;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/10892"&gt;@PaigeMiller&lt;/a&gt;,&amp;nbsp;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/31461"&gt;@mkeintz&lt;/a&gt;&lt;/P&gt;
&lt;P&gt;I'm working on it with all your comments back in mind.&lt;/P&gt;
&lt;P&gt;&lt;BR /&gt;From the interactions and the questions posted here, I admit that that I'm getting to know what I want to achieve better. This is a simulation exercise using known and unknown data. In the known data, I have both 'date of visit' and 'date of diagnosis' variables available. In the unknown data, I have only&amp;nbsp;&lt;SPAN&gt;&amp;nbsp;'date of visit' (ser_date_dx)&amp;nbsp; but no 'date of diagnosis'. therefore, the goal is to estimate the error using the known data and apply the algorithm associated with the least error to the unknown data.&amp;nbsp;&lt;/SPAN&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;DOD_DIFF in known data is the difference between the&amp;nbsp;date of visit and the date of diagnosis , attached&amp;nbsp;to this forum.&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;The error that I'm trying to estimate is:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;- If pick the row associated with the&amp;nbsp;earliest 'date of visit' (min(ser_date_dx)) and the highest weight diagnosis code (min(wgt_dx)) how much error will be&amp;nbsp;produced&amp;nbsp;to compare that you would have used (DOD_DIFF) which is available only in known data but not available in unknown data. Sorry for repeating seemingly simple concept here. But it helps myself to understand the problem better &lt;span class="lia-unicode-emoji" title=":slightly_smiling_face:"&gt;🙂&lt;/span&gt; to be honest.&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="RED: WHAT YOU CHOOSE WITH UNKNOWN DATA. BLUE: WITH KNOWN DATA" style="width: 600px;"&gt;&lt;img src="https://communities.sas.com/t5/image/serverpage/image-id/24886i47E7E73168031DB6/image-size/large?v=v2&amp;amp;px=999" role="button" title="SCENARIO NOV12.png" alt="RED: WHAT YOU CHOOSE WITH UNKNOWN DATA. BLUE: WITH KNOWN DATA" /&gt;&lt;span class="lia-inline-image-caption" onclick="event.preventDefault();"&gt;RED: WHAT YOU CHOOSE WITH UNKNOWN DATA. BLUE: WITH KNOWN DATA&lt;/span&gt;&lt;/span&gt;&lt;/P&gt;
&lt;P&gt;So the code below cover the part to calculate the weighted average using the date of visit (ser_date_dx). But the&amp;nbsp;missing piece now&amp;nbsp;is the error&amp;nbsp;estimate&amp;nbsp;compare to the&amp;nbsp;scenario&amp;nbsp;using DOD_DIFF.&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;PROC SQL;
create table ERROR_FIRST_DIAGNOSIS as
select *, MIN(SER_DATE_DX) AS MIN_DIFF,
          MIN(WGT_DX) AS MIN_WGT,
		  calculated MIN_DIFF*calculated MIN_WGT AS DIFF_WEIGHT
from HAVE(where=(id=4350))
GROUP BY ID;
QUIT; 
&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;Please keep questioning me. This post is only evolving with your critical thinking! I truly greatly appreciate your time!&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Wed, 14 Nov 2018 18:35:38 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Calculate-weighted-average-across-rows-with-conditions/m-p/513023#M138208</guid>
      <dc:creator>Cruise</dc:creator>
      <dc:date>2018-11-14T18:35:38Z</dc:date>
    </item>
    <item>
      <title>Re: Calculate weighted average across rows with conditions</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Calculate-weighted-average-across-rows-with-conditions/m-p/513029#M138209</link>
      <description>&lt;P&gt;"&amp;nbsp;But the&amp;nbsp;missing pieace now&amp;nbsp;is the r&lt;EM&gt;elative error&amp;nbsp;to the scenario&amp;nbsp;using DOD_DIFF.&lt;/EM&gt; "&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Do you have the logic for that?&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Wed, 14 Nov 2018 18:26:41 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Calculate-weighted-average-across-rows-with-conditions/m-p/513029#M138209</guid>
      <dc:creator>novinosrin</dc:creator>
      <dc:date>2018-11-14T18:26:41Z</dc:date>
    </item>
    <item>
      <title>Re: Calculate weighted average across rows with conditions</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Calculate-weighted-average-across-rows-with-conditions/m-p/513034#M138210</link>
      <description>&lt;P&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/138205"&gt;@novinosrin&lt;/a&gt;, would you please rephrase your question? It sounds important aspect that I have to understand.&lt;/P&gt;</description>
      <pubDate>Wed, 14 Nov 2018 18:30:27 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Calculate-weighted-average-across-rows-with-conditions/m-p/513034#M138210</guid>
      <dc:creator>Cruise</dc:creator>
      <dc:date>2018-11-14T18:30:27Z</dc:date>
    </item>
    <item>
      <title>Re: Calculate weighted average across rows with conditions</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Calculate-weighted-average-across-rows-with-conditions/m-p/513040#M138211</link>
      <description>&lt;P&gt;Honestly, I am the one who is lagging here although it is so very interesting. I was basically trying to see where and how we can progress to the next step in code development. I assumed the logic for calculating relative error as you mentioned seemed the next. I see an explanation here&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;EM&gt;"The error that I'm trying to estimate is:&lt;/EM&gt;&lt;/P&gt;
&lt;P&gt;&lt;EM&gt;&amp;nbsp;- If pick the row associated with the&amp;nbsp;earliest 'date of visit' (min(ser_date_dx)) and the highest weight diagnosis code (min(wgt_dx)) how much error will be&amp;nbsp;produced&amp;nbsp;to compare that you would have used (DOD_DIFF) which is available only in known data but not available in unknown data. Sorry for repeating seemingly simple concept here. But it helps myself to understand the problem better &lt;span class="lia-unicode-emoji" title=":slightly_smiling_face:"&gt;🙂&lt;/span&gt; to be honest. "&lt;/EM&gt;&lt;/P&gt;
&lt;P&gt;&lt;EM&gt;&amp;nbsp;&lt;/EM&gt;&lt;/P&gt;
&lt;P&gt;However I haven't comprehended that enough to translate to a code. If Paige, Mark or somebody could push that understanding further, my fingers are waiting&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Wed, 14 Nov 2018 18:36:21 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Calculate-weighted-average-across-rows-with-conditions/m-p/513040#M138211</guid>
      <dc:creator>novinosrin</dc:creator>
      <dc:date>2018-11-14T18:36:21Z</dc:date>
    </item>
    <item>
      <title>Re: Calculate weighted average across rows with conditions</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Calculate-weighted-average-across-rows-with-conditions/m-p/513051#M138212</link>
      <description>I attempted to visualize the logic where error is produced in the image attached. The error is produced between two scenarios: red and blue. Most often times, you end up picking the same row based on the date of visit (ser_date_dx) and the highest evidence diagnosis (min(WGT_DX))  even if you don't know the difference between the date of visit and date of diagnosis which is DOD_DIFF. However, you're potential to choose different row if your decision was based on the DOD_DIFF and the highest weight diagnosis (min(WGT_DX)) in some cases. So, the question becomes, how much error are you making if you make your decision based on the best available information to you which are data of visit (ser_date_dx) and min(WGT_DX) because you did not know the true value of DOD_DIFF. Unfortunately, my understanding is limited to a conceptualization but not coding. I hope the experts you mentioned would reach out to this post and share their insights too. Thanks for asking! Bear with me please.</description>
      <pubDate>Wed, 14 Nov 2018 18:47:27 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Calculate-weighted-average-across-rows-with-conditions/m-p/513051#M138212</guid>
      <dc:creator>Cruise</dc:creator>
      <dc:date>2018-11-14T18:47:27Z</dc:date>
    </item>
    <item>
      <title>Re: Calculate weighted average across rows with conditions</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Calculate-weighted-average-across-rows-with-conditions/m-p/513072#M138213</link>
      <description>&lt;P&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/132289"&gt;@Cruise&lt;/a&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Here's what I understand you to be saying:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;OL&gt;
&lt;LI&gt;In the absence of DOD_DIFF, then for each ID:
&lt;OL&gt;
&lt;LI&gt;choose the earliest data of visit&lt;/LI&gt;
&lt;LI&gt;If there are multiple records for that data, choose the highest WGT_DX, call is WGT_DX_PARTIAL&lt;/LI&gt;
&lt;LI&gt;You want be one such value per ID.&amp;nbsp; (There seems to be no reference to the variable CODE in your numeric example).&lt;/LI&gt;
&lt;/OL&gt;
&lt;/LI&gt;
&lt;LI&gt;In the presence of DOD_DIFF, for each ID:
&lt;OL&gt;
&lt;LI&gt;choose the minimum DOD_DIFF&lt;/LI&gt;
&lt;LI&gt;If there are multiple records for that value of DOD_DIFF&amp;lt; choose the highest WGT_DX, call it WGT_DX_FULL&lt;/LI&gt;
&lt;LI&gt;Again you want one such value per ID&lt;/LI&gt;
&lt;/OL&gt;
&lt;/LI&gt;
&lt;LI&gt;I presume your measurement of error, is just&amp;nbsp;WGT_DX_ERROR=&amp;nbsp;WGT_DX_FULL - WGT_DX_PARTIAL.&amp;nbsp; Again this is one value per ID, correct?&lt;BR /&gt;&lt;BR /&gt;&lt;/LI&gt;
&lt;LI&gt;So dos this mean you simply want to regress WGT_DX_FULL on WGT_DX_PARTIAL?&amp;nbsp; If so, you don't need table 2, just table 1.&lt;/LI&gt;
&lt;/OL&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Wed, 14 Nov 2018 19:44:22 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Calculate-weighted-average-across-rows-with-conditions/m-p/513072#M138213</guid>
      <dc:creator>mkeintz</dc:creator>
      <dc:date>2018-11-14T19:44:22Z</dc:date>
    </item>
    <item>
      <title>Re: Calculate weighted average across rows with conditions</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Calculate-weighted-average-across-rows-with-conditions/m-p/513102#M138218</link>
      <description>&lt;P&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/31461"&gt;@mkeintz&lt;/a&gt;, &lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/138205"&gt;@novinosrin&lt;/a&gt;, &lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/10892"&gt;@PaigeMiller&lt;/a&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;FONT color="#0000ff"&gt;Mkeintz,&amp;nbsp;Thank you for&amp;nbsp;your amazingly&amp;nbsp;systematic presentation of&amp;nbsp;the problem in clear subsequent logics. You also had&amp;nbsp;coined in&amp;nbsp;a new&amp;nbsp;idea that&amp;nbsp;I have not seen for&amp;nbsp;possibility. That is to use two parallel scenario: absence vs presence of DOD_DIFF using the 'known' data. And also&amp;nbsp;suggesting an alternative way to look at&amp;nbsp;a measurement error. Please, hoping your time allows, review my responses in red in the text below.&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;OL&gt;
&lt;LI&gt;In the absence of DOD_DIFF, then for each ID:
&lt;OL&gt;
&lt;LI&gt;Choose the earliest data of visit&lt;/LI&gt;
&lt;LI&gt;If there are multiple records for that data, choose the&amp;nbsp;&lt;FONT color="#ff0000"&gt;lowest&lt;/FONT&gt; WGT_DX, call is WGT_DX_PARTIAL&lt;/LI&gt;
&lt;LI&gt;You want be one such value per ID&lt;/LI&gt;
&lt;LI&gt;(&lt;FONT color="#ff0000"&gt;There seems to be no reference to the variable CODE in your numeric example) The derivation of WGT_DX is following: &lt;/FONT&gt;
&lt;P&gt;&lt;FONT color="#0000ff" face="Courier New"&gt;IF&lt;/FONT&gt;&lt;FONT face="Courier New"&gt; CODE IN (&lt;/FONT&gt;&lt;FONT color="#800080" face="Courier New"&gt;'153'&lt;/FONT&gt;&lt;FONT face="Courier New"&gt;) &lt;/FONT&gt;&lt;FONT color="#0000ff" face="Courier New"&gt;THEN&lt;/FONT&gt;&lt;FONT face="Courier New"&gt; WGT_DX=&lt;/FONT&gt;&lt;STRONG&gt;&lt;FONT color="#008080" face="Courier New"&gt;1&lt;/FONT&gt;&lt;/STRONG&gt;&lt;FONT face="Courier New"&gt;; &lt;/FONT&gt;&lt;FONT color="#0000ff" face="Courier New"&gt;ELSE&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;&lt;FONT color="#0000ff" face="Courier New"&gt;IF&lt;/FONT&gt;&lt;FONT face="Courier New"&gt; CODE IN (&lt;/FONT&gt;&lt;FONT color="#800080" face="Courier New"&gt;'154'&lt;/FONT&gt;&lt;FONT face="Courier New"&gt;) &lt;/FONT&gt;&lt;FONT color="#0000ff" face="Courier New"&gt;THEN&lt;/FONT&gt;&lt;FONT face="Courier New"&gt; WGT_DX=&lt;/FONT&gt;&lt;STRONG&gt;&lt;FONT color="#008080" face="Courier New"&gt;1&lt;/FONT&gt;&lt;/STRONG&gt;&lt;FONT face="Courier New"&gt;; &lt;/FONT&gt;&lt;FONT color="#0000ff" face="Courier New"&gt;ELSE&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;&lt;FONT color="#0000ff" face="Courier New"&gt;IF&lt;/FONT&gt;&lt;FONT face="Courier New"&gt; CODE IN (&lt;/FONT&gt;&lt;FONT color="#800080" face="Courier New"&gt;'159'&lt;/FONT&gt;&lt;FONT face="Courier New"&gt;) &lt;/FONT&gt;&lt;FONT color="#0000ff" face="Courier New"&gt;THEN&lt;/FONT&gt;&lt;FONT face="Courier New"&gt; WGT_DX=&lt;/FONT&gt;&lt;STRONG&gt;&lt;FONT color="#008080" face="Courier New"&gt;2&lt;/FONT&gt;&lt;/STRONG&gt;&lt;FONT face="Courier New"&gt;; &lt;/FONT&gt;&lt;FONT color="#0000ff" face="Courier New"&gt;ELSE&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;&lt;FONT color="#0000ff" face="Courier New"&gt;IF&lt;/FONT&gt;&lt;FONT face="Courier New"&gt; CODE IN (&lt;/FONT&gt;&lt;FONT color="#800080" face="Courier New"&gt;'197'&lt;/FONT&gt;&lt;FONT face="Courier New"&gt;) &lt;/FONT&gt;&lt;FONT color="#0000ff" face="Courier New"&gt;THEN&lt;/FONT&gt;&lt;FONT face="Courier New"&gt; WGT_DX=&lt;/FONT&gt;&lt;STRONG&gt;&lt;FONT color="#008080" face="Courier New"&gt;3&lt;/FONT&gt;&lt;/STRONG&gt;&lt;FONT face="Courier New"&gt;; &lt;/FONT&gt;&lt;FONT color="#0000ff" face="Courier New"&gt;ELSE&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;&lt;FONT color="#0000ff" face="Courier New"&gt;IF&lt;/FONT&gt;&lt;FONT face="Courier New"&gt; CODE IN (&lt;/FONT&gt;&lt;FONT color="#800080" face="Courier New"&gt;'199'&lt;/FONT&gt;&lt;FONT face="Courier New"&gt;) &lt;/FONT&gt;&lt;FONT color="#0000ff" face="Courier New"&gt;THEN&lt;/FONT&gt;&lt;FONT face="Courier New"&gt; WGT_DX=&lt;/FONT&gt;&lt;STRONG&gt;&lt;FONT color="#008080" face="Courier New"&gt;3&lt;/FONT&gt;&lt;/STRONG&gt;&lt;FONT face="Courier New"&gt;; &lt;/FONT&gt;&lt;FONT color="#0000ff" face="Courier New"&gt;ELSE&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;&lt;FONT color="#0000ff" face="Courier New"&gt;IF&lt;/FONT&gt;&lt;FONT face="Courier New"&gt; CODE IN (&lt;/FONT&gt;&lt;FONT color="#800080" face="Courier New"&gt;'209'&lt;/FONT&gt;&lt;FONT face="Courier New"&gt;) &lt;/FONT&gt;&lt;FONT color="#0000ff" face="Courier New"&gt;THEN&lt;/FONT&gt;&lt;FONT face="Courier New"&gt; WGT_DX=&lt;/FONT&gt;&lt;STRONG&gt;&lt;FONT color="#008080" face="Courier New"&gt;5&lt;/FONT&gt;&lt;/STRONG&gt;&lt;FONT face="Courier New"&gt;; &lt;/FONT&gt;&lt;FONT color="#0000ff" face="Courier New"&gt;ELSE&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;&lt;FONT color="#0000ff" face="Courier New"&gt;IF&lt;/FONT&gt;&lt;FONT face="Courier New"&gt; CODE IN (&lt;/FONT&gt;&lt;FONT color="#800080" face="Courier New"&gt;'211'&lt;/FONT&gt;&lt;FONT face="Courier New"&gt;) &lt;/FONT&gt;&lt;FONT color="#0000ff" face="Courier New"&gt;THEN&lt;/FONT&gt;&lt;FONT face="Courier New"&gt; WGT_DX=&lt;/FONT&gt;&lt;STRONG&gt;&lt;FONT color="#008080" face="Courier New"&gt;6&lt;/FONT&gt;&lt;/STRONG&gt;&lt;FONT face="Courier New"&gt;; &lt;/FONT&gt;&lt;FONT color="#0000ff" face="Courier New"&gt;ELSE&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;&lt;FONT color="#0000ff" face="Courier New"&gt;IF&lt;/FONT&gt;&lt;FONT face="Courier New"&gt; CODE IN (&lt;/FONT&gt;&lt;FONT color="#800080" face="Courier New"&gt;'230'&lt;/FONT&gt;&lt;FONT face="Courier New"&gt;) &lt;/FONT&gt;&lt;FONT color="#0000ff" face="Courier New"&gt;THEN&lt;/FONT&gt;&lt;FONT face="Courier New"&gt; WGT_DX=&lt;/FONT&gt;&lt;STRONG&gt;&lt;FONT color="#008080" face="Courier New"&gt;2&lt;/FONT&gt;&lt;/STRONG&gt;&lt;FONT face="Courier New"&gt;; &lt;/FONT&gt;&lt;FONT color="#0000ff" face="Courier New"&gt;ELSE&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;&lt;FONT color="#0000ff" face="Courier New"&gt;IF&lt;/FONT&gt;&lt;FONT face="Courier New"&gt; CODE IN (&lt;/FONT&gt;&lt;FONT color="#800080" face="Courier New"&gt;'235'&lt;/FONT&gt;&lt;FONT face="Courier New"&gt;) &lt;/FONT&gt;&lt;FONT color="#0000ff" face="Courier New"&gt;THEN&lt;/FONT&gt;&lt;FONT face="Courier New"&gt; WGT_DX=&lt;/FONT&gt;&lt;STRONG&gt;&lt;FONT color="#008080" face="Courier New"&gt;3&lt;/FONT&gt;&lt;/STRONG&gt;&lt;FONT face="Courier New"&gt;; &lt;/FONT&gt;&lt;FONT color="#0000ff" face="Courier New"&gt;ELSE&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;&lt;FONT color="#0000ff" face="Courier New"&gt;IF&lt;/FONT&gt;&lt;FONT face="Courier New"&gt; CODE IN (&lt;/FONT&gt;&lt;FONT color="#800080" face="Courier New"&gt;'239'&lt;/FONT&gt;&lt;FONT face="Courier New"&gt;) &lt;/FONT&gt;&lt;FONT color="#0000ff" face="Courier New"&gt;THEN&lt;/FONT&gt;&lt;FONT face="Courier New"&gt; WGT_DX=&lt;/FONT&gt;&lt;STRONG&gt;&lt;FONT color="#008080" face="Courier New"&gt;3&lt;/FONT&gt;&lt;/STRONG&gt;&lt;FONT face="Courier New"&gt;; &lt;/FONT&gt;&lt;FONT color="#0000ff" face="Courier New"&gt;ELSE&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;&lt;FONT color="#0000ff" face="Courier New"&gt;IF&lt;/FONT&gt;&lt;FONT face="Courier New"&gt; CODE IN (&lt;/FONT&gt;&lt;FONT color="#800080" face="Courier New"&gt;'V10'&lt;/FONT&gt;&lt;FONT face="Courier New"&gt;) &lt;/FONT&gt;&lt;FONT color="#0000ff" face="Courier New"&gt;THEN&lt;/FONT&gt;&lt;FONT face="Courier New"&gt; WGT_DX=&lt;/FONT&gt;&lt;STRONG&gt;&lt;FONT color="#008080" face="Courier New"&gt;4&lt;/FONT&gt;&lt;/STRONG&gt;&lt;FONT face="Courier New"&gt;; &lt;/FONT&gt;&lt;FONT color="#0000ff" face="Courier New"&gt;ELSE&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;&lt;FONT color="#0000ff" face="Courier New"&gt;IF&lt;/FONT&gt;&lt;FONT face="Courier New"&gt; CODE IN (&lt;/FONT&gt;&lt;FONT color="#800080" face="Courier New"&gt;'V76'&lt;/FONT&gt;&lt;FONT face="Courier New"&gt;) &lt;/FONT&gt;&lt;FONT color="#0000ff" face="Courier New"&gt;THEN&lt;/FONT&gt;&lt;FONT face="Courier New"&gt; WGT_DX=&lt;/FONT&gt;&lt;STRONG&gt;&lt;FONT color="#008080" face="Courier New"&gt;4&lt;/FONT&gt;&lt;/STRONG&gt;&lt;FONT face="Courier New"&gt;; &lt;/FONT&gt;&lt;/P&gt;
&lt;/LI&gt;
&lt;/OL&gt;
&lt;/LI&gt;
&lt;LI&gt;In the presence of DOD_DIFF, for each ID:
&lt;OL&gt;
&lt;LI&gt;Choose the minimum DOD_DIFF&lt;/LI&gt;
&lt;LI&gt;If there are multiple records for that value of DOD_DIFF&amp;lt; choose the &lt;FONT color="#ff0000"&gt;lowest&lt;/FONT&gt; WGT_DX, call it WGT_DX_FULL&lt;/LI&gt;
&lt;LI&gt;Again you want one such value per ID&lt;/LI&gt;
&lt;/OL&gt;
&lt;/LI&gt;
&lt;LI&gt;I presume your measurement of error, is just&amp;nbsp;WGT_DX_ERROR=&amp;nbsp;WGT_DX_FULL - WGT_DX_PARTIAL.&amp;nbsp; Again this is one value per ID, correct? &lt;FONT color="#ff0000"&gt;CORRECT.&lt;/FONT&gt;&amp;nbsp;&lt;FONT color="#ff0000"&gt;However, for the next step,&amp;nbsp;I will have to quantify the error associated with each diagnostic CODE (diagnosis, in this case, 12 distinct diagnosis codes, as listed above. For example, 153 (colon cancer)&amp;nbsp;and 154(rectum cancer) are the strongest evidence to suggest a colorectal cancer. However, how much error would be associated with each codes&amp;nbsp;in terms of time factor&amp;nbsp;regardless the level of evidence&amp;nbsp;to&amp;nbsp;lead to a&amp;nbsp;true&amp;nbsp;diagnosis. But for now, I'm focusing on the error per patient due to DOD_DIFF known vs unknown scenarios.&amp;nbsp;&amp;nbsp;&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT color="#ff0000"&gt;- Regarding WGT_DX_ERROR=&amp;nbsp;WGT_DX_FULL - WGT_DX_PARTIAL. I agree, the difference between weights could be a proxy for an error. However, would it be considered weighted? Also, WGT_DX is monotonic arbitrary numbers that&amp;nbsp;I have assigned to each diagnosis CODE&amp;nbsp;myself based on my personal judgment as to how far/close the given diagnosis CODE for a colorectal cancer, where 153 and 154 being the correct reference. &amp;nbsp;With that said, how about using the&amp;nbsp;difference between 'date of visit' that is corresponded with absence vs presence of DOD_DIFF&amp;nbsp;scenarios respectively and&amp;nbsp;multiply by the WGT_DX_ERROR you suggested? to have time factor weighed in the error estimate? Please correct me, if it doesn't make sense. The error produced here is not only the compromise to the correct diagnosis but also a time-factor. At the end of this project, we're aiming for&amp;nbsp;the best&amp;nbsp;value to replace a missing&amp;nbsp;'date of diagnosis' in the unknown data. &lt;/FONT&gt;&lt;/LI&gt;
&lt;LI&gt;So this mean you simply want to &lt;FONT color="#ff0000"&gt;regress&lt;/FONT&gt; WGT_DX_FULL on WGT_DX_PARTIAL?&amp;nbsp;&lt;FONT color="#ff0000"&gt;Are you suggesting a linear regression between these two continuous variables and look at the estimates? It sounds very interesting, do you have time to elaborate on this a little more?&lt;/FONT&gt; &amp;nbsp;If so, you don't need table 2, just table 1. &lt;FONT color="#ff0000"&gt;Do you mind to clarify which table are you referring to?&lt;/FONT&gt;&lt;/LI&gt;
&lt;/OL&gt;
&lt;P&gt;&lt;FONT color="#ff0000"&gt;This is a rewarding learning experience. I'm so excited to continuously learning from you guys!&lt;/FONT&gt;&lt;/P&gt;</description>
      <pubDate>Wed, 14 Nov 2018 20:57:52 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Calculate-weighted-average-across-rows-with-conditions/m-p/513102#M138218</guid>
      <dc:creator>Cruise</dc:creator>
      <dc:date>2018-11-14T20:57:52Z</dc:date>
    </item>
    <item>
      <title>Re: Calculate weighted average across rows with conditions</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Calculate-weighted-average-across-rows-with-conditions/m-p/513193#M138257</link>
      <description>&lt;P&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/13879"&gt;@Reeza&lt;/a&gt;&amp;nbsp; please let me know your comments on my response to &lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/31461"&gt;@mkeintz&lt;/a&gt;'s suggestions. &lt;/P&gt;</description>
      <pubDate>Thu, 15 Nov 2018 02:47:46 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Calculate-weighted-average-across-rows-with-conditions/m-p/513193#M138257</guid>
      <dc:creator>Cruise</dc:creator>
      <dc:date>2018-11-15T02:47:46Z</dc:date>
    </item>
  </channel>
</rss>

