<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: How to implement different formulas in long format data that differ depending on previous value in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/How-to-implement-different-formulas-in-long-format-data-that/m-p/707886#M217425</link>
    <description>&lt;P&gt;A BY statement would create for you variables FIRST. and LAST.subject_number.&amp;nbsp; You may use these to signal when subject number has changed or is about to change in the next, future record read.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;A href="https://documentation.sas.com/?cdcId=pgmsascdc&amp;amp;cdcVersion=9.4_3.5&amp;amp;docsetId=lestmtsref&amp;amp;docsetTarget=p0yeyftk8ftuckn1o5qzy53284gz.htm&amp;amp;locale=en"&gt;SAS Help Center: BY Statement&lt;/A&gt;&lt;/P&gt;</description>
    <pubDate>Wed, 23 Dec 2020 13:50:18 GMT</pubDate>
    <dc:creator>PhilC</dc:creator>
    <dc:date>2020-12-23T13:50:18Z</dc:date>
    <item>
      <title>How to implement different formulas in long format data that differ depending on previous value</title>
      <link>https://communities.sas.com/t5/SAS-Programming/How-to-implement-different-formulas-in-long-format-data-that/m-p/707851#M217405</link>
      <description>&lt;P&gt;Hello! I have a dataset that is in long format. Each subject is supposed to have an outcome score at timepoints: 0 hours, 5 minutes, 15 minutes, and 30 minutes.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;My data look like this:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;
data  datafile;
	input subject_number timepoint $ minutes score remove ;
	datalines;
	1 0h 0 10 0 
	1 5m 5 9 0 
	1 10m 10 10 0 
	1 15m 15 10 0 
	1 30m 30 11 0 
	2 0h 0 11 0 
	2 5m 5 . 0 
	2 15m 15 10 0 
	2 30m 30 . 1 
	;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;&lt;BR /&gt;These data only consist of two subjects for simplicity sake.&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Variables:&lt;/P&gt;
&lt;P&gt;subject_number = subject numer&lt;/P&gt;
&lt;P&gt;timepoint = timepoint each response variable was taken at&lt;/P&gt;
&lt;P&gt;minutes = x-axis variable. This is the numeric version of timepoint&amp;nbsp;&lt;/P&gt;
&lt;P&gt;score = the y-axis/dependent variable&lt;/P&gt;
&lt;P&gt;remove = this variable = 1 if either endpoint is missing (0h or 30 min)&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;My code would look something like this [with some of my own notes inserted as comments]:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;
/*Note: If subject is missing start time (0h) or end time (30m), then remove = 1*/
/*use the 'linear up, log down' method'*/

data datafile2;
	set datafile;
 
/*	linear formula:
	auc = 1/2* (score_i +score_i+1) * (t_i+1 - t_i)*/
	
/*When I had calculated area using only the linear trapezoidal method, this is what code I used: */
	lagtime = lag(minutes);
	lagvalue = lag(score);
	if minutes = 0 then do;
		lagtime = 0;
		lagvalue = 0;
	end;
	trapezoidScore = (minutes-lagtime)*(score + Lagvalue)/2;
	 SumTrapezoidScroe + TrapezoidScore;*/
	
	/*log forumla: (score_i  - score_i+1)/(ln(score_i)-ln(score_i+1))*(t_i+1-t_i)*/
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Basically, what I need to code is:&lt;/P&gt;
&lt;P&gt;(a) If the score at timepoint i+1 is greater than or equal to the score at timepoint i, then use the &lt;U&gt;linear trapezoidal method&lt;/U&gt;&amp;nbsp;to calculate the area from timepoint i to timepoint i+1.&lt;/P&gt;
&lt;P&gt;(b) If the score at timepoint i+1 is less than the score at timepoint i, then use the &lt;U&gt;logarithmic trapezoidal method&lt;/U&gt; to calculate the area from timepoint i to timepoint i+1.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;The &lt;U&gt;linear trapezoidal method&lt;/U&gt; formula (what you'd use to calculate the area going 'up') is:&amp;nbsp; &amp;nbsp;&amp;nbsp;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="linear trapezoidal method.png" style="width: 400px;"&gt;&lt;img src="https://communities.sas.com/t5/image/serverpage/image-id/52982iC1A80A16C612524E/image-size/medium?v=v2&amp;amp;px=400" role="button" title="linear trapezoidal method.png" alt="linear trapezoidal method.png" /&gt;&lt;/span&gt;&amp;nbsp;where C1 and C2 are the y values (scores in our case), and t1 and t2 are the timepoints on the x-axis.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;The &lt;U&gt;logarithmic trapezoidal method&lt;/U&gt; formula (what you'd use to calculate the area going 'down' is:&amp;nbsp;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="logarithmic trapezoidal method.png" style="width: 400px;"&gt;&lt;img src="https://communities.sas.com/t5/image/serverpage/image-id/52983i7396835975703CDE/image-size/medium?v=v2&amp;amp;px=400" role="button" title="logarithmic trapezoidal method.png" alt="logarithmic trapezoidal method.png" /&gt;&lt;/span&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I took these formulas and this idea of "linear-up log-down" from &lt;A href="https://www.certara.com/knowledge-base/calculating-auc-linear-and-log-linear/#:~:text=Linear%20Trapezoidal%20Method&amp;amp;text=When%20you%20sum%20all%20of,over%20the%20total%20time%20interval" target="_blank" rel="noopener"&gt;this short article&lt;/A&gt;&amp;nbsp;.&lt;/P&gt;
&lt;P&gt;If, say, the score at timepoint i+1 is missing, but is not missing at timepoint i and timepoint i+2, then calculate the area but using timepoint i and timepoint i+2's scores and timepoints (and depending on if the score at timepoint i &amp;gt;= or &amp;lt; the score at timepoint i+2, use either the linear trapezoidal method or logarithmic trapezoidal method. I've indicated such timepoints as you can see for subject 2 at 5 minutes.&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Examples in the sample data:&amp;nbsp;&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;Subject 1 from 0h to 5m, his score &lt;STRONG&gt;goes down&lt;/STRONG&gt; from 10 to 9. Here, we'd use the &lt;U&gt;logarithmic trapezoidal method&lt;/U&gt; formula to calculate the area under the curve for this section:
&lt;UL&gt;
&lt;LI&gt;AUC_0h_5m =&amp;nbsp; [(10 - 9)/(ln(10) - ln(9))]*(5-0)&lt;/LI&gt;
&lt;/UL&gt;
&lt;/LI&gt;
&lt;LI&gt;Subject 1 from 5m to 10m, his score &lt;STRONG&gt;goes up&lt;/STRONG&gt; from 9 to 10. Here, we'd use the&lt;U&gt; linear trapezoidal&lt;/U&gt; method formula to calculate the area under the curve for this section:
&lt;UL&gt;
&lt;LI&gt;AUC_5m_10m = 0.5(9 + 10) * (10 - 5)&lt;/LI&gt;
&lt;/UL&gt;
&lt;/LI&gt;
&lt;LI&gt;Subject 1 from 10m to 15m, his score remains the &lt;STRONG&gt;same&lt;/STRONG&gt;. Use the &lt;U&gt;linear trapezoidal method&lt;/U&gt; formula to calculate the area under the curve for this section:
&lt;UL&gt;
&lt;LI&gt;AUC_10m_15m = 0.5(10+10) * (15-10)&lt;/LI&gt;
&lt;/UL&gt;
&lt;/LI&gt;
&lt;LI&gt;Subject 2 has a score at 0h and 10m, but is missing score at 5m. Also, his score &lt;STRONG&gt;went down&lt;/STRONG&gt; from timepoint 0h to 10m. Use the &lt;U&gt;logarithmic trapezoidal method&lt;/U&gt; to calculate the area under the curve for this section:
&lt;UL&gt;
&lt;LI&gt;AUC_0h_15m = [(11-10)/(ln(11)-ln(10))]*(15-0)&lt;/LI&gt;
&lt;/UL&gt;
&lt;/LI&gt;
&lt;LI&gt;Subject 2 is missing a score at the last timepoint, which is 30m. We will ignore this part.&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;Then, take all of the AUC's calculated above (and the timepoints that I didn't give an example for) and sum them per subject.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I need help on how to do this since my data are in long format. How do I tell SAS to skip to the next record, and look back at the previous record, depending on which is bigger, implement a different formula, and if the timepoint doesn't exist, then skip and go to the next one? Here, some thing to note: I don't always have a record such as subject 2, 30m, where the score is ".".. sometimes they just don't have a record at all, and sometimes it's there as "." (missing). For sake of this example, I just didn't include a 10m record for subject 2. So SAS would need to know to skip over the missing 10m..&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Thank you so much!!!&lt;/P&gt;
&lt;P&gt;Best,&lt;/P&gt;
&lt;P&gt;Gina&lt;/P&gt;</description>
      <pubDate>Wed, 23 Dec 2020 09:02:07 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/How-to-implement-different-formulas-in-long-format-data-that/m-p/707851#M217405</guid>
      <dc:creator>ginak</dc:creator>
      <dc:date>2020-12-23T09:02:07Z</dc:date>
    </item>
    <item>
      <title>Re: How to implement different formulas in long format data that differ depending on previous value</title>
      <link>https://communities.sas.com/t5/SAS-Programming/How-to-implement-different-formulas-in-long-format-data-that/m-p/707886#M217425</link>
      <description>&lt;P&gt;A BY statement would create for you variables FIRST. and LAST.subject_number.&amp;nbsp; You may use these to signal when subject number has changed or is about to change in the next, future record read.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;A href="https://documentation.sas.com/?cdcId=pgmsascdc&amp;amp;cdcVersion=9.4_3.5&amp;amp;docsetId=lestmtsref&amp;amp;docsetTarget=p0yeyftk8ftuckn1o5qzy53284gz.htm&amp;amp;locale=en"&gt;SAS Help Center: BY Statement&lt;/A&gt;&lt;/P&gt;</description>
      <pubDate>Wed, 23 Dec 2020 13:50:18 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/How-to-implement-different-formulas-in-long-format-data-that/m-p/707886#M217425</guid>
      <dc:creator>PhilC</dc:creator>
      <dc:date>2020-12-23T13:50:18Z</dc:date>
    </item>
  </channel>
</rss>

