<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: start time and duration of a sequence in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/start-time-and-duration-of-a-sequence/m-p/253441#M48230</link>
    <description>Very nice! Thanks! Note that the final "data want" step should use "set summarized". Also, keeping _freq_ in the proc summary output gives the duration in minutes directly, without the additional data step. Thanks a lot for the hand. I often have data with skips in the time and have struggled with numbering contiguous sequences. Now I know how.</description>
    <pubDate>Tue, 01 Mar 2016 14:02:25 GMT</pubDate>
    <dc:creator>brucehughw</dc:creator>
    <dc:date>2016-03-01T14:02:25Z</dc:date>
    <item>
      <title>start time and duration of a sequence</title>
      <link>https://communities.sas.com/t5/SAS-Programming/start-time-and-duration-of-a-sequence/m-p/253246#M48147</link>
      <description>&lt;P&gt;hello,&lt;/P&gt;
&lt;P&gt;I have data that includes a time variable and weather descriptions from rater1 and rater2, e.g.,&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data have;
infile cards truncover;
input time : time5. rater1 : $ rater2 : $ ;
format  time time5.  ;
cards; 
1:01       RA DZ
1:02       RA DZ
1:03       RA DZ
2:06       DZ PL
2:07       DZ PL
2:15       PL
;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;these sequences can go quite long (not just three minutes, but maybe three hours), and there are many of them. What I'd like is a summary of each, something like:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data want;
infile cards truncover;
input startTime : time5. duration rater1 : $ rater2 : $ ;
format  startTime time5.  ;
cards; 
1:01   3    RA DZ
2:06   2    DZ PL
2:15   1    PL
;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;rater1 and rater2 values include RA, DZ, PL, ' ', SN, RAPL, SNPL, and RASN. Any suggestions?&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Thanks, Bruce&lt;/P&gt;</description>
      <pubDate>Mon, 29 Feb 2016 18:45:54 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/start-time-and-duration-of-a-sequence/m-p/253246#M48147</guid>
      <dc:creator>brucehughw</dc:creator>
      <dc:date>2016-02-29T18:45:54Z</dc:date>
    </item>
    <item>
      <title>Re: start time and duration of a sequence</title>
      <link>https://communities.sas.com/t5/SAS-Programming/start-time-and-duration-of-a-sequence/m-p/253262#M48152</link>
      <description>&lt;P&gt;Bruce,&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Your question leaves a number of points open to interpretation.&amp;nbsp; Perhaps you could narrow down the problem by addressing a few of these.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;What is the definition of a sequence?&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;If enough time passes, but the raters stay the same, would that begin a new sequence?&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Within the same sequence could the raters switch positions (so that rater #1 becomes rater #2 and vice versa)?&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Is duration a count of records, or does it represent a calculation based on the first and last TIME value?&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Can two sequences overlap?&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;What is the order to the incoming data records?&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Three of these questions are really intertwined:&amp;nbsp; the definition of a sequence, overlapping sequences, the order to the incoming records.&amp;nbsp; They are all ways of looking at how the data identifies a sequence.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;The program might be as simple as:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;proc summary data=have nway;&lt;/P&gt;
&lt;P&gt;class rater1 rater2;&lt;/P&gt;
&lt;P&gt;var time;&lt;/P&gt;
&lt;P&gt;output out=want (drop=_type_ rename=(_freq_=duration)) min=start_time;&lt;/P&gt;
&lt;P&gt;run;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;But I feel like I'm guessing at what needs to be done.&lt;/P&gt;</description>
      <pubDate>Mon, 29 Feb 2016 19:49:00 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/start-time-and-duration-of-a-sequence/m-p/253262#M48152</guid>
      <dc:creator>Astounding</dc:creator>
      <dc:date>2016-02-29T19:49:00Z</dc:date>
    </item>
    <item>
      <title>Re: start time and duration of a sequence</title>
      <link>https://communities.sas.com/t5/SAS-Programming/start-time-and-duration-of-a-sequence/m-p/253282#M48164</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;thanks for looking into my question. To answer your questions:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;What is the definition of a sequence? &lt;EM&gt;A sequence comprises a contiguous block of time, e.g. 1:01 1:02 1:03, the same value for rater1, and the same value for rater2. If the time skips a given minute or a rater's value changes, the original sequence ends.&lt;/EM&gt;&amp;nbsp;Typically, a new sequence begins when the time skips a value. I'd be satisfied with a solution based on this skipping alone, but code that watches for both skipped time and changes in a rater's value would be very nice.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;If enough time passes, but the raters stay the same, would that begin a new sequence? It depends. &lt;EM&gt;Any gap larger than a minute breaks the sequence. If the time does not skip any minutes, the sequence continues.&lt;/EM&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Within the same sequence could the raters switch positions (so that rater #1 becomes rater #2 and vice versa)? &lt;EM&gt;No. If either rater changes their "report," e.g., one switches from PL to DZ, this begins a new sequence.&amp;nbsp;&lt;/EM&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Is duration a count of records, or does it represent a calculation based on the first and last TIME value? &lt;EM&gt;Duration is last time - first time + 1 minute&lt;/EM&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Can two sequences overlap? &lt;EM&gt;No, time is monotonically increasing (always increasing)&lt;/EM&gt;&lt;/P&gt;
&lt;P&gt;&lt;EM&gt;&amp;nbsp;&lt;/EM&gt;&lt;/P&gt;
&lt;P&gt;What is the order to the incoming data records?&lt;EM&gt; sorted by time&lt;/EM&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Three of these questions are really intertwined:&amp;nbsp; the definition of a sequence, overlapping sequences, the order to the incoming records.&amp;nbsp; They are all ways of looking at how the data identifies a sequence.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Your proc summary worked on my toy set. But if I add a new RA DZ record at 1:08, proc summary no longer returns the correct result: it lumps that record in with the earlier RA DZ rows, although it should start a new sequence.&lt;/P&gt;
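&lt;P&gt;To illustrate the 1:08 case, here is a minimal sketch. The data set names HAVE2 and CHECK and the variable n_records are made up for this example. Because CLASS groups rows by value only, regardless of where they sit in the file, all four RA DZ rows should end up in a single group.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;/* Hypothetical test data: sample rows plus a later RA DZ record at 1:08 */
data have2;
infile cards truncover;
input time : time5. rater1 : $ rater2 : $ ;
format time time5. ;
cards;
1:01       RA DZ
1:02       RA DZ
1:03       RA DZ
1:08       RA DZ
2:06       DZ PL
2:07       DZ PL
;
run;

/* CLASS ignores record order, so the 1:08 row joins the 1:01-1:03 rows */
proc summary data=have2 nway;
class rater1 rater2;
var time;
output out=check (drop=_type_ rename=(_freq_=n_records)) min=start_time;
run;&lt;/CODE&gt;&lt;/PRE&gt;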
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Thanks very much, Bruce&lt;/P&gt;</description>
      <pubDate>Mon, 29 Feb 2016 20:50:30 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/start-time-and-duration-of-a-sequence/m-p/253282#M48164</guid>
      <dc:creator>brucehughw</dc:creator>
      <dc:date>2016-02-29T20:50:30Z</dc:date>
    </item>
    <item>
      <title>Re: start time and duration of a sequence</title>
      <link>https://communities.sas.com/t5/SAS-Programming/start-time-and-duration-of-a-sequence/m-p/253296#M48175</link>
      <description>&lt;P&gt;Aha!&amp;nbsp; Thanks for the answers.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I suggest adding a variable to number the sequences.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;data with_sequence;&lt;/P&gt;
&lt;P&gt;set have;&lt;/P&gt;
&lt;P&gt;by rater1 rater2 notsorted;&lt;/P&gt;
&lt;P&gt;time_dif = dif(time);&lt;/P&gt;
&lt;P&gt;if first.rater2 then sequence + 1;&lt;/P&gt;
&lt;P&gt;else if time_dif &amp;gt; 60 then sequence + 1;&lt;/P&gt;
&lt;P&gt;run;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;You can decide if the cutoff point of &amp;gt; 60 needs to be adjusted or not.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Then summarize:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;proc summary data=with_sequence;&lt;/P&gt;
&lt;P&gt;by sequence rater1 rater2 notsorted;&lt;/P&gt;
&lt;P&gt;var time;&lt;/P&gt;
&lt;P&gt;output out=summarized (drop=_type_ _freq_) min=time_begins max=time_ends;&lt;/P&gt;
&lt;P&gt;run;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;You would still need to read the summary back in, to compute duration.&amp;nbsp; Something along these lines:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;data want;&lt;/P&gt;
&lt;P&gt;set summarized;&lt;/P&gt;
&lt;P&gt;duration = time_ends - time_begins + 60;&lt;/P&gt;
&lt;P&gt;run;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I expect that the TIME-based statistics are measured in seconds, so you need to add 60.&amp;nbsp; But if you test this and find that's not the case, you can always adjust the formula.&lt;/P&gt;
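&lt;P&gt;Putting the three steps together, here is a minimal end-to-end sketch. It assumes HAVE is sorted by time with at most one record per minute, and that TIME holds SAS time values in seconds. The data set names RUNS and WANT_MINUTES are placeholders chosen for this example. Dividing by 60 reports the duration in minutes rather than seconds.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;/* Step 1: number the runs. A new run starts when either rater's value
   changes or when the gap from the previous record exceeds 60 seconds. */
data runs;
set have;
by rater1 rater2 notsorted;
time_dif = dif(time);   /* seconds since the previous record */
if first.rater2 or time_dif gt 60 then sequence + 1;
run;

/* Step 2: one output row per run, with its first and last time */
proc summary data=runs;
by sequence rater1 rater2 notsorted;
var time;
output out=summarized (drop=_type_ _freq_) min=time_begins max=time_ends;
run;

/* Step 3: duration in minutes = (last time - first time) / 60 + 1 */
data want_minutes;
set summarized;
duration = (time_ends - time_begins) / 60 + 1;
format time_begins time_ends time5. ;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;On the sample data this should give durations of 3 and 2 minutes for the first two sequences.&lt;/P&gt;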
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Good luck.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Oops! &amp;nbsp;Added NOTSORTED a second time.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Mon, 29 Feb 2016 23:38:06 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/start-time-and-duration-of-a-sequence/m-p/253296#M48175</guid>
      <dc:creator>Astounding</dc:creator>
      <dc:date>2016-02-29T23:38:06Z</dc:date>
    </item>
    <item>
      <title>Re: start time and duration of a sequence</title>
      <link>https://communities.sas.com/t5/SAS-Programming/start-time-and-duration-of-a-sequence/m-p/253441#M48230</link>
      <description>Very nice! Thanks! Note that the final "data want" step should use "set summarized". Also, keeping _freq_ in the proc summary output gives the duration in minutes directly, without the additional data step. Thanks a lot for the hand. I often have data with skips in the time and have struggled with numbering contiguous sequences. Now I know how.</description>
      <pubDate>Tue, 01 Mar 2016 14:02:25 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/start-time-and-duration-of-a-sequence/m-p/253441#M48230</guid>
      <dc:creator>brucehughw</dc:creator>
      <dc:date>2016-03-01T14:02:25Z</dc:date>
    </item>
  </channel>
</rss>

