<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Collapsing over unique ID and retaining start and stop dates in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/Collapsing-over-unique-ID-and-retaining-start-and-stop-dates/m-p/418659#M102865</link>
    <description>&lt;P&gt;Thank you for sharing this! It worked like a charm. This is greatly appreciated!&lt;/P&gt;&lt;P&gt;All the best,&lt;/P&gt;&lt;P&gt;-Carmine&lt;/P&gt;</description>
    <pubDate>Wed, 06 Dec 2017 01:39:56 GMT</pubDate>
    <dc:creator>Carmine_Rossi</dc:creator>
    <dc:date>2017-12-06T01:39:56Z</dc:date>
    <item>
      <title>Collapsing over unique ID and retaining start and stop dates</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Collapsing-over-unique-ID-and-retaining-start-and-stop-dates/m-p/418642#M102863</link>
      <description>&lt;P&gt;Good day SAS community forum users,&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I have a SAS data step question that I am hoping you can help me with. It concerns collapsing observations in&amp;nbsp;drug treatment data with regimen IDs and prescription start and stop dates.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Suppose I have the following structure:&lt;/P&gt;&lt;TABLE&gt;&lt;TBODY&gt;&lt;TR&gt;&lt;TD&gt;&lt;P&gt;Patient_id&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;Regimen_ID&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;Start_date&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;Stop_date&lt;/P&gt;&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;&lt;P&gt;1&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;1&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;A0&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;A1&lt;/P&gt;&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;&lt;P&gt;1&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;1&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;B0&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;B1&lt;/P&gt;&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;&lt;P&gt;1&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;1&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;C0&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;C1&lt;/P&gt;&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;&lt;P&gt;1&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;2&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;A0&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;A1&lt;/P&gt;&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;&lt;P&gt;1&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;2&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;B0&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;B1&lt;/P&gt;&lt;/TD&gt;&lt;/TR&gt;&lt;/TBODY&gt;&lt;/TABLE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Instead, I want to collapse the start and stop dates over each combination of patient and regimen, so that I can generate the following (note that the start date is the very first one for the regimen and the stop date is the very last one for the regimen):&lt;/P&gt;&lt;TABLE&gt;&lt;TBODY&gt;&lt;TR&gt;&lt;TD&gt;&lt;P&gt;Patient_id&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;Regimen_ID&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;Start_date&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;Stop_date&lt;/P&gt;&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;&lt;P&gt;1&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;1&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;A0&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;C1&lt;/P&gt;&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;&lt;P&gt;1&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;2&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;A0&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;B1&lt;/P&gt;&lt;/TD&gt;&lt;/TR&gt;&lt;/TBODY&gt;&lt;/TABLE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;How would this be done in SAS? I am a bit stumped.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thanks again for your help,&lt;/P&gt;&lt;P&gt;-Carmine&lt;/P&gt;</description>
      <pubDate>Wed, 06 Dec 2017 00:10:20 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Collapsing-over-unique-ID-and-retaining-start-and-stop-dates/m-p/418642#M102863</guid>
      <dc:creator>Carmine_Rossi</dc:creator>
      <dc:date>2017-12-06T00:10:20Z</dc:date>
    </item>
    <item>
      <title>Re: Collapsing over unique ID and retaining start and stop dates</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Collapsing-over-unique-ID-and-retaining-start-and-stop-dates/m-p/418644#M102864</link>
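      <description>&lt;P&gt;The following should work, assuming have is sorted by Patient_id and Regimen_ID (the BY statement requires it):&lt;/P&gt;
&lt;PRE&gt;data want (drop=_:);                 /* drop the renamed input columns */
  set have (rename=(Start_date=_Start_date
                    Stop_date=_Stop_date));
  retain Start_date;                   /* carry the group's first start date across rows */
  by Patient_id Regimen_ID;
  if first.Regimen_ID then Start_date=_Start_date;
  if last.Regimen_ID then do;
    Stop_date=_Stop_date;              /* stop date of the group's last row */
    output;                            /* one row per patient and regimen */
  end;
run;
&lt;/PRE&gt;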
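&lt;P&gt;To try the step on the layout from the question, here is a minimal sketch of the input, to be run before the step above. It assumes the dates are the two-character placeholders (A0, A1, ...) shown in the question's table; real data would more likely hold numeric SAS dates.&lt;/P&gt;
&lt;PRE&gt;/* hypothetical test data mirroring the question's table, already sorted */
data have;
  input Patient_id Regimen_ID Start_date :$2. Stop_date :$2.;
  datalines;
1 1 A0 A1
1 1 B0 B1
1 1 C0 C1
1 2 A0 A1
1 2 B0 B1
;
run;
&lt;/PRE&gt;
&lt;P&gt;With this input, want should contain two rows: patient 1, regimen 1 running from A0 to C1, and patient 1, regimen 2 running from A0 to B1.&lt;/P&gt;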
&lt;P&gt;Art, CEO, AnalystFinder.com&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Wed, 06 Dec 2017 00:38:29 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Collapsing-over-unique-ID-and-retaining-start-and-stop-dates/m-p/418644#M102864</guid>
      <dc:creator>art297</dc:creator>
      <dc:date>2017-12-06T00:38:29Z</dc:date>
    </item>
    <item>
      <title>Re: Collapsing over unique ID and retaining start and stop dates</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Collapsing-over-unique-ID-and-retaining-start-and-stop-dates/m-p/418659#M102865</link>
      <description>&lt;P&gt;Thank you for sharing this! It worked like a charm. This is greatly appreciated!&lt;/P&gt;&lt;P&gt;All the best,&lt;/P&gt;&lt;P&gt;-Carmine&lt;/P&gt;</description>
      <pubDate>Wed, 06 Dec 2017 01:39:56 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Collapsing-over-unique-ID-and-retaining-start-and-stop-dates/m-p/418659#M102865</guid>
      <dc:creator>Carmine_Rossi</dc:creator>
      <dc:date>2017-12-06T01:39:56Z</dc:date>
    </item>
    <item>
      <title>Re: Collapsing over unique ID and retaining start and stop dates</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Collapsing-over-unique-ID-and-retaining-start-and-stop-dates/m-p/418663#M102867</link>
      <description>&lt;P&gt;This technique uses the lag function, rather than a retained temporary variable, to hold the needed value.&amp;nbsp; It is more compact in cases like this.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
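&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data have;
input Patient_id Regimen_ID Start_date :$2. Stop_date :$2.;
put (_all_) (=);  /* echo each record to the log */
datalines;
1 1 A0 A1
1 1 B0 B1
1 1 C0 C1
1 2 A0 A1
1 3 A0 A1
1 3 B0 B1
;
run;
data want;
  set have;
  by patient_id regimen_id;

  /* keep only the first and last record of each group */
  if first.regimen_id or last.regimen_id;
  /* multi-record groups only: the lag runs on the first and on the last record, */
  /* so on the last record it returns the first record's start_date              */
  if first.regimen_id ^=last.regimen_id then start_date=lag(start_date);
  /* output one record per group */
  if last.regimen_id;
run;&lt;/CODE&gt;&lt;/PRE&gt;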
&lt;P&gt;Notes:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;OL&gt;
&lt;LI&gt;I changed your regimen_id=2 to a 3, and added a single-record regimen_id=2 in the middle, to demonstrate treatment of single-record date ranges.&lt;BR /&gt;&lt;BR /&gt;&lt;/LI&gt;
&lt;LI&gt;The first subsetting if (&lt;EM&gt;&lt;STRONG&gt;if first.regimen_id or last.regimen_id&lt;/STRONG&gt;&lt;/EM&gt;) tells SAS to throw away all records except the first and last of each group.&lt;/LI&gt;
&lt;LI&gt;The &lt;EM&gt;&lt;STRONG&gt;"if first.regimen_id^=last.regimen_id ..."&lt;/STRONG&gt;&lt;/EM&gt; statement runs the lag only for groups with more than one record.&amp;nbsp; If the regimen_id has only a single record, first. and last. are both 1, so no updating is needed.&lt;BR /&gt;&lt;BR /&gt;But if there are separate first. and last. records, then the single-member lag queue is updated twice for that regimen_id, which in turn means that the result of the lag at the end of the group is the value that was put&amp;nbsp;into the lag queue at the beginning of the group. Of course, in this case the lag "queue" is a queue of size one.&lt;/LI&gt;
&lt;LI&gt;The "if last.regimen_id" statement keeps one record per group -- the last record.&lt;/LI&gt;
&lt;/OL&gt;
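&lt;P&gt;The queue behaviour described in note 3 can be watched in the log. This is only a sketch: it reuses the have data set above, and the variable name queued is made up for the illustration.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data _null_;
  set have;
  by patient_id regimen_id;
  if first.regimen_id or last.regimen_id;
  if first.regimen_id ^= last.regimen_id then do;
    /* lag returns what its previous execution stored, then stores the current value */
    queued = lag(start_date);
    put regimen_id= start_date= queued=;
  end;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;The log should show four lines, with queued taking the values blank, A0, C0, A0 in that order. The lag is never executed for regimen_id=2, so the C0 left in the queue at the end of regimen_id=1 is still there when regimen_id=3 begins. It is returned on a record that the final subsetting if in the want step discards, so it never reaches the output.&lt;/P&gt;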
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;By the way, the single statement&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&amp;nbsp; &lt;EM&gt;&lt;STRONG&gt;if first.regimen_id^=last.regimen_id then start_date=lag(start_date);&lt;/STRONG&gt;&lt;/EM&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;is NOT EQUIVALENT to the pair of statements&lt;/P&gt;
&lt;P&gt;&lt;EM&gt;&lt;STRONG&gt;&amp;nbsp;&amp;nbsp; if first.regimen_id&amp;gt;last.regimen_id then start_date=lag(start_date);&lt;/STRONG&gt;&lt;/EM&gt;&lt;/P&gt;
&lt;P&gt;&lt;EM&gt;&lt;STRONG&gt;&amp;nbsp;&amp;nbsp; if first.regimen_id&amp;lt;last.regimen_id then start_date=lag(start_date);&lt;/STRONG&gt;&lt;/EM&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;because each lag call in the program text has its own queue.&amp;nbsp; The single statement maintains only one queue of start_date values, alternating between values of the first. and last. records.&amp;nbsp; In contrast, the two statements maintain two separate queues, one with the sequence of first. records, and the other with the sequence of last. records.&lt;/P&gt;</description>
      <pubDate>Wed, 06 Dec 2017 02:16:54 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Collapsing-over-unique-ID-and-retaining-start-and-stop-dates/m-p/418663#M102867</guid>
      <dc:creator>mkeintz</dc:creator>
      <dc:date>2017-12-06T02:16:54Z</dc:date>
    </item>
  </channel>
</rss>

