<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Want to keep latest date when merging data in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/Want-to-keep-latest-date-when-merging-data/m-p/703949#M215748</link>
    <description>&lt;P&gt;Assuming your data are already sorted by id/date, then:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data want;
  merge library.data_dates (where=(Year(date)=2006)  in=in2006)
        library.data_dates ;
  by id;
  if in2006=1 and last.id=1;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;Let's say there are 5 obs from 2006 and 20 obs overall for a given id.&amp;nbsp; The first 4 obs from 2006 will be paired with the first 4 obs overall.&amp;nbsp; The last obs from 2006 will be paired with ALL the remaining overall obs.&amp;nbsp; So for the last obs, the IN2006 dummy will still be on, and the data will be output.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Also, when I say "paired" please note that the data values from 2006 will always be overwritten by the equivalent data variables from the overall - because the overall data set is to the right of the 2006 data subset in the merge statement.&amp;nbsp; &amp;nbsp; Only if there were certain variables dropped in the second data set reference would that variable value from the first reference prevail in the output.&lt;/P&gt;</description>
    <pubDate>Sun, 06 Dec 2020 20:50:03 GMT</pubDate>
    <dc:creator>mkeintz</dc:creator>
    <dc:date>2020-12-06T20:50:03Z</dc:date>
    <item>
      <title>Want to keep latest date when merging data</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Want-to-keep-latest-date-when-merging-data/m-p/703946#M215747</link>
      <description>&lt;P&gt;Relatively new to SAS so I'll need specifics. I am using SAS University, in case that makes any difference.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I have a lot of large data sets I am working with. I would like to do Survival analysis with them. To my understanding, I can use the following to get unique ID's with the latest date:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;PROC SORT LIBRARY.DATA;
 BY ID DATE;
RUN;

DATA LIBRARY.DATA_DATES (keep = ID DATE);
 SET LIBRARY.DATA;
 BY ID;
 ENDDATE = MAX(DATE);&lt;BR /&gt; STARTDATE = MIN(DATE); *only for 2006 data;
RUN;&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;Please let me know if the code is wrong. My dataset is extremely large, so I won't be able to manually check so easily.&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Now I want to merge the datasets (Years 2006-2013). When merging, I still want only the newest date, but I only want to keep IDs that are present in 2006 (start period). If I left merge, will the 2006 dates override the 2007-2013 dates? My other thought is to label the MAX DATE for each year (i.e. in own column) and then a separate column from that with the new MAX DATE. However, this doesn't sound the most efficient. Is there a better way to code for this?&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;The end goal with the dates is to use them for follow-up time. That is,&lt;/P&gt;&lt;PRE&gt;FOLLOWUP = MAX DATE - MIN 2006 DATE&lt;/PRE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I appreciate any help or feedback.&lt;/P&gt;</description>
      <pubDate>Sun, 06 Dec 2020 19:18:28 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Want-to-keep-latest-date-when-merging-data/m-p/703946#M215747</guid>
      <dc:creator>amarikow57</dc:creator>
      <dc:date>2020-12-06T19:18:28Z</dc:date>
    </item>
    <item>
      <title>Re: Want to keep latest date when merging data</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Want-to-keep-latest-date-when-merging-data/m-p/703949#M215748</link>
      <description>&lt;P&gt;Assuming your data are already sorted by id/date, then:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data want;
  merge library.data_dates (where=(Year(date)=2006)  in=in2006)
        library.data_dates ;
  by id;
  if in2006=1 and last.id=1;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;Let's say there are 5 obs from 2006 and 20 obs overall for a given id.&amp;nbsp; The first 4 obs from 2006 will be paired with the first 4 obs overall.&amp;nbsp; The last obs from 2006 will be paired with ALL the remaining overall obs.&amp;nbsp; So for the last obs, the IN2006 dummy will still be on, and the data will be output.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Also, when I say "paired" please note that the data values from 2006 will always be overwritten by the equivalent data variables from the overall - because the overall data set is to the right of the 2006 data subset in the merge statement.&amp;nbsp; &amp;nbsp; Only if there were certain variables dropped in the second data set reference would that variable value from the first reference prevail in the output.&lt;/P&gt;</description>
      <pubDate>Sun, 06 Dec 2020 20:50:03 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Want-to-keep-latest-date-when-merging-data/m-p/703949#M215748</guid>
      <dc:creator>mkeintz</dc:creator>
      <dc:date>2020-12-06T20:50:03Z</dc:date>
    </item>
    <item>
      <title>Re: Want to keep latest date when merging data</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Want-to-keep-latest-date-when-merging-data/m-p/703950#M215749</link>
      <description>&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;proc sort data=library.data; /* PROC SORT statement needs a DATA= option */
by id date;
run;

data library.data_dates;
set library.data (keep = id date);
by id;
format startdate enddate yymmdd10.;
retain startdate;
if first.id then startdate = date;
if last.id; /* will output only the last observation per id group */
enddate = date;
drop date;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;No need to shout at SAS, it will work when being talked to in normal voice.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;In PROC SQL, it looks like this:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;proc sql;
create table library.data_dates as
  select
    id,
    min(date) as startdate format=yymmdd10.
    max(date) as enddate format = yymmdd10.
  from library.dates
  group by id
;
quit;&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Sun, 06 Dec 2020 19:41:56 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Want-to-keep-latest-date-when-merging-data/m-p/703950#M215749</guid>
      <dc:creator>Kurt_Bremser</dc:creator>
      <dc:date>2020-12-06T19:41:56Z</dc:date>
    </item>
  </channel>
</rss>

