<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Dates and Merging. in SAS Procedures</title>
    <link>https://communities.sas.com/t5/SAS-Procedures/Dates-and-Merging/m-p/9921#M839</link>
    <description>Have a look at BY GROUP PROCESSING in the SAS.COM support site.&lt;BR /&gt;
&lt;BR /&gt;
Scott Barry&lt;BR /&gt;
SBBWorks, Inc.&lt;BR /&gt;
&lt;BR /&gt;
Suggested Google advanced search argument, this topic / post:&lt;BR /&gt;
&lt;BR /&gt;
data step by group processing site:sas.com</description>
    <pubDate>Tue, 14 Jun 2011 10:42:55 GMT</pubDate>
    <dc:creator>sbb</dc:creator>
    <dc:date>2011-06-14T10:42:55Z</dc:date>
    <item>
      <title>Dates and Merging.</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Dates-and-Merging/m-p/9918#M836</link>
      <description>Hello, &lt;BR /&gt;
&lt;BR /&gt;
I've got two data sets. &lt;BR /&gt;
&lt;BR /&gt;
I'm trying to get these...&lt;BR /&gt;
&lt;BR /&gt;
id,StartDate, Value,&lt;BR /&gt;
001,01JUL2010, 6050&lt;BR /&gt;
002,27MAY2010, 9050 &lt;BR /&gt;
003, 11DEC2010, 16500&lt;BR /&gt;
&lt;BR /&gt;
into...&lt;BR /&gt;
&lt;BR /&gt;
id, StartDate, EndDate, OtherValue&lt;BR /&gt;
001, 01SEP2009,17DEC2009, 7500&lt;BR /&gt;
001, 18DEC2009,29APR2010, 7000&lt;BR /&gt;
001, 30APR2010,31DEC9999, 6000&lt;BR /&gt;
002, 10MAY2010,16AUG2010, 11000&lt;BR /&gt;
002, 17AUG2010,26NOV2010,10500&lt;BR /&gt;
002, 27NOV2010,11APR2011,9500&lt;BR /&gt;
002, 12APR2011,31DEC9999,9000&lt;BR /&gt;
003, 01DEC2010,12JAN2011,20000&lt;BR /&gt;
003, 13JAN2011, 25MAY2011,19500&lt;BR /&gt;
003, 26MAY2011, 31DEC9999, 18990&lt;BR /&gt;
&lt;BR /&gt;
and return the other value that would have been during that time...&lt;BR /&gt;
&lt;BR /&gt;
id, StartDate, Value, OtherValue&lt;BR /&gt;
001,01JUL2010, 6050, 6000&lt;BR /&gt;
002,27MAY2010, 9050, 11000&lt;BR /&gt;
003,11DEC2010, 16500, 20000&lt;BR /&gt;
&lt;BR /&gt;
So I have a variable to merge on but the startdate variable in the first set needs to read in between the start and end date in the second one and then output the other value variable. &lt;BR /&gt;
&lt;BR /&gt;
I'm not very experienced with any merging outside of indexing... &lt;BR /&gt;
&lt;BR /&gt;
data name (index=(&lt;VARIABLE&gt;));&lt;BR /&gt;
set sample; &lt;BR /&gt;
set sample2 key=&lt;VARIABLE&gt;/unique;&lt;BR /&gt;
if _error_ then delete (or output...);&lt;BR /&gt;
run;&lt;BR /&gt;
&lt;BR /&gt;
Thanks in advance.&lt;/VARIABLE&gt;&lt;/VARIABLE&gt;</description>
      <pubDate>Thu, 09 Jun 2011 05:46:16 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Dates-and-Merging/m-p/9918#M836</guid>
      <dc:creator>SamT</dc:creator>
      <dc:date>2011-06-09T05:46:16Z</dc:date>
    </item>
    <item>
      <title>Re: Dates and Merging.</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Dates-and-Merging/m-p/9919#M837</link>
      <description>Or If you like, Hash Table also do it with more efficient.&lt;BR /&gt;
&lt;BR /&gt;
&lt;BR /&gt;
[pre]&lt;BR /&gt;
&lt;BR /&gt;
&lt;BR /&gt;
&lt;BR /&gt;
data temp1;&lt;BR /&gt;
infile datalines dlm=' ,';&lt;BR /&gt;
input id $ StartDate : date9. Value;&lt;BR /&gt;
format StartDate date9.;&lt;BR /&gt;
datalines;&lt;BR /&gt;
001,01JUL2010, 6050&lt;BR /&gt;
002,27MAY2010, 9050&lt;BR /&gt;
003, 11DEC2010, 16500&lt;BR /&gt;
;&lt;BR /&gt;
run;&lt;BR /&gt;
data temp2;&lt;BR /&gt;
infile datalines dlm=' ,';&lt;BR /&gt;
input id $ StartDate : date9. EndDate : date9. OtherValue;&lt;BR /&gt;
datalines;&lt;BR /&gt;
001, 01SEP2009,17DEC2009, 7500&lt;BR /&gt;
001, 18DEC2009,29APR2010, 7000&lt;BR /&gt;
001, 30APR2010,31DEC9999, 6000&lt;BR /&gt;
002, 10MAY2010,16AUG2010, 11000&lt;BR /&gt;
002, 17AUG2010,26NOV2010,10500&lt;BR /&gt;
002, 27NOV2010,11APR2011,9500&lt;BR /&gt;
002, 12APR2011,31DEC9999,9000&lt;BR /&gt;
003, 01DEC2010,12JAN2011,20000&lt;BR /&gt;
003, 13JAN2011, 25MAY2011,19500&lt;BR /&gt;
003, 26MAY2011, 31DEC9999, 18990&lt;BR /&gt;
;&lt;BR /&gt;
run;&lt;BR /&gt;
data want(drop=_StartDate EndDate);&lt;BR /&gt;
 merge temp1 temp2(rename=(StartDate=_StartDate));&lt;BR /&gt;
 by id;&lt;BR /&gt;
 if StartDate ge _StartDate and StartDate le EndDate  ;&lt;BR /&gt;
run;&lt;BR /&gt;
[/pre]&lt;BR /&gt;
&lt;BR /&gt;
&lt;BR /&gt;
&lt;BR /&gt;
&lt;BR /&gt;
Ksharp</description>
      <pubDate>Thu, 09 Jun 2011 06:46:52 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Dates-and-Merging/m-p/9919#M837</guid>
      <dc:creator>Ksharp</dc:creator>
      <dc:date>2011-06-09T06:46:52Z</dc:date>
    </item>
    <item>
      <title>Re: Dates and Merging.</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Dates-and-Merging/m-p/9920#M838</link>
      <description>Hi KSharp, &lt;BR /&gt;
&lt;BR /&gt;
Ran you practise script and it worked a treat, applied it to what I was working on and it produced some kind of monster. &lt;BR /&gt;
&lt;BR /&gt;
Reason I suspect i'm not obtaining the desired results would be that the ID number in the first set can appear multiple times, with a unique StartDate. &lt;BR /&gt;
&lt;BR /&gt;
What other methods are there of merging?</description>
      <pubDate>Tue, 14 Jun 2011 03:41:13 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Dates-and-Merging/m-p/9920#M838</guid>
      <dc:creator>SamT</dc:creator>
      <dc:date>2011-06-14T03:41:13Z</dc:date>
    </item>
    <item>
      <title>Re: Dates and Merging.</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Dates-and-Merging/m-p/9921#M839</link>
      <description>Have a look at BY GROUP PROCESSING in the SAS.COM support site.&lt;BR /&gt;
&lt;BR /&gt;
Scott Barry&lt;BR /&gt;
SBBWorks, Inc.&lt;BR /&gt;
&lt;BR /&gt;
Suggested Google advanced search argument, this topic / post:&lt;BR /&gt;
&lt;BR /&gt;
data step by group processing site:sas.com</description>
      <pubDate>Tue, 14 Jun 2011 10:42:55 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Dates-and-Merging/m-p/9921#M839</guid>
      <dc:creator>sbb</dc:creator>
      <dc:date>2011-06-14T10:42:55Z</dc:date>
    </item>
    <item>
      <title>Re: Dates and Merging.</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Dates-and-Merging/m-p/9922#M840</link>
      <description>I think KSharp's method requires that the ID in table temp1 is unique, otherwise a merge would start producing weird results.&lt;BR /&gt;
&lt;BR /&gt;
I have recently had a similar issue in joining up large time series data.  I found the notes on Fuzzy Merges in &lt;A href="http://www2.sas.com/proceedings/sugi24/Advtutor/p46-24.pdf" target="_blank"&gt;http://www2.sas.com/proceedings/sugi24/Advtutor/p46-24.pdf&lt;/A&gt; to be quite useful in working out a solution.&lt;BR /&gt;
&lt;BR /&gt;
Assuming that your "lookup" table - Temp2 in KSharp's example code - has no overlapping time ranges, then this should work.  I used this method over something like an SQL join because both of my datasets are huge (about 5 million in Temp1 and about 100 million in Temp2).&lt;BR /&gt;
&lt;BR /&gt;
By using an interleave, I reckon you're only really doing the equivalent of processing both data sets in full, with a single parse.&lt;BR /&gt;
&lt;BR /&gt;
&lt;BR /&gt;
proc sort data=temp1;&lt;BR /&gt;
by id startdate;&lt;BR /&gt;
run;&lt;BR /&gt;
&lt;BR /&gt;
proc sort data=temp2;&lt;BR /&gt;
by id startdate;&lt;BR /&gt;
run;&lt;BR /&gt;
&lt;BR /&gt;
data want (drop=EndDate OtherValue);&lt;BR /&gt;
set&lt;BR /&gt;
temp2 (in=T2)&lt;BR /&gt;
temp1 (in=T1);&lt;BR /&gt;
by id startdate;&lt;BR /&gt;
&lt;BR /&gt;
format Last_StartDate date9.;&lt;BR /&gt;
format Last_EndDate date9.;&lt;BR /&gt;
format Last_OtherValue best12.0;&lt;BR /&gt;
&lt;BR /&gt;
retain Last_StartDate;&lt;BR /&gt;
retain Last_EndDate;&lt;BR /&gt;
retain Last_OtherValue;&lt;BR /&gt;
&lt;BR /&gt;
/*if lookup dataset, then copy the values*/&lt;BR /&gt;
if t2 then do;&lt;BR /&gt;
    Last_StartDate = StartDate;&lt;BR /&gt;
    Last_EndDate = EndDate;&lt;BR /&gt;
    Last_OtherValue = OtherValue;&lt;BR /&gt;
end;&lt;BR /&gt;
/*if the first instance of ID is in table 1, or T2's range has been past, then clear the values*/&lt;BR /&gt;
if (t1 and (first.ID or Last_EndDate &amp;lt; StartDate)) then do;&lt;BR /&gt;
    Last_StartDate = .;&lt;BR /&gt;
    Last_EndDate = .;&lt;BR /&gt;
    Last_OtherValue = .;&lt;BR /&gt;
end;&lt;BR /&gt;
&lt;BR /&gt;
if t1 then output;&lt;BR /&gt;
&lt;BR /&gt;
run;</description>
      <pubDate>Tue, 14 Jun 2011 16:05:07 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Dates-and-Merging/m-p/9922#M840</guid>
      <dc:creator>DF</dc:creator>
      <dc:date>2011-06-14T16:05:07Z</dc:date>
    </item>
    <item>
      <title>Re: Dates and Merging.</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Dates-and-Merging/m-p/9923#M841</link>
      <description>Yes. I suppose there is unique id in temp1.&lt;BR /&gt;
If you have duplicate ids ,then sql 's Cartesian Product is good choice.&lt;BR /&gt;
[pre]&lt;BR /&gt;
&lt;BR /&gt;
&lt;BR /&gt;
&lt;BR /&gt;
&lt;BR /&gt;
data temp1;&lt;BR /&gt;
infile datalines dlm=' ,';&lt;BR /&gt;
input id $ StartDate : date9. Value;&lt;BR /&gt;
format StartDate date9.;&lt;BR /&gt;
datalines;&lt;BR /&gt;
001,01JUL2010, 6050&lt;BR /&gt;
001,01OCT2009, 6050&lt;BR /&gt;
002,27MAY2010, 9050&lt;BR /&gt;
003, 11DEC2010, 16500&lt;BR /&gt;
;&lt;BR /&gt;
run;&lt;BR /&gt;
data temp2;&lt;BR /&gt;
infile datalines dlm=' ,';&lt;BR /&gt;
input id $ StartDate : date9. EndDate : date9. OtherValue;&lt;BR /&gt;
datalines;&lt;BR /&gt;
001, 01SEP2009,17DEC2009, 7500&lt;BR /&gt;
001, 18DEC2009,29APR2010, 7000&lt;BR /&gt;
001, 30APR2010,31DEC2010, 6000&lt;BR /&gt;
002, 10MAY2010,16AUG2010, 11000&lt;BR /&gt;
002, 17AUG2010,26NOV2010,10500&lt;BR /&gt;
002, 27NOV2010,11APR2011,9500&lt;BR /&gt;
002, 12APR2011,31DEC2010,9000&lt;BR /&gt;
003, 01DEC2010,12JAN2011,20000&lt;BR /&gt;
003, 13JAN2011, 25MAY2011,19500&lt;BR /&gt;
003, 26MAY2011, 31DEC9999, 18990&lt;BR /&gt;
;&lt;BR /&gt;
run;&lt;BR /&gt;
&lt;BR /&gt;
proc sql;&lt;BR /&gt;
 create table want as&lt;BR /&gt;
  select a.id as id,b.startdate as startdate,value,othervalue&lt;BR /&gt;
   from temp2 as a,temp1 as b&lt;BR /&gt;
    where a.id = b.id and b.startdate between a.startdate and a.enddate&lt;BR /&gt;
     ;&lt;BR /&gt;
     quit;&lt;BR /&gt;
[/pre]&lt;BR /&gt;
&lt;BR /&gt;
&lt;BR /&gt;
Ksharp</description>
      <pubDate>Thu, 16 Jun 2011 03:13:31 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Dates-and-Merging/m-p/9923#M841</guid>
      <dc:creator>Ksharp</dc:creator>
      <dc:date>2011-06-16T03:13:31Z</dc:date>
    </item>
  </channel>
</rss>

