<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Select the observations falling in the predefined date range from three different datasets in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/Select-the-observations-falling-in-the-predefined-date-range/m-p/503396#M134506</link>
    <description>Thank you very much Frankie. Worked out well!</description>
    <pubDate>Thu, 11 Oct 2018 13:08:13 GMT</pubDate>
    <dc:creator>Cruise</dc:creator>
    <dc:date>2018-10-11T13:08:13Z</dc:date>
    <item>
      <title>Select the observations falling in the predefined date range from three different datasets</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Select-the-observations-falling-in-the-predefined-date-range/m-p/503214#M134419</link>
      <description>&lt;P&gt;Hi SAS experts,&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P class="xmsonormal"&gt;I have an ambulance data (ambualnce_date) which has the&amp;nbsp;admission date (amb_date) and insurance_id. Inpatient data has the date of hospitalization (inp_date) and insurance_id. I also have patient_id file which uniquely identifies patients. My actual data sets are large where patient_id data has about 100k while inpatient and ambulance data has&amp;nbsp;millions of observations for each claims. However, I'd like to solve the problem for only two cases and move forward.&amp;nbsp;&lt;/P&gt;
&lt;P class="xmsonormal"&gt;&amp;nbsp;&lt;/P&gt;
&lt;P class="xmsonormal"&gt;My objective: Select the date range of 15 days before and after the date_daignosis. &lt;BR /&gt;Final dataset should include, for example, from 09/1/2011 through 09/30/2011 because &lt;BR /&gt;9/15/2011 is the mid-point. While excluding any dates fell out of this 30-day range.&lt;/P&gt;
&lt;P class="xmsonormal"&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;My original post said: same month and year however sep 1 and sep 30 would fall in the same &lt;BR /&gt;month and year as same as sep 1 and sep 2. So, maybe, using 15 days before and after the &lt;BR /&gt;mid-point which is date_diagnosis would make more sense.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Dataset wanted for final output:&lt;/P&gt;
&lt;TABLE width="427"&gt;
&lt;TBODY&gt;
&lt;TR&gt;
&lt;TD width="31"&gt;id&lt;/TD&gt;
&lt;TD width="74"&gt;amb_date&lt;/TD&gt;
&lt;TD width="75"&gt;inp-date&lt;/TD&gt;
&lt;TD width="102"&gt;date_diagnosis&lt;/TD&gt;
&lt;TD width="71"&gt;nereast&lt;/TD&gt;
&lt;TD width="74"&gt;source&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TD&gt;1&lt;/TD&gt;
&lt;TD&gt;9/15/2011&lt;/TD&gt;
&lt;TD&gt;9/15/2011&lt;/TD&gt;
&lt;TD&gt;9/15/2011&lt;/TD&gt;
&lt;TD&gt;9/15/2011&lt;/TD&gt;
&lt;TD&gt;both&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TD&gt;1&lt;/TD&gt;
&lt;TD&gt;9/15/2011&lt;/TD&gt;
&lt;TD&gt;9/1/2011&lt;/TD&gt;
&lt;TD&gt;9/15/2011&lt;/TD&gt;
&lt;TD&gt;9/15/2011&lt;/TD&gt;
&lt;TD&gt;ambulance&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TD&gt;1&lt;/TD&gt;
&lt;TD&gt;9/30/2011&lt;/TD&gt;
&lt;TD&gt;9/10/2011&lt;/TD&gt;
&lt;TD&gt;9/15/2011&lt;/TD&gt;
&lt;TD&gt;9/10/2011&lt;/TD&gt;
&lt;TD&gt;inpatient&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TD&gt;1&lt;/TD&gt;
&lt;TD&gt;9/1/2011&lt;/TD&gt;
&lt;TD&gt;9/28/2011&lt;/TD&gt;
&lt;TD&gt;9/15/2011&lt;/TD&gt;
&lt;TD&gt;9/28/2011&lt;/TD&gt;
&lt;TD&gt;inpatient&lt;/TD&gt;
&lt;/TR&gt;
&lt;/TBODY&gt;
&lt;/TABLE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;This is the continuation of the post here:&lt;/P&gt;
&lt;P&gt;&lt;A href="https://communities.sas.com/t5/SAS-Programming/Logical-decisions-based-on-the-before-and-after-certain-date/m-p/503146#M134400" target="_blank"&gt;https://communities.sas.com/t5/SAS-Programming/Logical-decisions-based-on-the-before-and-after-certain-date/m-p/503146#M134400&lt;/A&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;but with slight change in the condition set for the selection of the date range.&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;BR /&gt;The problem so far is solved with the condition meeting with same month and year in the link specified above thanks to&amp;nbsp;&lt;SPAN class=""&gt;&lt;A id="link_13" class="lia-link-navigation lia-page-link lia-user-name-link" href="https://communities.sas.com/t5/user/viewprofilepage/user-id/238387" target="_self"&gt;FrankieNg&lt;/A&gt;&amp;nbsp;and&amp;nbsp;&lt;SPAN class="login-bold"&gt;&lt;A id="link_19" class="lia-link-navigation lia-page-link lia-user-name-link" href="https://communities.sas.com/t5/user/viewprofilepage/user-id/77163" target="_self"&gt;Oligolas&lt;/A&gt;.&amp;nbsp;&lt;/SPAN&gt;&lt;/SPAN&gt;&amp;nbsp;And I'll greatly appreciate if you help me figure the problem with the new condition for the date range. I was considering of using BETWEEN function in PROC SQL with the predefined date range (BETWEEN date_diagnosis-15 and date_diagnosis+15).&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Any hints, pointers, ideas or direct help in the code will be appreciated.Thanks in advance.&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data ambulance_date; 
input insurance_id amb_date;
cards;  
1 16576 
1 17616 
1 17542 
1 17814 
1 17575 
1 17541 
1 17822 
1 17703 
1 18184 
1 17532 
1 17972 
1 17935 
2 16989 
1 17265 
2 16997 
2 17008 
1 18231 
2 16997 
2 17000 
1 17056 
1 16831 
2 16997 
2 17008 
1 18162 
1 18024 
1 18276 
1 17773 
1 18299 
1 18387 
1 18609 
1 18486 
1 18486 
1 18486 
1 18486 
1 18571 
1 18488 
1 18585 
1 18601 
1 17527 
1 18722 
1 18907 
1 18869 
1 18808 
1 18760 
1 18760 
1 18760 
1 18821 
1 18903 
1 18954 
1 18136 
2 16974 
2 16974 
2 16974 
2 17010 
1 17036 
1 18455 
1 18956 
;

data inpatient_date; 
input insurance_id inp_date; 
cards;
1 16623 
1 16692 
1 16726 
1 17712 
1 16709 
1 17541 
1 17827 
1 18110 
1 17790 
2 16974 
2 16939 
1 16952 
2 16966 
1 17036 
2 16981 
2 16974 
2 16974 
1 17616 
1 17546 
1 17240 
1 17245 
1 17689 
2 16940 
1 17895 
2 17000 
1 17736 
1 17543 
1 17672 
1 18024 
1 18175 
1 17819 
1 18184 
1 18122 
1 18231 
1 17500 
1 18299 
2 16997 
2 16997 
2 16997 
2 17008 
2 16989 
1 18695 
1 18722 
1 18722 
1 18309 
1 18309 
1 18869 
1 18836 
1 18110 
1 18110 
1 18956 
1 18321 
1 18760 
1 18299 
1 18299 
1 18299 
1 18455 
1 18455 
1 18534 
1 18857 
1 18857 
1 18760 
1 18884 
1 17988 
1 17972 
1 18895
; 

data patient_id; 
input pat_id $ insurance_id date_death date_diagnosis;
cards;
PAT009 1 18983 18884 
PAT092 2 17061 16940 
;

PROC SQL;
CREATE TABLE PAT_AMB_INP AS SELECT
A.pat_id,
A.insurance_id,
A.date_death AS date_death FORMAT=DATE9.,
A.date_diagnosis AS date_diagnosis FORMAT=DATE9.,
B.amb_date AS amb_date FORMAT=DATE9.,
C.inp_date AS inp_date FORMAT=DATE9.,
MIN(amb_date,inp_date) AS NEAREST_DATE FORMAT=DATE9.	/* select the nearest of the ambulance_date and inp_date to the date_diagnosis */
FROM patient_id AS A
LEFT JOIN ambulance_date AS B ON A.insurance_id = B.insurance_id 
		  AND SUBSTR(PUT(A.date_diagnosis,DATE9.),3,9) = SUBSTR(PUT(B.amb_date,DATE9.),3,9)	/* find records in the ambulance data where admission date (ambulance_date) equals or in the same month and year with (date_diagnosis) */
		  AND A.date_diagnosis &amp;lt;= B.amb_date 													/* consider only if ambulance and inpatient dates happen after date-diagnosis */
LEFT JOIN inpatient_date AS C ON A.insurance_id = C.insurance_id 
		  AND SUBSTR(PUT(A.date_diagnosis,DATE9.),3,9) = SUBSTR(PUT(C.inp_date,DATE9.),3,9)	/* find records in the inpatient data where date of hospitalization to inpatient dept (inp_date) equals or in the same month and year with (date_diagnosis) */
		  AND A.date_diagnosis &amp;lt;= C.inp_date													/* consider only if ambulance and inpatient dates happen after date-diagnosis */
;
QUIT;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Wed, 10 Oct 2018 21:43:11 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Select-the-observations-falling-in-the-predefined-date-range/m-p/503214#M134419</guid>
      <dc:creator>Cruise</dc:creator>
      <dc:date>2018-10-10T21:43:11Z</dc:date>
    </item>
    <item>
      <title>Re: PROC SQL: using 'between' to select the observations falling in the predefined date range</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Select-the-observations-falling-in-the-predefined-date-range/m-p/503232#M134429</link>
      <description>&lt;P&gt;Your data appear to be sorted by id, but not date within id?&amp;nbsp; Is that true.&amp;nbsp; If your data are sorted by id then a data step solution may be a good deal faster than SQL.&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Wed, 10 Oct 2018 20:26:18 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Select-the-observations-falling-in-the-predefined-date-range/m-p/503232#M134429</guid>
      <dc:creator>mkeintz</dc:creator>
      <dc:date>2018-10-10T20:26:18Z</dc:date>
    </item>
    <item>
      <title>Re: PROC SQL: using 'between' to select the observations falling in the predefined date range</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Select-the-observations-falling-in-the-predefined-date-range/m-p/503234#M134430</link>
      <description>&lt;P&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/31461"&gt;@mkeintz&lt;/a&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Individual patients have multiple insurance id. However, once datasets are merged on the insurance_id (which is only linkage variable common across these three different datasets) then the final linked data should be sorted by pat_id. So, you're right, not sorted by date within pat_id.&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;do you have time to show me how in data step? maybe I should extend my sample datasets in this forum.&lt;/P&gt;</description>
      <pubDate>Wed, 10 Oct 2018 20:41:48 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Select-the-observations-falling-in-the-predefined-date-range/m-p/503234#M134430</guid>
      <dc:creator>Cruise</dc:creator>
      <dc:date>2018-10-10T20:41:48Z</dc:date>
    </item>
    <item>
      <title>Re: PROC SQL: using 'between' to select the observations falling in the predefined date range</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Select-the-observations-falling-in-the-predefined-date-range/m-p/503235#M134431</link>
      <description>&lt;P&gt;I will look at it later this evening if there is time then.&amp;nbsp; Busy now.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Wed, 10 Oct 2018 20:41:52 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Select-the-observations-falling-in-the-predefined-date-range/m-p/503235#M134431</guid>
      <dc:creator>mkeintz</dc:creator>
      <dc:date>2018-10-10T20:41:52Z</dc:date>
    </item>
    <item>
      <title>Re: PROC SQL: using 'between' to select the observations falling in the predefined date range</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Select-the-observations-falling-in-the-predefined-date-range/m-p/503237#M134433</link>
      <description>Thanks a lot. I'm updating my post with the larger sample datasets right now.</description>
      <pubDate>Wed, 10 Oct 2018 20:50:45 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Select-the-observations-falling-in-the-predefined-date-range/m-p/503237#M134433</guid>
      <dc:creator>Cruise</dc:creator>
      <dc:date>2018-10-10T20:50:45Z</dc:date>
    </item>
    <item>
      <title>Re: Select the observations falling in the predefined date range from three different datasets</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Select-the-observations-falling-in-the-predefined-date-range/m-p/503298#M134464</link>
      <description>&lt;P&gt;Hi Cruise,&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;When come to date range, we usually use INTNX function.&lt;/P&gt;&lt;P&gt;I've modified a bit of the data source, to show more observation from the output..&lt;/P&gt;&lt;P&gt;Hope this help.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data ambulance_date; 
input insurance_id amb_date;
cards;  
1 16576 
1 17616 
1 17542 
1 17814 
1 17575 
1 17541 
1 17822 
1 17703 
1 18184 
1 17532 
1 17972 
1 17935 
2 16989 
1 17265 
2 16997 
2 17008 
1 18231 
2 16997 
2 17000 
1 17056 
1 16831 
2 16940 
2 17008 
1 18162 
1 18024 
1 18276 
1 17773 
1 18299 
1 18387 
1 18609 
1 18486 
1 18486 
1 18486 
1 18486 
1 18571 
1 18488 
1 18585 
1 18601 
1 17527 
1 18722 
1 18907 
1 18869 
1 18808 
1 18760 
1 18760 
1 18760 
1 18821 
1 18903 
1 18954 
1 18136 
2 16925 
2 16974 
2 16955 
2 17010 
1 17036 
1 18455 
1 18956 
;

data inpatient_date; 
input insurance_id inp_date; 
cards;
1 16623 
1 16692 
1 16726 
1 17712 
1 16709 
1 17541 
1 17827 
1 18110 
1 17790 
2 16974 
2 16939 
1 16952 
2 16950 
1 17036 
2 16981 
2 16974 
2 16930 
1 17616 
1 17546 
1 17240 
1 17245 
1 17689 
2 16940 
1 17895 
2 17000 
1 17736 
1 17543 
1 17672 
1 18024 
1 18175 
1 17819 
1 18184 
1 18122 
1 18231 
1 17500 
1 18299 
2 16997 
2 16997 
2 16997 
2 17008 
2 16939 
1 18695 
1 18722 
1 18722 
1 18309 
1 18309 
1 18869 
1 18836 
1 18110 
1 18110 
1 18956 
1 18321 
1 18760 
1 18299 
1 18299 
1 18299 
1 18455 
1 18455 
1 18534 
1 18857 
1 18857 
1 18760 
1 18884 
1 17988 
1 17972 
1 18895
; 

data patient_id; 
input pat_id $ insurance_id date_death date_diagnosis;
cards;
PAT009 1 18983 18884 
PAT092 2 17061 16940 
;




%LET RANGE = 15;

PROC SQL;
CREATE TABLE PAT_AMB_INP AS SELECT
A.pat_id,
A.insurance_id,
A.date_death AS date_death FORMAT=DATE9.,
A.date_diagnosis AS date_diagnosis FORMAT=DATE9.,
B.amb_date AS amb_date FORMAT=DATE9.,
C.inp_date AS inp_date FORMAT=DATE9.,
ABS(INTCK('DAY', date_diagnosis, amb_date)) AS AMB_RANGE,
ABS(INTCK('DAY', date_diagnosis, inp_date)) AS INP_RANGE,
CASE WHEN CALCULATED AMB_RANGE = CALCULATED INP_RANGE THEN amb_date
	 WHEN CALCULATED AMB_RANGE &amp;gt; CALCULATED INP_RANGE THEN inp_date
	 WHEN CALCULATED AMB_RANGE &amp;lt; CALCULATED INP_RANGE THEN amb_date
	 END AS NEAREST_DATE FORMAT DATE9.,
CASE WHEN CALCULATED AMB_RANGE = CALCULATED INP_RANGE THEN 'BOTH'
	 WHEN CALCULATED AMB_RANGE &amp;gt; CALCULATED INP_RANGE THEN 'INPATIENT'
	 WHEN CALCULATED AMB_RANGE &amp;lt; CALCULATED INP_RANGE THEN 'AMBULANCE'
	 END AS SOURCE FORMAT $15.
FROM patient_id AS A
LEFT JOIN ambulance_date AS B ON A.insurance_id = B.insurance_id 
		  AND INTNX('DAY', A.date_diagnosis, -&amp;amp;RANGE) &amp;lt;= B.amb_date &amp;lt;= INTNX('DAY', A.date_diagnosis, &amp;amp;RANGE)	/* predefined date range (BETWEEN date_diagnosis-15 and date_diagnosis+15) */
LEFT JOIN inpatient_date AS C ON A.insurance_id = C.insurance_id 
		  AND INTNX('DAY', A.date_diagnosis, -&amp;amp;RANGE) &amp;lt;= C.inp_date &amp;lt;= INTNX('DAY', A.date_diagnosis, &amp;amp;RANGE)	/* predefined date range (BETWEEN date_diagnosis-15 and date_diagnosis+15) */
;
QUIT;&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Thu, 11 Oct 2018 02:59:13 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Select-the-observations-falling-in-the-predefined-date-range/m-p/503298#M134464</guid>
      <dc:creator>FrankieNg</dc:creator>
      <dc:date>2018-10-11T02:59:13Z</dc:date>
    </item>
    <item>
      <title>Re: Select the observations falling in the predefined date range from three different datasets</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Select-the-observations-falling-in-the-predefined-date-range/m-p/503396#M134506</link>
      <description>Thank you very much Frankie. Worked out well!</description>
      <pubDate>Thu, 11 Oct 2018 13:08:13 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Select-the-observations-falling-in-the-predefined-date-range/m-p/503396#M134506</guid>
      <dc:creator>Cruise</dc:creator>
      <dc:date>2018-10-11T13:08:13Z</dc:date>
    </item>
  </channel>
</rss>

