<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Matching cases and controls using PROC SQL in SAS Health and Life Sciences</title>
    <link>https://communities.sas.com/t5/SAS-Health-and-Life-Sciences/Matching-cases-and-controls-using-PROC-SQL/m-p/261991#M2060</link>
    <description>Perhaps you could try select distinct. &lt;BR /&gt;It would probably use as much resources as the following proc sort does, but would limit the output from he SQL if I interpret your scenario correctly.</description>
    <pubDate>Thu, 07 Apr 2016 04:30:44 GMT</pubDate>
    <dc:creator>LinusH</dc:creator>
    <dc:date>2016-04-07T04:30:44Z</dc:date>
    <item>
      <title>Matching cases and controls using PROC SQL</title>
      <link>https://communities.sas.com/t5/SAS-Health-and-Life-Sciences/Matching-cases-and-controls-using-PROC-SQL/m-p/261542#M2059</link>
      <description>&lt;P&gt;Hi,&lt;BR /&gt;I've been working on case-control syntax and tried using the SAS PDF found here: &lt;A href="http://www2.sas.com/proceedings/sugi29/173-29.pdf" target="_blank"&gt;http://www2.sas.com/proceedings/sugi29/173-29.pdf&lt;/A&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Unfortunately my size is vastly larger (~170000 cases and more than 3 million controls) and this syntax is creating a file size much larger than necessary. I'm hoping to use a non-macro based syntax since my SAS has trouble running macros.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I'd like to use some sort of PROC SQL syntax, but I really don't know how to make it more efficient in the beginning and match controls to cases right away without creating a ton of duplicates per each case. Any suggestions or ideas? Anything would be immensely helpful at this point. Thank you!&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Syntax:&lt;/P&gt;&lt;P&gt;PROC SQL;&lt;/P&gt;&lt;P&gt;CREATE table controls_id&lt;/P&gt;&lt;P&gt;as select&lt;/P&gt;&lt;P&gt;one.ID as&lt;/P&gt;&lt;P&gt;study_id,&lt;/P&gt;&lt;P&gt;two.ID as control_id,&lt;/P&gt;&lt;P&gt;one.age as study_age,&lt;/P&gt;&lt;P&gt;two.age as control_age,&lt;/P&gt;&lt;P&gt;one.race as study_race,&lt;/P&gt;&lt;P&gt;two.race as control_race,&lt;/P&gt;&lt;P&gt;one.rand_num as rand_num&lt;/P&gt;&lt;P&gt;from study one, control two&lt;/P&gt;&lt;P&gt;where (one.age=two.age and&lt;/P&gt;&lt;P&gt;one.race=two.race);&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;* Remove duplicate control subjects;&lt;/P&gt;&lt;P&gt;proc sort data=controls_id nodupkey;&lt;/P&gt;&lt;P&gt;by control_id rand_num;&lt;/P&gt;&lt;P&gt;run;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;*exactly match on variables with fixed number of controls;&lt;/P&gt;&lt;P&gt;proc sort data=controls_id ;&lt;/P&gt;&lt;P&gt;by study_id rand_num;&lt;/P&gt;&lt;P&gt;run;&lt;/P&gt;&lt;P&gt;data controls_id2 not_enough;&lt;/P&gt;&lt;P&gt;set controls_id;&lt;/P&gt;&lt;P&gt;by study_id ;&lt;/P&gt;&lt;P&gt;retain num;&lt;/P&gt;&lt;P&gt;if first.study_id then num=1;&lt;/P&gt;&lt;P&gt;if num le 2 then do;&lt;/P&gt;&lt;P&gt;output controls_id2;&lt;/P&gt;&lt;P&gt;num=num+1;&lt;/P&gt;&lt;P&gt;end;&lt;/P&gt;&lt;P&gt;if last.study_id then do;&lt;/P&gt;&lt;P&gt;if num le 2 then output not_enough;&lt;/P&gt;&lt;P&gt;end;&lt;/P&gt;&lt;P&gt;run;&lt;/P&gt;&lt;P&gt;proc print data=controls_id2&lt;/P&gt;&lt;P&gt;(obs=40)&lt;/P&gt;&lt;P&gt;;&lt;/P&gt;&lt;P&gt;title2 'matched patients';&lt;/P&gt;&lt;P&gt;run;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;*use following syntax to remove cases that do not have two controls;&lt;/P&gt;&lt;P&gt;data controls_id3;&lt;/P&gt;&lt;P&gt;merge controls_id2&lt;/P&gt;&lt;P&gt;not_enough(in=b_);&lt;/P&gt;&lt;P&gt;by study_id;&lt;/P&gt;&lt;P&gt;if b_ then delete;&lt;/P&gt;&lt;P&gt;run;&lt;/P&gt;</description>
      <pubDate>Tue, 05 Apr 2016 21:29:33 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Health-and-Life-Sciences/Matching-cases-and-controls-using-PROC-SQL/m-p/261542#M2059</guid>
      <dc:creator>lim_6</dc:creator>
      <dc:date>2016-04-05T21:29:33Z</dc:date>
    </item>
    <item>
      <title>Re: Matching cases and controls using PROC SQL</title>
      <link>https://communities.sas.com/t5/SAS-Health-and-Life-Sciences/Matching-cases-and-controls-using-PROC-SQL/m-p/261991#M2060</link>
      <description>Perhaps you could try select distinct. &lt;BR /&gt;It would probably use as much resources as the following proc sort does, but would limit the output from he SQL if I interpret your scenario correctly.</description>
      <pubDate>Thu, 07 Apr 2016 04:30:44 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Health-and-Life-Sciences/Matching-cases-and-controls-using-PROC-SQL/m-p/261991#M2060</guid>
      <dc:creator>LinusH</dc:creator>
      <dc:date>2016-04-07T04:30:44Z</dc:date>
    </item>
    <item>
      <title>Re: Matching cases and controls using PROC SQL</title>
      <link>https://communities.sas.com/t5/SAS-Health-and-Life-Sciences/Matching-cases-and-controls-using-PROC-SQL/m-p/263878#M2063</link>
      <description>&lt;P&gt;Consider doing propensity score matching.&amp;nbsp; This paper has some nice optimization techniques&lt;/P&gt;&lt;P&gt;&lt;A href="http://www.ucdenver.edu/academics/colleges/PublicHealth/resourcesfor/Faculty/perraillon/code/Pages/propensity-scores-sas.aspx" target="_blank"&gt;http://www.ucdenver.edu/academics/colleges/PublicHealth/resourcesfor/Faculty/perraillon/code/Pages/propensity-scores-sas.aspx&lt;/A&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Thu, 14 Apr 2016 13:56:06 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Health-and-Life-Sciences/Matching-cases-and-controls-using-PROC-SQL/m-p/263878#M2063</guid>
      <dc:creator>Doc_Duke</dc:creator>
      <dc:date>2016-04-14T13:56:06Z</dc:date>
    </item>
  </channel>
</rss>

