<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Using PROC FORMAT to help reduce run times when using a LEFT JOIN with much larger RIGHT TABLES. in SAS Software for Learning Community</title>
    <link>https://communities.sas.com/t5/SAS-Software-for-Learning/Using-PROC-FORMAT-to-help-reduce-run-times-when-using-a-LEFT/m-p/879190#M1237</link>
    <description>&lt;P&gt;I was reading a May 10&lt;SUP&gt;th&lt;/SUP&gt;, 2023 SAS-L&lt;SUP&gt;1&lt;/SUP&gt; email digest about problems merging very large data sets (or tables). One poster reported long run times with both merges and hash joins, and asked how to speed up merges taking 6 hours or more and hash joins taking 8 hours.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;DIV&gt;&lt;SPAN&gt;In my experience, there have been many cases over the years where data merges and/or SQL joins took far too long, sometimes 2 or more days to sort and merge 2&amp;nbsp;files. Replacing one such 2-day run with a USER FORMAT lookup resulted in a 10-minute&amp;nbsp;merge. This is a common issue in epidemiology and consumer data analysis. With a USER FORMAT, the larger file is read once without having to be sorted, producing a much smaller file that can then be sorted quickly.&lt;/SPAN&gt;&lt;/DIV&gt;
&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;/DIV&gt;
&lt;DIV&gt;&lt;SPAN&gt;I have published a LinkedIn post that shows some examples of this issue, along with a few references on PROC FORMAT:&lt;/SPAN&gt;&lt;/DIV&gt;
&lt;DIV&gt;&lt;SPAN&gt;&lt;A title="Click to open in a new window or tab
https://jonasbilenascom.wpcomstaging.com/wp-content/uploads/2023/05/Using_SAS_User_Formats_for_Large_Data_Mergers.pdf" href="https://jonasbilenascom.wpcomstaging.com/wp-content/uploads/2023/05/Using_SAS_User_Formats_for_Large_Data_Mergers.pdf" target="_blank"&gt;https://jonasbilenascom.wpcomstaging.com/wp-content/uploads/2023/05/Using_SAS_User_Formats_for_Large_Data_Mergers.pdf&lt;/A&gt;&lt;/SPAN&gt;&lt;/DIV&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
    <pubDate>Sun, 04 Jun 2023 21:13:13 GMT</pubDate>
    <dc:creator>jbilenas</dc:creator>
    <dc:date>2023-06-04T21:13:13Z</dc:date>
    <item>
      <title>Using PROC FORMAT to help reduce run times when using a LEFT JOIN with much larger RIGHT TABLES.</title>
      <link>https://communities.sas.com/t5/SAS-Software-for-Learning/Using-PROC-FORMAT-to-help-reduce-run-times-when-using-a-LEFT/m-p/879190#M1237</link>
      <description>&lt;P&gt;I was reading a May 10&lt;SUP&gt;th&lt;/SUP&gt;, 2023 SAS-L&lt;SUP&gt;1&lt;/SUP&gt; email digest about problems merging very large data sets (or tables). One poster reported long run times with both merges and hash joins, and asked how to speed up merges taking 6 hours or more and hash joins taking 8 hours.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;DIV&gt;&lt;SPAN&gt;In my experience, there have been many cases over the years where data merges and/or SQL joins took far too long, sometimes 2 or more days to sort and merge 2&amp;nbsp;files. Replacing one such 2-day run with a USER FORMAT lookup resulted in a 10-minute&amp;nbsp;merge. This is a common issue in epidemiology and consumer data analysis. With a USER FORMAT, the larger file is read once without having to be sorted, producing a much smaller file that can then be sorted quickly.&lt;/SPAN&gt;&lt;/DIV&gt;
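&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;A minimal sketch of the idea, not the code from the linked PDF. It assumes two hypothetical data sets, WORK.SMALL (the left table) and WORK.BIG (the much larger right table), that share a character key named ID, and that ID is unique in WORK.SMALL. The format name $INLEFT is also made up for illustration.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;/* All data set, variable, and format names here are hypothetical. */

/* 1. Collect the distinct keys of the small (left) table. */
proc sort data=work.small(keep=id) out=work.keys nodupkey;
  by id;
run;

/* 2. Build a CNTLIN data set: every key maps to 'Y', anything else to 'N'. */
data work.cntlin;
  set work.keys(rename=(id=start)) end=last;
  retain fmtname '$inleft' label 'Y';
  output;
  if last then do;
    hlo = 'O';     /* OTHER range: keys that are not in the small table */
    label = 'N';
    output;
  end;
run;

/* 3. Create the character format from the CNTLIN data set. */
proc format cntlin=work.cntlin;
run;

/* 4. Read the big table once, unsorted, keeping only rows with a matching key. */
data work.big_subset;
  set work.big;
  if put(id, $inleft.) = 'Y';
run;

/* 5. Sort the much smaller subset and merge it back (left-join behaviour). */
proc sort data=work.small;
  by id;
run;

proc sort data=work.big_subset;
  by id;
run;

data work.joined;
  merge work.small(in=in_small) work.big_subset;
  by id;
  if in_small;
run;
&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;The expensive sort now runs only on WORK.BIG_SUBSET rather than on the full right table. The sketch also assumes that the key list of the small table fits comfortably in memory as a format, and that the two tables share no non-key variable names that the MERGE would overwrite.&lt;/P&gt;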
&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;/DIV&gt;
&lt;DIV&gt;&lt;SPAN&gt;I have published a LinkedIn post that shows some examples of this issue, along with a few references on PROC FORMAT:&lt;/SPAN&gt;&lt;/DIV&gt;
&lt;DIV&gt;&lt;SPAN&gt;&lt;A title="Click to open in a new window or tab
https://jonasbilenascom.wpcomstaging.com/wp-content/uploads/2023/05/Using_SAS_User_Formats_for_Large_Data_Mergers.pdf" href="https://jonasbilenascom.wpcomstaging.com/wp-content/uploads/2023/05/Using_SAS_User_Formats_for_Large_Data_Mergers.pdf" target="_blank"&gt;https://jonasbilenascom.wpcomstaging.com/wp-content/uploads/2023/05/Using_SAS_User_Formats_for_Large_Data_Mergers.pdf&lt;/A&gt;&lt;/SPAN&gt;&lt;/DIV&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Sun, 04 Jun 2023 21:13:13 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Software-for-Learning/Using-PROC-FORMAT-to-help-reduce-run-times-when-using-a-LEFT/m-p/879190#M1237</guid>
      <dc:creator>jbilenas</dc:creator>
      <dc:date>2023-06-04T21:13:13Z</dc:date>
    </item>
    <item>
      <title>Re: Using PROC FORMAT to help reduce run times when using a LEFT JOIN with much larger RIGHT TABLES.</title>
      <link>https://communities.sas.com/t5/SAS-Software-for-Learning/Using-PROC-FORMAT-to-help-reduce-run-times-when-using-a-LEFT/m-p/879193#M1238</link>
      <description>&lt;P&gt;If the reason for the joins is a simple code/decode lookup, then formats (or informats) should always be considered.&amp;nbsp; And if the mapping is many-to-one (for example, age ranges), then formats should require less memory than a hash object, which would require exact matches.&lt;/P&gt;</description>
      <pubDate>Sun, 04 Jun 2023 22:25:22 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Software-for-Learning/Using-PROC-FORMAT-to-help-reduce-run-times-when-using-a-LEFT/m-p/879193#M1238</guid>
      <dc:creator>Tom</dc:creator>
      <dc:date>2023-06-04T22:25:22Z</dc:date>
    </item>
  </channel>
</rss>

