<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Writing data to Hadoop by using SAS libname(Performance issue) in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/Writing-data-to-Hadoop-by-using-SAS-libname-Performance-issue/m-p/403859#M278955</link>
    <description>&lt;P&gt;Hello All,&lt;/P&gt;
&lt;P&gt;I am connecting to Hadoop and writing a SAS dataset to it using a LIBNAME statement.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;libname hdptgt hadoop server=&amp;amp;server port=10000 schema=sample config="&amp;amp;hadoop_config_file"; /* parameters passed from unix */

/** sas code **/
data hdptgt.main_table;
  merge main_table sub_table;
  by rec_id;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;Log output:&lt;/P&gt;
&lt;PRE&gt;NOTE: There were 290000000 observations read from the data set WORK.MAIN_TABLE.
NOTE: There were 10000000 observations read from the data set WORK.SUB_TABLE.
NOTE: The data set HDP.MAIN_TABLE has 290000000 observations and 50 variables.

real time           8:30:04.19
cpu time            34:31.04&lt;/PRE&gt;
&lt;P&gt;This takes around 8 hrs 30 mins. Is there anything I could do to make this run faster? Any help would be appreciated.&lt;/P&gt;</description>
    <pubDate>Fri, 13 Oct 2017 10:56:50 GMT</pubDate>
    <dc:creator>GunnerEP</dc:creator>
    <dc:date>2017-10-13T10:56:50Z</dc:date>
    <item>
      <title>Writing data to Hadoop by using SAS libname(Performance issue)</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Writing-data-to-Hadoop-by-using-SAS-libname-Performance-issue/m-p/403859#M278955</link>
      <description>&lt;P&gt;Hello All,&lt;/P&gt;
&lt;P&gt;I am connecting to Hadoop and writing a SAS dataset to it using a LIBNAME statement.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;libname hdptgt hadoop server=&amp;amp;server port=10000 schema=sample config="&amp;amp;hadoop_config_file"; /* parameters passed from unix */

/** sas code **/
data hdptgt.main_table;
  merge main_table sub_table;
  by rec_id;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;Log output:&lt;/P&gt;
&lt;PRE&gt;NOTE: There were 290000000 observations read from the data set WORK.MAIN_TABLE.
NOTE: There were 10000000 observations read from the data set WORK.SUB_TABLE.
NOTE: The data set HDP.MAIN_TABLE has 290000000 observations and 50 variables.

real time           8:30:04.19
cpu time            34:31.04&lt;/PRE&gt;
&lt;P&gt;This takes around 8 hrs 30 mins. Is there anything I could do to make this run faster? Any help would be appreciated.&lt;/P&gt;</description>
      <pubDate>Fri, 13 Oct 2017 10:56:50 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Writing-data-to-Hadoop-by-using-SAS-libname-Performance-issue/m-p/403859#M278955</guid>
      <dc:creator>GunnerEP</dc:creator>
      <dc:date>2017-10-13T10:56:50Z</dc:date>
    </item>
    <item>
      <title>Re: Writing data to Hadoop by using SAS libname(Performance issue)</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Writing-data-to-Hadoop-by-using-SAS-libname-Performance-issue/m-p/403862#M278956</link>
      <description>&lt;P&gt;Typically, one would use BULKLOAD to speed up RDBMS write operations.&lt;/P&gt;
&lt;P&gt;Unfortunately, for Hive this is only syntactic support: the same underlying load process is used either way.&lt;/P&gt;
&lt;P&gt;I would start by adding&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;options msglevel=i sastrace=',,,d' sastraceloc=saslog nostsuffix;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;to your program to better analyze what's going on on the Hive side.&lt;/P&gt;
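&lt;P&gt;As a rough sketch of where that statement could sit, reusing the step from the question: the OPTIONS line goes before the DATA step, so the calls SAS/ACCESS sends to Hive are echoed to the SAS log while the step runs. The final &lt;CODE&gt;sastrace=off&lt;/CODE&gt; line is an addition here, assuming you want tracing switched off again afterwards.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;/* turn tracing on before the step that writes to Hive */
options msglevel=i sastrace=',,,d' sastraceloc=saslog nostsuffix;

/* the step from the question, unchanged */
data hdptgt.main_table;
  merge main_table sub_table;
  by rec_id;
run;

/* assumption: switch tracing back off once the log has been captured */
options sastrace=off;&lt;/CODE&gt;&lt;/PRE&gt;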
&lt;P&gt;Other than that, I think this is a matter of HDFS/Hive optimization (provided you can rule out network bottlenecks, or local SAS session bottlenecks during the read/merge operation).&lt;/P&gt;</description>
      <pubDate>Fri, 13 Oct 2017 11:18:32 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Writing-data-to-Hadoop-by-using-SAS-libname-Performance-issue/m-p/403862#M278956</guid>
      <dc:creator>LinusH</dc:creator>
      <dc:date>2017-10-13T11:18:32Z</dc:date>
    </item>
    <item>
      <title>Re: Writing data to Hadoop by using SAS libname(Performance issue)</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Writing-data-to-Hadoop-by-using-SAS-libname-Performance-issue/m-p/404873#M278957</link>
      <description>&lt;P&gt;Unfortunately, that doesn't make much of a difference. I found this as well:&lt;/P&gt;&lt;P&gt;&lt;A href="http://support.sas.com/documentation/cdl/en/acreldb/65247/HTML/default/viewer.htm#n0mnrn0q9n41atn194mujpi4zel9.htm" target="_blank"&gt;http://support.sas.com/documentation/cdl/en/acreldb/65247/HTML/default/viewer.htm#n0mnrn0q9n41atn194mujpi4zel9.htm&lt;/A&gt;&lt;/P&gt;</description>
      <pubDate>Tue, 17 Oct 2017 16:35:08 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Writing-data-to-Hadoop-by-using-SAS-libname-Performance-issue/m-p/404873#M278957</guid>
      <dc:creator>GunnerEP</dc:creator>
      <dc:date>2017-10-17T16:35:08Z</dc:date>
    </item>
  </channel>
</rss>

