
Memory setup for HIVE in SAS

Frequent Contributor
Posts: 124


Hi SAS users,

 

I am trying to create a table using PROC SQL, connecting to Hadoop to pull data from Hive tables. I am running into a memory issue and have been asked to add SET mapreduce.map.memory.mb=6000; to my SAS code. I tried adding it as a connection option, and it produces an error.

Can anyone suggest where and how to specify memory settings for Hive data pulls?

 

CONNECT TO HADOOP (user=&user. password=&password. server="XXXX" port=10039 subprotocol=hive2 DBMAX_TEXT=80
    mapreduce.map.memory.mb=6000 );
ERROR: Invalid option name mapreduce.map.memory.mb.

 

Thanks,

Ana

Super User
Posts: 3,927

Re: Memory setup for HIVE in SAS

I suspect you need to use the PROPERTIES= option, for example properties='MyProperties', as described at this link:

http://documentation.sas.com/?docsetId=acreldb&docsetTarget=p0ly2onqqpbys8n1j9lra8qa6q20.htm&docsetV...

 

While the link is for the LIBNAME statement, you could try it in a CONNECT statement.

Contributor
Posts: 48

Re: Memory setup for HIVE in SAS

SASKiwi is correct. You'll need to add the mapreduce option using PROPERTIES. See example below:

 

 

PROPERTIES="hive.fetch.task.conversion=minimal;hive.fetch.task.conversion.threshold=-1";

For example:

proc sql;
     connect to hadoop ......
        PROPERTIES="hive.groupby.orderby.position.alias=true");

     /* The query in parentheses is passed to Hive as written */
     select * from connection to hadoop
        (
         select X_FACILITY_OFFER_CD,
                count(*) as count,
                sum(X_FACILITY_OFFERED_AMT) as X_FACILITY_OFFERED_AMT_SUM,
                sum(X_FACILITY_OFFER_SEQ_NO) as X_FACILITY_OFFER_SEQ_NO_SUM
         from UC5_TEST3A
         group by 1
         order by 1
        );
     disconnect from hadoop;
quit;

 

 

The PROPERTIES option can be added either on the LIBNAME statement or in the Hadoop connection string for explicit pass-through, which is what you used in your example.
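
Applied to the original question, a minimal sketch of both placements might look like the following. It reuses the connection values from the first post and moves mapreduce.map.memory.mb=6000 into PROPERTIES. The libref hdp, the output table work.hive_pull and the Hive table my_hive_table are hypothetical names used only for illustration, and I have not run this against your environment.

```sas
/* Option 1: set the property on the LIBNAME statement.               */
/* The libref "hdp" is a hypothetical name.                           */
libname hdp hadoop user=&user. password=&password. server="XXXX"
        port=10039 subprotocol=hive2 DBMAX_TEXT=80
        PROPERTIES="mapreduce.map.memory.mb=6000";

/* Option 2: set it in the CONNECT statement (explicit pass-through). */
proc sql;
    connect to hadoop (user=&user. password=&password. server="XXXX"
                       port=10039 subprotocol=hive2 DBMAX_TEXT=80
                       PROPERTIES="mapreduce.map.memory.mb=6000");

    /* my_hive_table and work.hive_pull are hypothetical names. */
    create table work.hive_pull as
        select * from connection to hadoop
            (select * from my_hive_table);

    disconnect from hadoop;
quit;
```

If you need more than one property, separate them with semicolons inside the same quoted string, as in the hive.fetch.task.conversion example above.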

 

 

Also, depending on how the Hadoop environment has been set up, altering the memory for a map task via SAS code may not actually change it, because the setting could be locked down by the Hadoop administrator.
