Hi SAS users,
I am trying to create a table using PROC SQL, connecting to Hadoop to pull data from Hive tables. I am getting a memory error and have been asked to SET mapreduce.map.memory.mb=6000; in my SAS code. I tried adding it as a connection option and it errors.
Can anyone suggest where and how to apply this memory setting for Hive data pulls?
CONNECT TO HADOOP (user=&user. password=&password. server="XXXX" port=10039 subprotocol=hive2 DBMAX_TEXT=80
                   mapreduce.map.memory.mb=6000);
ERROR: Invalid option name mapreduce.map.memory.mb.
Thanks,
Ana
I suspect you need to use the PROPERTIES= option, for example PROPERTIES='MyProperties', as in this link:
While the link is for the LIBNAME statement, you could try it in a CONNECT statement.
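As a rough sketch of that approach (the library name here is a placeholder I made up, and the server, port, and credentials are just the values from the original post), the property can be supplied through PROPERTIES= on a Hadoop LIBNAME statement:

/* Hypothetical LIBNAME sketch: pass the Hadoop property through PROPERTIES= */
/* myhive is a placeholder libref; substitute your own connection values     */
libname myhive hadoop server="XXXX" port=10039 subprotocol=hive2
        user=&user. password=&password.
        PROPERTIES="mapreduce.map.memory.mb=6000";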
SASKiwi is correct. You'll need to add the mapreduce option using PROPERTIES. See example below:
PROPERTIES="hive.fetch.task.conversion=minimal;hive.fetch.task.conversion .threshold=-1";
i.e.
proc sql;
   connect to hadoop (......   /* connection options as before, plus PROPERTIES= */
                      PROPERTIES="hive.groupby.orderby.position.alias=true");
   create table want as   /* "want" is a placeholder output table name */
   select * from connection to hadoop
   (
      select X_FACILITY_OFFER_CD,
             count(*) as count,
             sum(X_FACILITY_OFFERED_AMT) as X_FACILITY_OFFERED_AMT_SUM,
             sum(X_FACILITY_OFFER_SEQ_NO) as X_FACILITY_OFFER_SEQ_NO_SUM
      from UC5_TEST3A
      group by 1
      order by 1
   );
   disconnect from hadoop;
quit;
The PROPERTIES option can be added either on the LIBNAME statement or on the Hadoop connection string in explicit pass-through, as in your example.
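Applied to the CONNECT statement from the original question, a minimal sketch (keeping the placeholder server and credentials from the post) would be:

proc sql;
   connect to hadoop (user=&user. password=&password. server="XXXX" port=10039
                      subprotocol=hive2 DBMAX_TEXT=80
                      PROPERTIES="mapreduce.map.memory.mb=6000");
   /* ... explicit pass-through query here ... */
   disconnect from hadoop;
quit;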
Also, depending on how the Hadoop environment has been set up, setting the map-task memory from SAS code may have no effect; the property could be locked down by the Hadoop administrator.