I used the following code to pull data from Hadoop into SAS. It took about 6 hours to get 100,000 records with 8,000 columns, which seems very slow. Without the OPTIONS statements it took about 8 hours. Can you please review it and suggest how to speed it up?

options SGIO=yes;
options bufno=2000 bufsize=48K;

libname sastest 'E:\SASMA\SASUserData\User\krishnaramasamy\Hadoop data';

proc sql;
   connect to hadoop
      (user=%LOWCASE(&SYSUSERID.) password="XXXXX" server='YYYYYY'
       uri='jdbc:hive2://YYYYYYY.com:8443/default?hive.server2.transport.mode=http;hive.execution.engine=tez;hive.server2.thrift.http.path=gateway/hdpprod/hive;hive.execution.engine=tez'
       schema=ZZZZZ);

   create table sastest.test as
      select * from connection to hadoop
         ( select * from test limit 100000 );

   disconnect from hadoop;
quit;
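One thing worth trying, as a rough sketch only: most of the elapsed time is likely the volume of data crossing the JDBC/HTTP connection, so pulling fewer columns (if you do not truly need all 8,000) may help more than buffer options. The column names below (col1, col2, col3) are placeholders, not real columns from your table, and the LIBNAME-engine variant assumes your SAS/ACCESS Interface to Hadoop accepts the same connection values used in the CONNECT statement.

/* Sketch 1: explicit pass-through, but list only the columns you need
   instead of SELECT * over 8,000 columns (col1-col3 are placeholders). */
proc sql;
   connect to hadoop
      (user=%LOWCASE(&SYSUSERID.) password="XXXXX" server='YYYYYY'
       schema=ZZZZZ);
   create table sastest.test_subset as
      select * from connection to hadoop
         ( select col1, col2, col3      /* only the required columns */
           from test
           limit 100000 );
   disconnect from hadoop;
quit;

/* Sketch 2: Hadoop LIBNAME engine, letting KEEP= limit the columns
   requested and OBS= limit the rows read on the SAS side. */
libname hdp hadoop server='YYYYYY'
        user="%LOWCASE(&SYSUSERID.)" password="XXXXX" schema=ZZZZZ;

data sastest.test_subset2;
   set hdp.test (obs=100000 keep=col1 col2 col3);
run;

If you genuinely need all 8,000 columns, these sketches will not change much; in that case the bottleneck is the HiveServer2 HTTP transport itself, and it would be worth checking with your Hadoop administrators whether a faster transfer path is available in your environment.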