I would like to write a SAS table to Hadoop. How do I do that? My current code:
proc sql;
connect to hadoop (user=john password=xxxxx server="myhadoopmachine.public.com" port=10000 SCHEMA=hiveSchema subprotocol=hive2 dbmax_text=300 CFG="/opt/sas/myhadoopmachine/myhadoopmachine_core_hdfs_site.xml");
execute ( create table hivelib.mytable as
select *
from saslib.mytable
) by hadoop ;
This results in the error:
ERROR: Execute error: Error while compiling statement: FAILED: SemanticException [Error 10001]: Line 1:49 Table not found 'mytable'
The basic issue is that the execute statement can only run SQL commands on Hive. The code above therefore expects saslib.mytable to exist on the Hadoop server, under a schema named saslib. But saslib is my SAS library, and mytable is a SAS data set in it.
How do I pass this information in the above PROC SQL so that the SAS data set is copied to Hadoop?
Thanks,
John
John,
It doesn't look like the procedure in your example is complete. See this paper on techniques for processing data in Hadoop; it has a few examples that use PROC SQL:
https://support.sas.com/resources/papers/proceedings14/SAS033-2014.pdf
It includes this sample code, which writes to Hadoop using PROC SQL and might help:
proc sql; connect to hadoop (server=duped user=myUserID);
execute (create table myUserID_store_cnt row format delimited fields terminated by '\001' stored as textfile as select customer_rk, count(*) as total_orders from order_fact group by customer_rk) by hadoop;
disconnect from hadoop;
quit;
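Note that the sample above only reads from a table that already lives in Hive, because everything inside execute ( ... ) by hadoop runs on the Hive side and cannot see a SAS library. To copy a SAS data set, one common approach is to assign a libref with the Hadoop LIBNAME engine and let SAS move the rows. The following is a minimal sketch, not tested against your cluster. It assumes SAS/ACCESS Interface to Hadoop is licensed, that the connection options from your connect statement are also accepted on a LIBNAME statement at your site, and it reuses hivelib as the libref name only for illustration.

```sas
/* Assumption: these are the same connection options used in the      */
/* question's CONNECT statement; adjust them if your LIBNAME setup    */
/* differs. The libref name hivelib is only an illustration.          */
libname hivelib hadoop user=john password=xxxxx
    server="myhadoopmachine.public.com" port=10000
    schema=hiveSchema subprotocol=hive2 dbmax_text=300
    cfg="/opt/sas/myhadoopmachine/myhadoopmachine_core_hdfs_site.xml";

/* No CONNECT/EXECUTE here: SAS reads saslib.mytable itself and       */
/* loads the rows into a new Hive table in the schema above.          */
proc sql;
    create table hivelib.mytable as
    select *
    from saslib.mytable;
quit;

/* Release the connection when done. */
libname hivelib clear;
```

A DATA step (data hivelib.mytable; set saslib.mytable; run;) should work the same way once the libref is assigned. Either form fails if mytable already exists in the Hive schema, so drop the old table first if you rerun it.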