Hi,
I am using EG 7.1 and have a URI connection to a Hadoop environment. I am trying to load a CSV file that I uploaded to HDFS through Ambari. The code below runs with no errors, but I don't see the table in my directory. I know how to load using PROC HADOOP, but since I am utilizing the URI through libraries I thought I could bypass that. One thing I noticed is that when I run this in DbVisualizer I always set the database initially with a 'use' statement. What would be the equivalent of this with a SAS/ACCESS URI connection? I am wondering if my problem is that it doesn't know what database to put the table in.
Any help would be appreciated.
Thanks,
Dan
proc sql;
execute (create table bill_component_test(
   DIST_ID string,
   bill_component string,
   statistics_cd string)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde'
WITH SERDEPROPERTIES ("escapeChar"="\\",
   "quoteChar"="'",
   "separatorChar"=",")
STORED AS INPUTFORMAT
   "org.apache.hadoop.mapred.TextInputFormat"
OUTPUTFORMAT
   "org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat"
LOCATION "hdfs://MSNHDPPROD1/data/work/departmental/revenue_forecasting/bill_component");
quit;
I reckon you also need a connection name: EXECUTE statements go through an explicit pass-through connection, named in the BY clause.
proc sql;
connect ... as HIVE;
execute by HIVE ( ... );
quit;
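Something like the untested sketch below. The server name is a placeholder for your HiveServer2 host, and the schema name revenue_forecasting is just a guess based on your HDFS path — substitute your site's values. SERVER=, PORT=, and SCHEMA= are the usual SAS/ACCESS Interface to Hadoop connection options, and you can pass Hive's USE straight through, which is the equivalent of the 'use' you run in DbVisualizer:

proc sql;
   connect to hadoop as hive
      (server="your-hiveserver" port=10000 schema=revenue_forecasting);
   /* equivalent of DbVisualizer's 'use': either the SCHEMA= option
      above, or an explicit USE passed straight to Hive */
   execute (use revenue_forecasting) by hive;
   /* then run your CREATE TABLE as posted, routed through the
      named connection */
   execute (create table bill_component_test(
      DIST_ID string,
      bill_component string,
      statistics_cd string)
   ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde'
   WITH SERDEPROPERTIES ("escapeChar"="\\",
      "quoteChar"="'",
      "separatorChar"=",")
   STORED AS INPUTFORMAT
      "org.apache.hadoop.mapred.TextInputFormat"
   OUTPUTFORMAT
      "org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat"
   LOCATION "hdfs://MSNHDPPROD1/data/work/departmental/revenue_forecasting/bill_component")
   by hive;
   disconnect from hive;
quit;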
Thanks for the reply; I am working on this now with our admins.