Hi,
I am using EG 7.1 and have a URI connection to a Hadoop environment. I am trying to load a CSV file that I uploaded to HDFS through Ambari. The code below runs with no errors, but I don't see the table in my directory. I know how to load the file with PROC HADOOP, but since I am utilizing the URI through libraries I thought I could bypass that. One thing I noticed is that when I run this in DbVisualizer I always set the database first with a 'use' statement. What is the equivalent of that with a SAS/ACCESS URI connection? I am wondering if my problem is that it doesn't know which database to put the table in.
Any help would be appreciated.
Thanks,
Dan
proc sql;
execute (create table bill_component_test(
DIST_ID string,
bill_component string,
statistics_cd string)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde'
WITH SERDEPROPERTIES ("escapeChar"="\\",
                      "quoteChar"="'",
                      "separatorChar"=",")
STORED AS INPUTFORMAT
"org.apache.hadoop.mapred.TextInputFormat"
OUTPUTFORMAT
"org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat"
LOCATION "hdfs://MSNHDPPROD1/data/work/departmental/revenue_forecasting/bill_component");
quit;
I reckon you also need a connection name.
proc sql;
connect ... as HIVE;
execute by HIVE ( ... );
quit;
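Something like the following, for example. This is only a sketch: the server name, port, user, and password are placeholders for whatever your site uses, and the schema name revenue_forecasting is just a guess based on your HDFS path. The schema= option on CONNECT is the SAS/ACCESS equivalent of DbVisualizer's 'use' statement, or you can pass a use statement through explicitly:

proc sql;
   /* Placeholder connection options -- substitute your own Hive server,
      port, schema, and credentials */
   connect to hadoop as HIVE
      (server="myhiveserver" port=10000 schema=revenue_forecasting
       user=myuser password=mypass);

   /* schema= above sets the default database; you can also issue the
      equivalent of DbVisualizer's "use" statement explicitly */
   execute (use revenue_forecasting) by HIVE;

   /* then send the CREATE TABLE from your post through the named connection */
   execute (create table bill_component_test(
      DIST_ID string,
      bill_component string,
      statistics_cd string)
      ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde'
      WITH SERDEPROPERTIES ("escapeChar"="\\",
                            "quoteChar"="'",
                            "separatorChar"=",")
      STORED AS INPUTFORMAT
         "org.apache.hadoop.mapred.TextInputFormat"
      OUTPUTFORMAT
         "org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat"
      LOCATION "hdfs://MSNHDPPROD1/data/work/departmental/revenue_forecasting/bill_component"
   ) by HIVE;

   disconnect from HIVE;
quit;

If you are going through an implicit libname instead, schema= on the LIBNAME statement should do the same job of pointing SAS at the right database.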
Thanks for the reply, I am working on this now with our admins.