Hi,
I am using EG 7.1 and have a URI connection to a Hadoop environment. I am trying to load a CSV file that I uploaded to HDFS through Ambari. The code below runs with no errors, but I don't see the table in my directory. I know how to load the file using PROC HADOOP, but since I am utilizing the URI through libraries I thought I could bypass that. One thing I noticed is that when I run this in DbVisualizer I always set the database first with a USE statement. What is the equivalent of that with a SAS/ACCESS URI connection? I am wondering if my problem is that it doesn't know which database to put the table in.
Any help would be appreciated.
Thanks,
Dan
proc sql;
execute (create table bill_component_test(
    DIST_ID string,
    bill_component string,
    statistics_cd string)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde'
WITH SERDEPROPERTIES ("escapeChar"="\\",
    "quoteChar"="'",
    "separatorChar"=",")
STORED AS INPUTFORMAT
    "org.apache.hadoop.mapred.TextInputFormat"
OUTPUTFORMAT
    "org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat"
LOCATION "hdfs://MSNHDPPROD1/data/work/departmental/revenue_forecasting/bill_component");
quit;
I reckon you also need a connection name.
proc sql;
connect ... as HIVE;
execute by HIVE ( ... );
quit;
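Putting that together, here is a minimal sketch of what the explicit pass-through might look like. The server, port, and schema names below are placeholders, and this assumes SAS/ACCESS Interface to Hadoop; the SCHEMA= connection option (or passing USE through to Hive) plays the role of the USE statement you run in DbVisualizer:

proc sql;
  /* placeholder server/port/schema - replace with your site's values */
  connect to hadoop as HIVE (server="myserver" port=10000 schema=revenue_forecasting);
  /* alternatively, pass USE straight through to Hive: */
  execute (use revenue_forecasting) by HIVE;
  execute (create table bill_component_test( /* ... column and SERDE clauses as above ... */ )) by HIVE;
  disconnect from HIVE;
quit;

Without the EXECUTE BY pointing at a named connection, the statement has no Hive session to run in, which would explain the table never appearing.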
Thanks for the reply, I am working on this now with our admins.