Hi,
I am trying to fill in the blanks on a use case and am facing some constraints:
> can't introduce any UDF that isn't part of the native Hortonworks Hadoop offering
> cannot touch the permanent data persistence solution within Hadoop
We are currently storing some JSON objects in HBase, mapped through Hive so that the SAS connector can work directly with the data (reliant on HiveServer2). No problems with data access itself. However, the SAS limitations on Hive tables and PROC SQL mean that anything returned has to be below 32XXX characters, which is not acceptable for this solution. The SAS JSON reader can handle the data, but it needs to consume it from a file. The SAS software sits outside the cluster, not directly on any of the nodes.
The code I have been using is pasted below. It executes fine and, as far as the query goes, creates a comma-separated file to work with. The reader engine I am using needs the entire file to conform to the JSON standard, which means adding an opening/closing [ ] so the content appears as an array (SAS JSON Reader 9.4.4).
Is there a way to add these characters as part of the initial write? If writing the file to HDFS, how would I go about appending them (or rather copying/merging) in the process through the SAS Hadoop connector?
If not, is there a good approach to bringing the file local and adding the characters as needed for the reader? A rough sketch of what I'm picturing is below the Current/Desired example.
Current:
{JSON Data},
{JSON Data}
Desired (not sensitive to line feeds from reader perspective):
[{JSON Data},
{JSON Data}]
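For the "bring it local" route, this untested sketch is roughly what I had in mind; the filerefs and paths are placeholders, not the real locations, so treat it as an illustration only:

/* Sketch only: wrap an already-local copy of the extract in [ ] for the JSON engine. */
filename rawjson '/sasdata/results_raw.txt';      /* local copy of the comma-separated JSON lines */
filename wrapped '/sasdata/results_array.json';   /* file the JSON engine would actually read     */

data _null_;
   infile rawjson end=last lrecl=1000000 truncover;
   file wrapped lrecl=1000000;
   input;
   if _n_ = 1 then put '[';   /* opening bracket before the first object (extra line feed is fine) */
   put _infile_;              /* pass each JSON object through unchanged                           */
   if last then put ']';      /* closing bracket after the last object                             */
run;

libname res json fileref=wrapped;   /* 9.4 JSON engine; a physical path should work here too */

For reference, the pass-through extract I'm currently running is: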
proc sql;
   connect to hadoop (XXXXXX - connection string information);
   /* write the query result out to HDFS as a comma-separated file */
   execute (INSERT OVERWRITE DIRECTORY '/home/bmiller'
            ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
            SELECT JSON
            from   results r
                   inner join hbaseresults hb
                   on r.confirmationnumber = hb.key
            where  clientcode = &clientcode and
                   artifacttype = 'CrfResult') by hadoop;
   disconnect from hadoop;
quit;
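On the Hadoop connector side, the only handle I've found so far is the FILENAME HADOOP access method. Again an untested sketch, with the config path and part-file name as placeholders, just to copy the extract down to the SAS server where the wrapping step above could run:

/* Sketch only: pull one part file of the INSERT OVERWRITE output down from HDFS. */
/* If the directory holds several part files, each would need to be read,         */
/* or merged first on the cluster side (e.g. hadoop fs -getmerge).                */
filename hdfsin hadoop '/home/bmiller/000000_0'        /* one part file from the extract dir */
                cfg='/opt/sas/conf/hadoop_cfg.xml';    /* client-side Hadoop config file     */
filename locraw '/sasdata/results_raw.txt';            /* local landing spot for the copy    */

data _null_;
   infile hdfsin lrecl=1000000 truncover;
   file locraw lrecl=1000000;
   input;
   put _infile_;
run;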
Any help or suggestions would be greatly appreciated.
Thanks,
Ben
Hi @miller3b
I have no experience dealing with Hive going to HBase. The 32k issue is a known problem with Hive. This GitHub repository may help you.
https://github.com/Jeff-Bailey/SGF2016_SAS3880_Insiders_Guide_Hadoop_HOW
Take a look at the section on the 32k String Thing; it's Exercise 5.
Best wishes,
Jeff
Excellent. Thank you, Jeff. I'm sure there are some other tricks in here that I need as well.
Hi miller3b,
I'm glad you found some useful info! If one of the replies was the exact solution to your problem, can you "Accept it as a solution"? Or if one was particularly helpful, feel free to "Like" it. This will help other community members who may run into the same issue know what worked.
Thanks!
Anna