Hi, I am trying to fill in the blanks on a use case and have some constraints that I'm facing:

> can't introduce any UDF not natively in the Hortonworks offering of Hadoop
> cannot touch the permanent data persistence solution within Hadoop

We are currently storing some JSON objects in HBase, mapped through Hive so that the SAS connector can work directly with the data (reliant on HiveServer2). No problems with data access itself. However, the SAS limitations on Hive tables and PROC SQL mean that any return has to be below 32XXX characters, which is not acceptable for this solution. The JSON reader for SAS can handle the data, but it will need to consume it from a file. The SAS software sits outside of the cluster, not directly on any of the nodes.

The code I have been using is pasted below. It executes fine and, as far as the query goes, creates a comma-separated file to work with. The reader engine I am using needs the entire file to conform to the JSON standard, which would mean adding an opening/closing [ ] so that it appears as an array (SAS JSON Reader 9.4.4).

Is there a way to specify adding these characters in the initial write? If writing the file to HDFS, how would I go about appending (or rather copying/merging) it in the process through the SAS Hadoop connector? If not, is there a good approach to bringing the file local and adding the characters as needed for the reader? (I've put rough sketches of both of these fallback ideas at the bottom of this post.)

Current:
{JSON Data}, {JSON Data}

Desired (not sensitive to line feeds from the reader's perspective):
[{JSON Data}, {JSON Data}]

proc sql;
   connect to hadoop (XXXXXX - connection string information);
   execute (
      INSERT OVERWRITE DIRECTORY '/home/bmiller'
      ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
      SELECT JSON
      from results r
      inner join hbaseresults hb on r.confirmationnumber = hb.key
      where clientcode = &clientcode
        and artifacttype = 'CrfResult'
   ) by hadoop;
   disconnect from hadoop;
quit;

Any help or suggestions would be greatly appreciated.

Thanks,
Ben
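Edit: to make the "bring it local and wrap it" idea concrete, this is roughly the DATA step I have in mind once the extract has been copied down to the SAS machine. The file paths are placeholders. I'm copying byte by byte (RECFM=F, LRECL=1) rather than line by line so the 32,767-character limit on _INFILE_ doesn't truncate anything when an individual JSON object is very long; it's slow for big files but simple, and it passes the data through unchanged.

filename rawjson 'C:\temp\crfresult_raw.txt';   /* local copy of the Hive extract (placeholder path) */
filename wrapped 'C:\temp\crfresult.json';      /* file the JSON reader will consume (placeholder)   */

data _null_;
   infile rawjson recfm=f lrecl=1 end=eof;      /* read the extract one byte per record            */
   file   wrapped recfm=f lrecl=1;              /* write one byte per record, no delimiters added  */
   if _n_ = 1 then put '[';                     /* opening bracket before the first byte           */
   input b $char1.;
   put b $char1.;                               /* copy the byte through unchanged                 */
   if eof then put ']';                         /* closing bracket after the last byte             */
run;

libname crf json fileref=wrapped;               /* point the 9.4M4 JSON engine at the wrapped file */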
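For the other idea (merging/reading straight off HDFS through the SAS Hadoop connector), I was picturing the FILENAME HADOOP access method so the same DATA step can read the INSERT OVERWRITE DIRECTORY output directly. The CFG= file, USER= value, and the CONCAT option (to treat '/home/bmiller' as a directory and read its part files as one stream) are assumptions based on how the access method is set up at our site, so the option names/values may need adjusting.

/* Assumption: the FILENAME HADOOP access method is available and configured here; */
/* cfg file, user, and CONCAT are placeholders/guesses for our environment.        */
filename hdfsdir hadoop '/home/bmiller'
         cfg='C:\hadoop\conf\hadoop-config.xml'  /* Hadoop client config file (placeholder)        */
         user='XXXXXX'                           /* credentials (placeholder)                      */
         concat;                                 /* read all part files in the directory as one    */

The DATA step above would then just use "infile hdfsdir ..." instead of "infile rawjson ...", so the merge of the part files and the bracket-wrapping happen in one pass without having to stage a separate copy first.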