
SAS/ACCESS to Hadoop: Hive Partition Tables - SAS 9.4 M3

Contributor
Posts: 39


Does anyone know whether SAS/ACCESS to Hadoop in 9.4 M3 supports creating partitioned tables in Hive in a non-text format, e.g. ORCFile?

 

I've tried to specify that a table be created as an ORCFile using the LIBNAME option in SAS 9.4 M2:

 

DBCREATE_TABLE_OPTS="stored as ORCFile"

 

However, when I add the data set option to create a partitioned table:

 

DBCREATE_TABLE_OPTS="PARTITIONED BY (x_facility_offer_cd VARCHAR(4))"  

 

The resulting HiveQL shows that the table is being created as a TEXTFILE:

 

PARTITIONED BY (x_calling_system_cd VARCHAR(4))  ROW FORMAT DELIMITED FIELDS TERMINATED BY '\001' LINES TERMINATED BY '\012' STORED AS TEXTFILE 

 

Also, it seems that indexes on Hive tables can't be created using SAS/ACCESS to Hadoop in SAS 9.4 M2. Is this fixed in M3?

 

The lack of indexing is a problem because we can't partition ORCFile tables (which are pre-optimised with an internal index), so we're left with unoptimised TEXTFILE tables.

 

Any help / info would be appreciated.

 

Thanks

 

 

David

Super User
Posts: 5,441

Re: SAS/ACCESS to Hadoop: Hive Partition Tables - SAS 9.4 M3

"In the third maintenance release for SAS 9.4, these features are new or enhanced.

  • You can use the new CONFIG= and CONFIGDIR= LIBNAME and data set options to define the name and location of bulk-load configuration files.
  • Support for DBCREATE_TABLE_OPTS= LIBNAME option is new."
     
To confirm your specific case, you might want to ask SAS Technical Support whether this would work once you upgrade to M3.
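
One thing that may be worth trying in the meantime: a data set option generally replaces a LIBNAME option of the same name rather than adding to it, which would explain why the STORED AS clause disappears once the PARTITIONED BY data set option is added. The sketch below puts both clauses into a single DBCREATE_TABLE_OPTS= data set option. It is untested, and the server name, port, schema, libref, and data set names are all hypothetical placeholders; the SASTRACE settings are only there so the generated HiveQL appears in the log for checking.

```sas
/* Write the HiveQL that SAS generates to the SAS log. */
options sastrace=',,,d' sastraceloc=saslog nostsuffix;

/* Hypothetical connection values: replace with your own server, port, and schema. */
libname hdp hadoop server="hive.example.com" port=10000 schema=default;

/* Both clauses go in ONE data set option, on the assumption that a
   data set option overrides the LIBNAME option of the same name. */
data hdp.facility_offers
     (dbcreate_table_opts="PARTITIONED BY (x_facility_offer_cd VARCHAR(4)) STORED AS ORC");
   set work.facility_offers;   /* hypothetical source data set */
run;
```

If the log still shows STORED AS TEXTFILE, that would be a useful detail to pass on to Technical Support.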