Not getting any nibbles in the VA community, so I thought I'd try here for a design perspective.

 

SAS v9.4 (Linux) & VA v7.3 on a 6-node Cloudera Big Data Appliance (BDA).

 

Looking to better understand the options for loading Hadoop data (Hive) into LASR libraries: Reload-On-Start, AutoLoad, scheduling, etc.

 

Preferences:

  • Stored data is Hive, so that other software (not just SAS VA) can access it.
  • Load the data from disk directly to in-memory LASR, without being piped via an external SAS server running SAS code.
  • Avoid storing a SAS-native copy of the data: SASHDAT (for Reload-On-Start) or SASBDAT (for AutoLoad).

Q's:

  • What Reload/AutoLoad capabilities do we have with Hive data?
  • If we HAVE to make a (SAS-specific) copy of the data, would appreciate recommendations for Reload / AutoLoad.

Thanks.

Accepted Solution
JuanS_OCS
Amethyst | Level 16

Hello @AndrewHowell , how do you do?

 

I saw your question in the VA community but I thought someone wiser than me would answer. I hope you have better luck here!

 

Anyway, probably what I will say won't help you much, but here you go:

 

My understanding is that there is no direct path to what you would like to achieve. Your main bottleneck is the limited set of sources each mechanism supports: Reload-on-Start works with the data provider library (SASHDAT indeed), and Autoload works with Base tables (SASBDAT).

 

Regarding Autoload, your best chance would be to create an NFS interface to Hive, but then you would lose all the performance Hive can give you, and you would still have SASBDAT files, as required by SAS Autoload.

 

All in all, I think you would need a custom nightly SAS process to translate the native Hive data into SASBDAT or SASHDAT, depending on the method you would like to use. And perhaps also from SASBDAT to SASHDAT, if you would like to get the data created by users back into Hive.
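
For what it's worth, a minimal sketch of that nightly copy step might look like the following. It assumes SAS/ACCESS Interface to Hadoop is licensed and that the SASHDAT engine can reach the cluster. The host names, port, schema, user, paths, librefs, and table name are all hypothetical placeholders, not values from your environment.

```sas
/* Sketch only: every host, path, schema, user, and table name below is a placeholder. */

/* Hive source, read through SAS/ACCESS Interface to Hadoop. */
libname hivesrc hadoop server="hive-node.example.com" port=10000
        schema=default user=sasdemo;

/* SASHDAT target in HDFS, for use with Reload-on-Start. */
libname hdat sashdat path="/hps" server="bda-node1.example.com"
        install="/opt/TKGrid";

/* Overwrite last night's SASHDAT copy with the current Hive table. */
data hdat.sales (replace=yes);
   set hivesrc.sales;
run;
```

Note that this still pipes the data through a SAS session, which is exactly what you wanted to avoid, so treat it as the fallback rather than a recommendation.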

 

Is your Cloudera BDA co-located with SAS VA?

 

While Autoload is generally recommended, in your case, especially if you have self-service and your users can create or load data, I would just stick to Reload-on-Start.

 

Does any of the above help you in any way?

 

Best,

Juan

AndrewHowell
Moderator

Hi Juan, hope you are well.

 

Thanks for your response - it confirms what I suspected (but was hoping for something better from someone who knew more than me).

 

I also have a ticket logged with SAS Technical Support, so I'll see if they come up with anything.

 

Regards,

Andrew.

JuanS_OCS
Amethyst | Level 16

Hi Andrew, thank you, I am fine, and I hope you are too.

 

I do hope that someone more knowledgeable can give you a more positive answer!

 

In case you get a better answer from SAS TS, could you please share? 🙂 I always feel eager to learn.

 

Good luck there.

 

Juan

 

 
