rgreen33
Pyrite | Level 9

I am new to the SAS world, and I am working on getting things set up in my production environment.  I have Oracle data that goes into Hadoop and then on to LASR memory.  Obviously, when my LASR server is restarted, the data in memory is lost.  Thus, on restart, I need to reload my data from Hadoop into LASR.  Can someone confirm that I have this process correct, or suggest a better way of doing this?  My process is as follows:

 

Job:  Load data from Oracle to Hadoop

     -  Schedule this job to refresh Hadoop data from Oracle

 

Job:  Load data from Hadoop to LASR

     -  Schedule this job as needed to refresh LASR data from Hadoop (as often as needed, based on changes in Oracle)

     -  Schedule this job to run after LASR start, to load data from Hadoop to LASR after LASR server has been restarted

 

Is my thinking correct?  Any way to do this via AutoLoader?  Any better suggestions?

 

Thanks,

Ricky 

ACCEPTED SOLUTION
JBS_SAS
SAS Employee

I think your process is a good one.  We do something similar in the SAS IT department when we restart VA.  All you need to do is quickly lift the data from HDFS to LASR so that customers can see it again after a restart.  Furthermore, we like this process to run as fast as possible, so we typically disable metadata updates after a restart because the metadata should already be in sync from the previous load.

 


SKG
Obsidian | Level 7
Where is the solution for this question?
Actually, I'm facing the same problem in my SAS VA environment. Can you please help me with the same question?

"How do you load data from Hadoop to LASR automatically after a restart of the LASR server?"
rgreen33
Pyrite | Level 9

@SKG,

 

The solution that I ended up with is not yet "automatic".  Essentially, I have several "pairs" of jobs set up as flows: one job loads HDFS, the other loads LASR.  Then, I have a separate flow that contains all of my LASR load jobs.  This flow gets run manually during the startup process, once Hadoop and LASR are both up.  So, my schedule manager looks similar to the following:

 

(Flow) _Load_LASR_ALL      <----- This is the Flow that I run manually on startup

                 (Job) Table1_Load_to_LASR

                 (Job) Table2_Load_to_LASR

 

(Flow) Load_Table1

                 (Job) Table1_Load_to_HDFS

                 (Job) Table1_Load_to_LASR

 

(Flow) Load_Table2

                 (Job) Table2_Load_to_HDFS

                 (Job) Table2_Load_to_LASR
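
As an illustration only, the layout above can be modeled as plain data, which makes it easy to see that the startup flow is just the LASR half of every per-table pair. This is a sketch in Python, not anything the schedule manager itself consumes; the function names `table_flow` and `startup_flow` are hypothetical, while the job and flow names are taken from the listing above.

```python
# Tables that have a paired HDFS/LASR load (names from the listing above).
TABLES = ["Table1", "Table2"]


def table_flow(table):
    """Jobs in a per-table refresh flow: load HDFS first, then LASR."""
    return [f"{table}_Load_to_HDFS", f"{table}_Load_to_LASR"]


def startup_flow(tables):
    """Jobs in the startup flow: only the LASR loads, one per table."""
    return [f"{table}_Load_to_LASR" for table in tables]


# Build every flow from the single table list so the two stay in sync.
flows = {f"Load_{table}": table_flow(table) for table in TABLES}
flows["_Load_LASR_ALL"] = startup_flow(TABLES)

for name, jobs in flows.items():
    print(name, "->", ", ".join(jobs))
```

Deriving both kinds of flow from one table list means a newly added table cannot be forgotten in the startup flow.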

 

We are in the process of getting some playbooks set up for Ansible.  We hope that we can automate our startup process (including calling the flow or jobs listed above).
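
The thread does not include a playbook, so the following is only a rough sketch of the "wait until Hadoop and LASR are both up, then run the LASR loads" step, written in Python rather than Ansible. The host names, port numbers, and job commands in the usage comment are placeholders, not values from this environment, and how a load job is actually launched (scheduler command, batch script, and so on) is an assumption left to the caller.

```python
import socket
import subprocess
import time


def wait_for_port(host, port, timeout=300.0, interval=5.0):
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)


def run_startup_loads(services, commands, run=subprocess.run, wait=wait_for_port):
    """Wait for every (host, port) service, then run each load command in order.

    Raises RuntimeError if a service never becomes reachable. With the default
    runner, a failing command raises CalledProcessError and stops the sequence.
    """
    for host, port in services:
        if not wait(host, port):
            raise RuntimeError(f"{host}:{port} did not come up")
    for command in commands:
        run(command, check=True)


# Example call (all values hypothetical):
# run_startup_loads(
#     services=[("hadoop-namenode.example.com", 8020), ("lasr.example.com", 10010)],
#     commands=[["/path/to/run_job.sh", "Table1_Load_to_LASR"],
#               ["/path/to/run_job.sh", "Table2_Load_to_LASR"]],
# )
```

A reachable port only shows that a process is listening, not that the service is fully ready, so a real startup script may need a stronger health check before the loads begin.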

 

Hopefully this helps.  If you have any questions, please let me know.

 

Thanks,

Ricky
