Evgeny_
Fluorite | Level 6

Hi All,

 

We have two separate environments (both SAS 9.4). In one environment (on-premise Linux) we have all our flows, DI jobs, and therefore all warehouse tables. In the other environment (multiple AWS instances) we just run our distributed VA (7.3). Currently, we zip up our warehouse tables, push them to the VA server, unpack them, and autoload them to LASR. This involves scheduling a number of scripts and also requires enough space to drop the archive, which raises our AWS cost.

 

We would like to write from DI Studio jobs directly to LASR; however, our VA environment does not have SAS/CONNECT. A suggestion was to define a LASR libname and run a data step:

 

data lasr_lib.table_name; 
    set source_lib.table_name;
run;

 

That approach did not work: the table did not load to LASR (and it also had to be unloaded beforehand). However, the above piece of code works with the (append=yes) option. Hence, I don't have to unload the table; I just purge all the records from the LASR table and append fresh data. However, I am not sure how APPEND will perform for huge data files.
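For reference, a minimal sketch of the purge-and-append pattern described above. The host name, port, tag, and library/table names are placeholders, not values from this thread, and the PROC IMSTAT step assumes the in-memory table already exists:

```sas
/* Assumed remote LASR connection; host, port, and tag are placeholders */
libname lasr_lib sasiola host="va-server.example.com" port=10010 tag="hps";

/* Purge all existing rows from the in-memory table */
proc imstat;
    table lasr_lib.table_name;
    deleterows / purge;
quit;

/* Append the fresh data without unloading the table first */
data lasr_lib.table_name (append=yes);
    set source_lib.table_name;
run;
```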

 

 

My question is: what is the best way to push tables to VA LASR when it is a separate environment? Maybe push them to VA Hadoop (HDFS) first and then load locally to LASR?

 

Thanks!

5 REPLIES
alexal
SAS Employee

@Evgeny_,

 

You need to use the SAS LASR Analytic Server Access Tools. Beginning with the third maintenance release for SAS® 9.4, SAS® Integration Technologies includes the SAS LASR Analytic Server Access Tools. The SAS LASR Analytic Server Access Tools include two engines: the SASIOLA engine and the SASHDAT engine. These engines make it possible to copy data from an environment without a SAS LASR Analytic Server to a remote SAS LASR Analytic Server or Hadoop Distributed File System (HDFS).
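A rough sketch of how the two engines might be assigned from the environment without a LASR server; the host names, port, install location, and HDFS path below are assumptions for illustration:

```sas
/* SASIOLA engine: write directly to a remote SAS LASR Analytic Server */
libname lasr sasiola host="lasr-head.example.com" port=10010 tag="hps";

/* SASHDAT engine: write to the remote HDFS instead */
libname hdat sashdat host="hadoop-namenode.example.com"
                     install="/opt/TKGrid" path="/user/lasr";

data hdat.table_name;
    set warehouse.table_name;
run;
```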

 

SAS Usage Note 56996: Tips for using the SAS® LASR™ Analytic Server Access Tools

 

Let me know if you have any questions.

Evgeny_
Fluorite | Level 6

@alexal,

 

I guess SASHDAT might be a possible solution for me. However, when trying to run the libname statement, it throws an error:

 
ERROR: Failed to enumerate available compute nodes in the distributed computing environment. Make sure that the host and install location are specified properly and that you can make a connection via passwordless ssh to the host machine.
 
But if I run the same libname locally in VA, it works. Do I have to open any specific port on the VA side? Passwordless SSH is already established between the two nodes, and I can ssh and telnet to port 22.
 
Thanks!
alexal
SAS Employee

@Evgeny_,

 

That's right, you have to configure passwordless SSH between these environments and specify the user name and SSH key with the TKSSH_USER and TKSSH_IDENTITY options.
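A sketch of what setting those options might look like before assigning the libname; the user name, key path, and libname values are placeholders:

```sas
/* SSH user and private key used by the LASR access tools */
options set=TKSSH_USER="sasuser";
options set=TKSSH_IDENTITY="/home/sasuser/.ssh/id_rsa";

/* Then assign the remote engine libname as usual */
libname hdat sashdat host="hadoop-namenode.example.com"
                     install="/opt/TKGrid" path="/user/lasr";
```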

SASKiwi
PROC Star

If @alexal's suggestions won't work for you because you are on an earlier SAS 9.4 maintenance level, there is another option: we were able to negotiate with SAS a free, limited SAS/CONNECT license to solve this problem. We are on SAS 9.4M2 connecting to SAS VA 7.3 (SAS 9.4M3).

 

SAS/CONNECT works brilliantly to enable end-to-end loading of VA from our primary SAS environment all in a single job.

ThomasPalm
Obsidian | Level 7

One simple thing to try is to use the COMPRESS option on your dataset from DI.

You don't have to uncompress it before the load.
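A sketch of what that could look like in the DI job that writes the warehouse table; the library and table names are placeholders:

```sas
/* Write the table compressed so no separate zip/unzip step is needed */
data warehouse.table_name (compress=yes);
    set work.staged_table;
run;
```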

 


