I am trying to copy data from a LASR server to HDFS as a backup. The data in LASR is a compressed table with around 25 million records.
I am using PROC OLIPHANT for this task, but it is taking a long time, approximately 1.5 hours.
proc oliphant host="&host"
              install="&tkgridinstall";
   add &secdat. path="&hdfspath" replace;
run;
Is there any other efficient way to do this task?
What is &secdat in your example? Is it a table stored on LASR? Are you using the SASIOLA libname engine?
PROC OLIPHANT can load a table from the SAS server (or a table accessible from the SAS server) into SASHDAT. This is not what you need: you want to save an in-memory LASR table (SASIOLA) to disk (SASHDAT). Use PROC IMSTAT instead.
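As a minimal sketch of what I mean (the libref, tag, and HDFS path are placeholders; adjust them to your environment):

```
/* Assumed libref/tag/path - adjust to your site */
libname lasrlib sasiola tag=hps port=&port host="&host" signer="&signer";

proc imstat;
   table lasrlib.&secdat.;            /* reference the in-memory LASR table */
   save path="/hdfs/backup" copies=1; /* write it to HDFS as SASHDAT */
quit;
```

Note that PROC IMSTAT is interactive, so it must be terminated with QUIT, not RUN.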
Can you share the IMSTAT code you have used? Also the libname statement that creates the libraries.
What is the version of SAS you are using? What is the size (in Gbytes) of the table? How many nodes and memory do you have?
Thanks for your response.
A) &secdat is the LASR table, as you guessed, and I am using the SASIOLA engine.
B) Below is the IMSTAT code I used.
C) We are using SAS version 9.4.
The dataset I am trying to back up is a compressed LASR table of 2.59 GB.
In LASR we have 3 nodes with 1.5 TB of memory available, i.e. 500 GB per node.
LIBNAME VALIBLA SASIOLA TAG=HPS PORT=&port HOST="&host" SIGNER="&signer";

proc imstat data=VALIBLA.&dat;
   where account_number is not null;
   save path="/hps" copies=1;
quit;
Are we missing anything in the above code? Additionally:
1) Even when I tried to save a small dataset of 2 MB, it ran for 20 minutes without completing and I had to kill it.
2) Can we execute the above code without the WHERE clause?
Thanks for the response. Since yesterday I have been trying to post a new question but keep getting the error below:
Not allowed to post content more than once every 60 seconds
Hence I am posting the question in this same thread. I am not sure if this is the right way to go about it; any insight would be appreciated.
I have a question about joins on the LASR server. We have copied both tables into the LASR server and are doing an SQL join using the SAS DI Studio Join transformation.
My question is, in this scenario
A) Would Join processing happen in LASR server?
B) If Join happens in Workspace server, then would it copy both tables from LASR to workspace and perform JOIN there?
So would it occupy both disk space and memory, or only memory?
If you use PROC SQL:
B) Yes, the join runs on the Workspace Server. In this case LASR acts just as a simple data provider.
Instead, use the PROC IMSTAT SCHEMA statement.
Or the PROC IMSTAT SCORE statement with a hash object.
Or, if you store the tables in Hive/Impala, use the Hadoop/Impala access engines with PROC SQL.
Or use native Hadoop tools.
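To illustrate the SCHEMA option: it defines a star-schema join executed inside LASR, joining in-memory dimension tables to an in-memory fact table on matching key columns. A rough sketch (the table and key names here are made up; please check the SAS LASR Analytic Server Reference Guide for the exact SCHEMA syntax and options):

```
proc imstat;
   table valibla.orders;                      /* fact table in LASR (hypothetical name) */
   schema accounts (account_key = acct_key);  /* join dimension table on matching keys */
quit;
```

This way the join is performed in LASR memory, without copying the tables down to the Workspace Server.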
We did a small exercise of executing the ETL script and monitoring the work area and resource utilization while the ETL was executing.
In the ETL script, we copy data from SAS to LASR and do the join using the SAS DI transformation.
While doing this, we monitored the SASWORK area to check whether its usage goes up during execution, and we did not see any difference in the available space before and after execution. Is there something we missed capturing during this exercise?
Of course there will be no difference in available disk space before and after the execution, because SAS cleans up after executing the join and uploading the results.
But you also wrote that you were monitoring while the ETL was executing.
It is also possible that you used small datasets, and everything happened in the memory of the SAS Workspace Server.
Some options that can help monitoring:
options sastrace=',,,d' sastraceloc=saslog;
Also check the UTILLOC option. Sometimes it points to a different location than WORK.
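For example, you can print the current WORK and UTILLOC locations to the log and then monitor both directories:

```
/* Show where WORK and utility files are actually written */
proc options option=(work utilloc);
run;
```

If UTILLOC points elsewhere, the sort/join utility files would never show up in SASWORK monitoring.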
Could you attach the code that was generated by DI Studio?