nikhil_khanolkar
Calcite | Level 5

Hi,

 

As part of a requirement, we need to copy tables from the LASR server (in-memory data) to the SAS server and store them as physical tables. We need to do this multiple times a day, and the data size is around 900 GB.

 

What is the most efficient way to perform this operation?

 

Any suggestions..?

 

Thanks,

Nikhil

9 REPLIES 9
Kurt_Bremser
Super User

AFAIK, the LASR server is represented to Base SAS as a library with its own special engine, so a simple DATA step should suffice.

The bottleneck will most probably be the write performance of your standard SAS server and the connection between the LASR and the standard server.

Depending on the data structure, compressing the target data set might (even dramatically) improve your performance.
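
A minimal sketch of what such a DATA step could look like, assuming the LASR server is reached through the SASIOLA engine. The host, port, tag, target path, and table name below are placeholders, not values from this thread.

```sas
/* Assumed connection details: replace host, port and tag with your own. */
libname lasrlib sasiola host="lasr.example.com" port=10010 tag="hps";

/* Assumed target directory on the SAS server. */
libname backup "/sasdata/lasr_backup";

/* Copy the in-memory table to a compressed Base SAS data set. */
data backup.big_table (compress=yes);
    set lasrlib.big_table;
run;
```

COMPRESS=BINARY may suit long, mostly numeric rows better than COMPRESS=YES, so it is worth timing both on a sample of the real table.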

LinusH
Tourmaline | Level 20
Sounds like an odd data flow direction from a data management point of view.
A natural flow imo would be data source -> SAS Server (ETL) -> LASR co-located storage -> upload to LASR.

Are you using a distributed VA environment?
What are your requirements, and why have you decided to store/load data this way?
Data never sleeps
nikhil_khanolkar
Calcite | Level 5

Hi Linus,

 

We are trying to leverage the reload-at-start functionality, and hence want to copy/back up data from LASR to the SAS server (i.e. co-located storage) so it can be loaded back into LASR memory when the server restarts.

 

A few facts for your reference.

 

a) The LASR table is updated multiple times a day with the security/authorization rules.

b) Because of infrastructure challenges, we decided at the start of the project to use the flow Data Server --> SAS ETL --> Upload to LASR.

c) We are now close to completion, and redesigning the solution would involve a lot of effort, so we are planning to back up data from LASR to the SAS server (i.e. co-located storage) to leverage reload at start.

 

Using a DATA step might be slower.

 

And FTP would not work, I suppose, since we are copying data from memory to a physical server?

 

Thanks,

Nikhil

 

Kurt_Bremser
Super User

A DATA step is usually the fastest method of moving data within SAS, especially when it involves a transformation (here from LASR in-memory storage back to a Base SAS data set). Your limits will not be found within SAS, but in the I/O subsystem of your server(s) and the network.

 

What did you already try, and what was the result?

LinusH
Tourmaline | Level 20

Maybe I'm missing a piece here, but wouldn't a synchronization between LASR and the co-located storage be by far the most efficient routine?

Also, I'm a bit concerned that the LASR data is being updated frequently. Usually, LASR data is a basis for analytics, not a database that should host frequent updates. What is your master data store? What kind of data and updates are we talking about?

 

The DATA step may be one of the fastest ways of processing data (in a single-threaded environment, outside LASR/VA). But the trick is to minimize movement between MPP and SMP environments. There is a reason why SAS has invested in MPP: the huge data volumes that it can host.

Data never sleeps
nikhil_khanolkar
Calcite | Level 5

Hi,


Our master data store is SAS. Account-level user authorization information is processed on this data, and the result is hosted in LASR. Row-level security is applied in LASR for use by SAS VA dashboards. The LASR updates we mentioned are changes in the access/authorization information.


After applying the user authorization information, the size of the data increases exponentially. As mentioned earlier, we opted for this routine because of the infrastructure challenges, and unfortunately we cannot use the out-of-the-box synchronization unless this data is copied back to a physical location in SAS.

Hence we are looking for suggestions on the most efficient way to do this. We have not tried a DATA step yet, so we are not sure about the performance.


Nikhil

LinusH
Tourmaline | Level 20

Sorry, my brain is kinda slow sometimes, so just to see if I understand you correctly:

  • You update an authorization table frequently in SAS, outside VA
  • You reload this table together with other data that uses the authorization, which creates products (in joins)
  • You are using Data Builder to do this?

So far, it seems that the co-located storage is out of the picture.

But still, if the reason is just to quickly reload the latest version of all data, saving the current data in LASR to the co-located storage should be the fastest way to get things up and running.

 

Can't you use the built-in record-level authorization within VA, in conjunction with star schemas?

If so, you just reload the authorization table. The analysis data, which is much smaller than the current structure, can be quickly loaded separately (from the co-located storage 🙂  )

Data never sleeps
nikhil_khanolkar
Calcite | Level 5

Hi Linus,

 

That was a joke right..:) If your brain is slow (even sometimes) in the SAS DWH/BI area then god knows if someone like me even has a brain..;)

 

Your understanding is correct in the 3 bullet points you mentioned. Just one correction: we are not using Data Builder. We are using SAS DI/ETL for this processing.

 

In a small POC we did, we noticed that the star schema approach impacted end-user performance, so we did not use it.

We are using the built-in record-level authorization within VA. That's the reason the data size in LASR has increased: we ended up storing one record per user based on the access rights.

 

One question: I am assuming that by co-located storage you meant a Base SAS location? If yes, then we are planning to store/back up the LASR data to this location, and are looking for suggestions on the most efficient way to do this.

 

Or am I missing your point here? 🙂

 

Nikhil

LinusH
Tourmaline | Level 20

Co-located storage is only relevant if you have a distributed VA/LASR environment. So I take it from your answers that you are on a single node, right?

If so, yes, "SAS" is your data storage.

I have no experience in how/when to synchronize this kind of set-up. If you are using Base SAS for your main "off line" storage, make sure that it has as fast I/O as possible. You could use SPDE, which could speed up read access (and so speed up the load to LASR).
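
A rough sketch of the SPDE idea, assuming two separate file systems are available for the data partitions and that the LASR server is reachable through the SASIOLA engine. All paths, the host, port, tag, table name, and the partition size are placeholders to be tuned, not values from this thread.

```sas
/* Assumed LASR connection details: replace with your own. */
libname lasrlib sasiola host="lasr.example.com" port=10010 tag="hps";

/* Assumed paths: metadata in one location, data partitions spread over two disks. */
libname spdelib spde "/sasdata/spde/meta"
    datapath=("/disk1/spde/data" "/disk2/spde/data")
    partsize=1g;

/* Write the in-memory table to the SPDE library. */
data spdelib.big_table;
    set lasrlib.big_table;
run;
```

Spreading the partitions over independent disks is what lets SPDE read in parallel, so the gain depends on the I/O layout behind those paths.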

Data never sleeps
