Hi,
I could not find an existing post on this question, but would welcome any pointers if one already exists.
We have just started using VA (7.4) for some of our reporting requirements, and have so far used different methods to load our data into the LASR server. Each of these has some benefits and drawbacks, but I was wondering whether people have any best practices (practical examples, or just theory to start with) for how to do this.
Main problems:
Options we have considered:
Any advice would be very welcome, as would any links to other ideas or practical solutions!
Thanks!
Hello @DominicRehn,
As a default suggestion, I highly recommend the AutoLoad functionality, which is quite useful. It restarts the LASR server if it is stopped and loads or updates the data as required. I am not 100% sure whether AutoLoad works for distributed LASR nodes.
I think the ETL process is also a good option, especially when you have distributed LASR, because it lets you choose where to take the data from, tune performance, and control many other options. In addition, you can plug it into your current ETLs.
This approach requires a separate script on your LASR nodes to start LASR before the ETL load starts; after that you are good to go.
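To illustrate the "start LASR before the ETL load" step, here is a minimal sketch of a wrapper script, not a definitive implementation. It assumes the server counts as "up" when its TCP port accepts connections. The host, port, `start_cmd` and `load_cmd` are hypothetical placeholders for whatever start script and ETL job your site actually uses.

```python
import socket
import subprocess
import time


def port_is_open(host, port, timeout=2.0):
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def wait_for_port(host, port, attempts=10, delay=1.0):
    """Poll host:port until it accepts connections or attempts run out."""
    for _ in range(attempts):
        if port_is_open(host, port):
            return True
        time.sleep(delay)
    return False


def ensure_started_then_load(host, port, start_cmd, load_cmd,
                             runner=subprocess.run):
    """Run start_cmd only if the port is closed, then run load_cmd.

    start_cmd and load_cmd are argument lists for site-specific scripts
    (hypothetical here). Returns the commands that were run, in order.
    """
    ran = []
    if not port_is_open(host, port):
        runner(start_cmd, check=True)
        ran.append(start_cmd)
        if not wait_for_port(host, port):
            raise RuntimeError(f"server on {host}:{port} did not come up")
    runner(load_cmd, check=True)
    ran.append(load_cmd)
    return ran
```

A scheduler could call `ensure_started_then_load("lasr-host", 12345, ["/path/to/start_lasr.sh"], ["/path/to/run_etl_load.sh"])` as the first step of the nightly flow; all four arguments in that call are made-up examples.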
I would advise against manual loads as much as possible. They are fine when you have a few tables, but become chaotic once you have 50+ tables to load manually. Reload-on-start is still a good idea, but you may lose control in some scenarios.
I would also advise against scheduling Data Builder queries: a VA server should not be used as an ETL server. That is not its purpose, and doing so has an impact on the server. Data Builder queries are useful only for small processes and for preparing data for data explorations (a couple of users), not for daily operations.
Manually load tables through VA Administrator
You can also enable reload-on-start, so that tables loaded manually are reloaded when the LASR server is restarted. Note that not all tables are eligible for reload-on-start; see the VA Administrator Guide for details.
Use an autoload library
I suggest using autoloading functionality.