What is the best approach to refresh a very large, stale LASR table (read FACT table) when the source data gets updated? The question is even more important when there is only a narrow downtime window for the update. The standard approach to append/update delta data from the source environment to the LASR table may not help, if you are unable to isolate the changes since the last load.
This post discusses a mechanism for swapping data into a live LASR table from a staged LASR table, minimizing downtime while still doing a complete refresh. The data transfer operation executes in LASR (memory) so it’s super-fast and efficient with minimum down time.
To refresh a stale LASR table from source data quickly, follow these steps:
The above steps can be executed using different SAS statements depending on your preferences and exact situation. You can use PROC IMSTAT, SAS Data Step, or PROC IMXFER to swap data into a live LASR table from staged LASR table. Generally, we recommend IMSTAT but Data Step may be preferred by legacy SAS users and IMXFER is useful if you want to use a separate LASR server for staging purposes.
Using PROC IMSTAT
The following code demonstrates a data swap into a live LASR table using PROC IMSTAT. In this case, the source data is residing in a .sas7bdat table. The same data could have been in a third party database with a view created using multiple tables to generate the flat FACT table. The initial step in the code is to load the 10GB dataset table into LASR staging table. Then, the PROC IMSTAT DELETEROWS statement deletes and purges all rows from target LASR table. Finally, SET statement appends the staging table rows into target LASR table.
In this case, a 10GB (36M rows) table takes only ~ 8 sec to execute PROC IMSTAT statements to refresh the original table !!!
Note: On the SET statement, the source LASR table does not require a SAS LIBNAME reference. BY default, the SET statement looks for the source data LASR table in same LASR server.
Note 2: If you need a partial data refresh to old LASR table, delete the data rows with valid WHERE clause expression and append the new rows from staged table.
Code with PROC IMSTAT:
OPTION SET=GRIDINSTALLLOC="./opt/sas/TKGrid";
OPTION SET=GRIDHOST="sasserver01";
libname lasr1 sasiola port=10010 tag='HPS';
libname ld "/opt/sas/data/AppalachianChocolate_datamart";
* Drop staging table, if there is any ;
%vdb_dt(lasr1.ORDER_FACT_N);
* Load fresh data sets into a staging ORDER_FACT_N LASR table ;
data lasr1.ORDER_FACT_N ;
set ld.ORDER_FACT_10G;
run ;
* Swap/Refresh old LASR table data from staged LASR table ORDER_FACT_N;
proc imstat
data=lasr1.ORDER_FACT_10G;
where 1=1 ;
run;
deleterows / purge;
run;
table lasr1.ORDER_FACT_10G ;
set ORDER_FACT_N / drop ;
run;
Log extract from code execution:
9
10 * Load fresh data sets into a staging ORDER_FACT_N LASR table ;
11 data lasr1.ORDER_FACT_N ;
12 set ld.ORDER_FACT_10G;
13 run ;
NOTE: There were 36000000 observations read from the data set LD.ORDER_FACT_10G.
NOTE: The data set LASR1.ORDER_FACT_N has 36000000 observations and 24 variables.
NOTE: DATA statement used (Total process time):
real time 59.62 seconds
cpu time 17.07 seconds
14
15 * Swap/Refresh old LASR table data from staged LASR table ORDER_FACT_N;
16 proc imstat data=lasr1.ORDER_FACT_10G;
17 where 1=1 ;
18 run;
19 deleterows / purge;
20 run;
NOTE: A WHERE clause remains active from a previous RUN block: '(1)'.
NOTE: The LASR Analytic Server action request for the DELETEROWS statement
completed in 0.159001 seconds.
21
22 table lasr1.ORDER_FACT_10G ;
NOTE: The WHERE statement is cleared when you open a LASR Analytic Server table with the
TABLE statement.
23 set ORDER_FACT_N / drop ;
24 run;
NOTE: The LASR Analytic Server action request for the SET statement completed in 2.442395 seconds.
NOTE: The PROCEDURE IMSTAT printed page 1. NOTE: PROCEDURE IMSTAT used (Total process time):
real time 7.89 seconds
cpu time 2.72 seconds
Using SAS data step
DATA Step can also be used to refresh LASR tables in the same manner. When running SAS data step in LASR server, be sure to include “options dsaccel=any” in code, otherwise it will pull the LASR data to SAS Workspace server to process.
Code with SAS data step:
OPTION SET=GRIDINSTALLLOC="/opt/sas/TKGrid";
OPTION SET=GRIDHOST="sasserver01";
options dsaccel=any msglevel=i;
libname lasr1 sasiola port=10010 tag='HPS';
libname ld "/opt/sas/data/AppalachianChocolate_datamart";
* Drop staging table, if there is any ;
%vdb_dt(lasr1.ORDER_FACT_N);
* Load fresh data sets into a staging ORDER_FACT_N LASR table ;
data lasr1.ORDER_FACT_N ;
set ld.ORDER_FACT_10G;
run ;
* unload original old LASR table from LASR server;
%vdb_dt(lasr1.ORDER_FACT_10G);
* Swap/Refresh old LASR table data from staged LASR table ORDER_FACT_N;
data lasr1.ORDER_FACT_10G ;
set lasr1.ORDER_FACT_N;
run ;
* Drop staging table;
%vdb_dt(lasr1.ORDER_FACT_N);
Using PROC IMXFER
Finally, PROC IMXFER can perform the same logic except that it executes across two LASR servers. When using IMXFER, you write the staging table to a separate LASR server (different port). Using a separate LASR server for staging, can help minimize the impact of your back-end processing from your front-end VA users
Code with PROC IMXFER:
OPTION SET=GRIDINSTALLLOC="/opt/sas/TKGrid";
OPTION SET=GRIDHOST="sasserver01";
libname lasr1 sasiola port=10010 tag='HPS';
libname lasr2 sasiola port=10031 tag='HPS';
libname ld "/opt/sas/data/AppalachianChocolate_datamart";
* Drop staging table, if there is any ;
%vdb_dt(lasr2.ORDER_FACT_N);
* Load fresh data sets into a staging ORDER_FACT_N LASR table differnt LASR server port ;
data lasr2.ORDER_FACT_N ;
set ld.ORDER_FACT_10G;
run ;
* unload original old LASR table from LASR server;
%vdb_dt(lasr1.ORDER_FACT_10G);
* Transfer data from stagged LASR table to old LASR table, both LASR table on different server port ;
proc imxfer;
server s1 host="sasserver01.race.sas.com" port=10010;
server s2 host="sasserver01.race.sas.com" port=10031;
table s2 hps.ORDER_FACT_N s1 hps.ORDER_FACT_10G;
quit;
* Drop staging table from LASR ;
%vdb_dt(lasr2.ORDER_FACT_N);
A Note on Memory Requirements
The staging table load and swap-out process briefly requires twice the normal LASR table memory storage since there will be two copies of the table in memory, the staged one and the one being updated. Once the staging table is dropped by IMSTAT, this requirement goes away.
Reference documents IMSTAT Procedure IMXFER Procedure Data step programming for LASR
Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9.
Lock in the best rate now before the price increases on April 1.
Data Literacy is for all, even absolute beginners. Jump on board with this free e-learning and boost your career prospects.