Hi,
I have data in SAS that gets updated at a regular interval (every hour). I want to apply those updates from the SAS server to the datasets in HDFS and on the LASR server.
Any help would be appreciated.
Thanks,
Nikhil
Hi,
There are many ways to accomplish this.
By "dataset in HDFS", do you mean your data is in SASHDAT format?
What do you mean by "updated"? Do you have new rows, are old rows changing, or is the whole dataset completely new?
One example to add (append) observations to an in-memory (LASR) table:
data LASRLib.hdatFile(append=yes);
    set sourceLib.sourceFile;
    /* Put a WHERE statement here that identifies the new observations. For example: &lastLoadTime. <= timeStamp */
    where recordStatus='NEW';
run;
After that you can reload the table that is in the LASR memory from the SASHDAT table.
After that you can synchronize the SASHDAT table.
But you could update the rows directly in the LASR memory table (using PROC IMSTAT), then synchronize the associated SASHDAT table.
Message was edited by: Gergely Bathó
I was mixing up the table options for SASHDAT and SASIOLA. Currently there is NO append=yes table option for the SASHDAT engine.
Hi Gergely,
Thanks for responding.
A) Yes, the dataset is in SASHDAT format.
B) By "update" I mean that old rows are changing.
For example, Account is the key variable. We want to update the other columns by joining the SAS and HDFS tables on that key,
like a PROC SQL UPDATE.
Nikhil
Do you have an associated in-memory table?
First I would update the in-memory table, then synchronize the SASHDAT table (if needed).
The first task can be done with the PROC IMSTAT UPDATE statement:
SAS(R) LASR(TM) Analytic Server 2.2: Reference Guide
The second task can be done with the PROC IMSTAT SAVE statement:
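A minimal sketch of the two steps, to make the idea concrete. Everything named here is hypothetical (the libref lasrLib, the table accounts, the columns account and balance, the key value, and the HDFS path), and the SAVE options should be checked against the Reference Guide for your release before relying on them.

```sas
/* Step 1: update matching rows in the in-memory table.          */
/* lasrLib, accounts, account and balance are hypothetical names. */
proc imstat;
    table lasrLib.accounts;       /* the in-memory table to update        */
    where account = 'A1001';      /* limit the update to the matching key */
    update balance = 2500;        /* assign the new value to those rows   */
run;
quit;

/* Step 2: write the in-memory table back to HDFS as SASHDAT.    */
/* The path is a placeholder; REPLACE overwrites an existing file. */
proc imstat;
    table lasrLib.accounts;
    save path="/hps" replace;
run;
quit;
```

The save runs in a separate PROC IMSTAT step here so that the WHERE clause from the update step cannot carry over to it. Whether that separation is strictly needed is something I have not verified.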
Hi,
I have the same problem, but I don't understand how to solve it.
I have table1, which is in memory, and table2, which is the table to use for the update.
table2 can contain rows that already exist in table1 (update) or new rows (append).
How can I use the PROC IMSTAT UPDATE statement in this case?
Thank you!
To be more precise:
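/* Target table: 10 rows keyed by ID */
data table1;
    input ID $ 1-2 data $ 3-10;
    /* ID is character, so SAS converts it to numeric implicitly here (a note is written to the log) */
    ID_2=ID+1;
datalines;
1 sysdate
2 sysdate
3 sysdate
4 sysdate
5 sysdate
6 sysdate
7 sysdate
8 sysdate
9 sysdate
10 sysdate
;
run;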
/* Update table: new values for IDs 1 to 3 */
data table2;
    input ID $ 1-2 data $ 3-10;
    ID_2=ID+20;
datalines;
1 sysdate1
2 sysdate2
3 sysdate3
;
run;
After loading the tables into memory (we are in a distributed environment), I used this program:
proc imstat data=valibr.table1;
    update data=valibr.table2;
run;
    fetch / format;
quit;
Unfortunately, it doesn't work. Every row of table1 ends up with the values from the last row of table2. The result is:
Rows from the table HPS.TABLE1
ID | data | ID_2
3 | sysdate3 | 23
3 | sysdate3 | 23
3 | sysdate3 | 23
3 | sysdate3 | 23
3 | sysdate3 | 23
3 | sysdate3 | 23
3 | sysdate3 | 23
3 | sysdate3 | 23
3 | sysdate3 | 23
3 | sysdate3 | 23
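The output above is consistent with each row of table2 being applied to every row of table1, since no WHERE clause restricts the update. As a point of comparison, here is a minimal sketch of the update-or-append behaviour described in the question. It uses the Base SAS DATA step UPDATE statement instead of PROC IMSTAT, so it is a different technique from the one asked about. It assumes table1 and table2 exist as ordinary SAS datasets in WORK, and table1_updated is a hypothetical output name.

```sas
/* The DATA step UPDATE statement needs both inputs sorted by the key. */
proc sort data=table1; by ID; run;
proc sort data=table2; by ID; run;

/* table1 is the master, table2 holds the transactions.               */
/* Rows whose ID exists in both get the values from table2;           */
/* rows whose ID exists only in table2 are added as new rows.         */
data table1_updated;   /* hypothetical output name */
    update table1 table2;
    by ID;
run;
```

With the sample data, IDs 1 to 3 would take their data and ID_2 values from table2 and the other seven rows would stay as they are. By default, missing values in table2 do not overwrite existing values in table1. The result would then have to be loaded back into memory the same way the tables were loaded originally.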