Hello,
I need to load a table from WORK into the Public caslib and promote it to global scope for reporting with Visual Analytics.
I'm trying to use this code:
proc cas;
   session casauto;
   table.loadTable /
      path="tab_test.sas7bdat"
      casOut={name="tab_test" caslib="public" indexVars={"my_id"}
              replication=0 promote=true};
   table.tableDetails / table="tab_test" caslib="public";
run;
When I try to load the table, I get this error:
This error indicates that the file tab_test.sas7bdat does not exist in the path of your current CAS library. You can use the table.fileInfo action to check which files exist in your CAS library.
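For example, a quick check could look like this (the caslib name, casuser, is just a placeholder):

```sas
/* List the files available in the caslib's data source path, */
/* so you can verify the .sas7bdat file is actually there.    */
proc cas;
   table.fileInfo / caslib="casuser";
quit;
```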
Sorry for the delay.
The problem is that the table I want to load isn't in a CAS library, but in the WORK library.
I need PROC CAS to create an index.
Could I load the table and create the index afterwards?
I also need to promote the table for Visual Analytics.
To load a SAS data set from a SAS library you can use, for instance, PROC CASUTIL with the LOAD statement. Note that indexes defined on the SAS data set are not automatically created on the CAS table.
See here for more information on indexes in CAS: https://go.documentation.sas.com/?docsetId=caspg&docsetVersion=3.4&docsetTarget=p1xt9526uq5etwn1vmnk...
I suggest trying it first without an index.
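A minimal sketch of that approach, using the table name from your post (the library and caslib names are assumptions):

```sas
/* Load a WORK data set into the Public caslib and promote it */
/* to global scope so Visual Analytics can see it.            */
proc casutil;
   load data=work.tab_test
        outcaslib="public" casout="tab_test"
        promote;
quit;
```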
Hello!
Yes, I tried to apply an index, but it seems that I must keep two tables in memory: the input table and the output table (with the index).
This is a problem because the table is very large and I don't have enough space to store two tables in memory.
The index is needed to speed up PROC CAS with the "update" action, because I use a WHERE clause to find which items to update.
Outside of PROC CAS I haven't found any way to update a table promoted in CAS.
Is there a better way?
I would choose one of the methods listed in the documentation where you create the index while loading from a file in a CAS library.
Tell us a bit more about your use case: where does the data come from, what happens to the data before it is loaded into CAS, and why does it need to be updated frequently?
OK, here is the use case:
I have a large table in CAS (300,000,000 rows).
I need to update it with a transactional data set.
To do this I wrote a program: new rows are appended, and the rows that need updating are handled by a DATA step that reads the transactional data set and, for each row, does a CALL EXECUTE that generates a PROC CAS step with an update action.
Doing this with an index would be better for performance.
A program scheduled once a night drops the master table and recreates it.
The problem is creating it without an input table in CAS (in RAM).
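For context, the CALL EXECUTE pattern described above could be sketched like this; the transactional data set (work.trans) and the column names (my_id, amount) are hypothetical stand-ins, not from the original program:

```sas
/* For each transactional row, generate a PROC CAS step that  */
/* runs a table.update action against the promoted CAS table. */
/* work.trans, my_id and amount are placeholder names.        */
data _null_;
   set work.trans;
   call execute(cats(
      'proc cas; table.update /',
      ' table={caslib="public", name="tab_test",',
      ' where="my_id=', my_id, '"}',
      ' set={{var="amount", value="', amount, '"}};',
      ' quit;'));
run;
```

Each update scans the table for the WHERE condition, which is why an index on my_id matters for performance.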
So here is what I understood:
- We have a table, 300,000,000 obs, I guess something like 70-100 GB. Where is the original data coming from? Is there any processing before loading to CAS?
- This table is loaded from a SAS data set to a CAS table every night. An index should be created while loading.
- During the day, the CAS table gets new data, either appended or updated; the input is again a SAS data set. How many update operations?
- At the end of the day the processing starts over. Will the CAS table simply be deleted?
If your SAS data set is in a directory accessible as a path-based CAS library, you can use the table.loadTable action and it will load the data and build the index at the same time.
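A hedged sketch of that approach; the directory path and the caslib name (staging) are assumptions:

```sas
/* Make the directory holding the .sas7bdat file visible as a */
/* path-based caslib, then load, build the index, and promote */
/* in one step. The path below is a placeholder.              */
caslib staging datasource=(srctype="path") path="/data/etl_out";

proc cas;
   table.loadTable /
      caslib="staging"
      path="tab_test.sas7bdat"
      casOut={name="tab_test" caslib="public"
              indexVars={"my_id"} promote=true};
quit;
```

This avoids ever holding two copies in memory: the source stays on disk and only the indexed result is loaded into CAS.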
Yes,
the table is large: the forecast is up to 300 GB.
The process is an ETL that extracts and transforms data from Oracle; the ETL is scheduled about every 5 minutes.
Every 5 minutes it needs to update only some rows and append the new ones.
The input is a SAS data set.
How many update operations? Close to 15,000 updates every 5 minutes.
At the end of the day the entire table must be refreshed (drop and recreate).
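The nightly drop-and-recreate could then be sketched as follows, assuming a path-based caslib (here called staging, a hypothetical name) pointing at the directory that holds the refreshed data set; quiet=true avoids an error if the table was already dropped:

```sas
/* Drop the promoted table, then reload it from disk with the */
/* index rebuilt and promote it again. Names are placeholders.*/
proc cas;
   table.dropTable / caslib="public" name="tab_test" quiet=true;
   table.loadTable /
      caslib="staging"
      path="tab_test.sas7bdat"
      casOut={name="tab_test" caslib="public"
              indexVars={"my_id"} promote=true};
quit;
```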