Hello Guys,
I am a question regarding SAS ACCESS.
We use SAS 9.2 and Oracle 11g and have a SAS ACCESS Library
linked to the Oracle DB. Now when I register new tables
is that data actually stored in SAS as a SAS data set or is it
just a "linK"/"view" ?
If it is not stored, I wonder what the physical storage property
"Create as View" is good for (OLAP CUBE STUDIO).
The "create as view" is a SAS METADATA definition. You are registering an existing physical table in your METADATA (which is only a DESCRIPTION of a physical object).
It's important that you understand the difference between a SAS metadata definition and a physical object.
You are using SAS OLAP Studio to define a SAS Cube SAS Metadata. SAS then uses these SAS metadata definitions to generate actual SAS code to be executed and build the physical cube (which is "real" physical data). Your Oracle table is the data source so the code generated (and executed) will only READ from source. "Create as view" is only of relevance when WRITING to a table and the table needs to be CREATED.
Not only SAS OLAP Studio uses SAS Metadata - and the Metadata within a Foundation repostitory in a SAS Metadata Server instance is shared by all SAS clients. So let's say there is also some ETL process populating this Oracle table and this ETL process is implemented using SAS DI Studio (which is the specialised SAS tool for ETL jobs).
You would then define a DI Job in SAS Metadata using SAS DI Studio for this ETL and when you generate actual SAS code out of this DI Job Metadata definition then depending on the table definition the SAS code generated will either be "create TABLE" or if "create as view" had been defined for the target table it will be "create VIEW".
Below just for illustrating what I'm talking about:
I can answer the first part of your question. When you create a library defined on a non-SAS database using SAS/Access, you are definitely only setting up a reference to the underlying database table. No data will be moved or accessed until you actually use one of the tables in SAS code.
I don't have access to Metadata Server, so I can't check out your reference to "Create As View". SAS does have the ability to create views on SAS datasets, similar to SQL views on SQL datasets, so maybe it's related to that somehow?
Tom
Hello Tom,
Thanks for your help.
You wrote "No data will be moved or accessed until you actually use one of the tables in SAS code."
So what happens when I "access" the data. I guess it gets cached? What happens if I have
a 100 GB Oracle Table?
I know the difference between a SAS DataSet and a SAS View (a view on a SAS DataSet).
You also say "No data will be moved or accessed until you actually use one of the tables in SAS code."
what happens when I actually want to have the data stored?
I was hoping the functionality offered by the SAS OLAP CUBE Studio 4.2 to change
the phyiscal storage from table to view by clicking "Create as View" would exactly
mean that, I either pull the data from Oracle and store it or it is indeed just a reference.
Thanks a lot for the feedback.
You're obviously talking about SAS Metadata which is only a Metadata description of an object. If you choses "create as view" or "as table" makes on the metadata level not a big difference.
Now let's say you're using SAS DI Studio. You build a DI job using Metadata (no real physical data involved here). You then can create SAS code based on the metadata used and here it makes a difference if the table object was "create as view" or "table". The generated SAS code would then for example be "proc sql; create VIEW ..." or "proc sql; create TABLE..."
Only when executing the view (or chosing "view data") there will actually be a process pulling data from Oracle into SAS.
"You're obviously talking about SAS Metadata which is only a Metadata description of an object. If you choses "create as view" or "as table" makes on the metadata level not a big difference."
Patrick, thanks for your input. I dont mind what it is called in the Metadata I care about PHYSICAL storage.
As I said in my answer to Tom's post. I dont want to replyicate 100GB of Oracle data in SAS. Its all about the storage space
and performance. "create as view" or "as table" can be selected in SAS OLAP CUBE STUDIO 4.2
(somehow this does not seem to exist in the SAS Management Console) in the table properties tab called...."PHYSICAL STORAGE"
I am not using SAS DI Studio.
To be very clear, by registering tables in SMC (metadata), physical data is not copied to SAS.
Don't exactly under stand what you mean by "Create as a view" property i OLAP Cube Studio? Pls attach some print screen so that we can see where this appears.
"create as view" or "as table" can be selected in SAS OLAP CUBE STUDIO 4.2
Are you talking about the source table here (eg. the detail table). I haven't done a lot of OLAP development but from my understanding you're having some source data (eg. a detail table) and then you create a cube. You can pre-aggregate everything which means it takes longer to process and you need more storage space but it's then much faster for people to access - or you aggregate "on-the-fly" - or some mixture of pre-aggregation and on-the-fly aggregation.
I would assume that if your source data lives in Oracle and you choose "create as view" then your source is defined as SAS SQL view. If you pre-aggregate everything (I forgot how this is called) then a view should be o.k. and it only will affect performance for drill-through, if there is also aggregation on-the-fly then I would assume that a view (where a query must reach through to the database) is slower than if all the data is replicated into SAS. Having said that: Querying Oracle can be multithreaded where querying a SAS table will be single threaded - so querying Oracle could be faster. Alternatively you might consider to store your detail table in SAS using the SPDE engine which also allows for multithreading (even better if using several disks from an I/O perspective).
I assume it will be a decision between storage requirements and performance requirements what path you're taking.
Metadata as such is only a description. No physical data at all is stored. But metadata will affect what code will be generated and this of course will affect if your data gets replicated or not.
Hello Linus Hello Patrick,
Thanks for your input. The screen print is now attached.
Patrick, I dont know why this feature only exists in SAS OLAP CUBE Studio (seems to be one of those great messed
up UI designs and duplicated functionality) but regardless if I want to build a cube or not what does this
functionality actually do, phyiscally ? Lets not talk about how to create cubes.
"I would assume that if your source data lives in Oracle and you choose "create as view" then your source is defined as SAS SQL view."
Interesting! So what is the difference if I tick this box, physically? None?
I dont want to get into aggregation either. Lets focus on SAS DataSets (Tables) and what this feature "CREATE VIEW" does to them.
My data is pre-aggregated and needs therefore tons (100GB) of Oracle space.
"But metadata will affect what code will be generated and this of course will affect if your data gets replicated or not."
Thats could be the right path, but I would like to have that in some kind of document not just say
oh well, if I tick "Create view" less data well be replicated by some code generated. I am having huge
amounts of data here. I need to be sure.
This is a table definition for your source tables. If this is an existing table in Oracle then I would suggest you use SMC, go to the library node and then simply chose "register table". This will register the tables the right way in metadata. If these are Oracle tables and the libname is pointing to an Oracle schema then don't chose "create as view". This would only have an effect if you actually would create the table (and not only read data from it).
If these is metadata for Oracle tables and the libray points to Oracle (using the SAS/Access engine) then you won't replicate data. If you access the data then of course SAS needs to read it and load it into the SAS workspace (so that will take some time).
A pre-aggregated cube means always data replication - aggregation on the fly on the other hand means querying the underlying source table (which can also live in Oracle). It's up to you what you prefer to do.
Chosing "create as view" or not only has an effect if SAS needs to create/write to the table. I don't think this is the case in your process. Again: Use SMC and simply register the tables - don't create the metadata manually.
Hello Patrick,
"This would only have an effect if you actually would create the table (and not only read data from it)."
So how do I do that? Right now, by default, everything seems to be a "link" to the Oracle database,
no data replication, regardless if I tick this box "CREATE VIEW" or not - interesting.
"Chosing "create as view" or not only has an effect if SAS needs to create/write to the table."
Where is that funcionality? The table is registered in the meta data so how do I say,
please SAS "Create" the table, dont just "linK" it?
I know about cubes but I dont want to build a cube so we dont have to talk about aggregating cubes
or not.
"Again: Use SMC and simply register the tables - don't create the metadata manually." -> I dont
create it manually. I used OLAP CUBE STUDIO and noticed it has this odd "CREATE AS VIEW" checkbox
that does not exist in the SMC.
I try to sum it up , so we have
1.table registration in meta data
2.table creation
where does "CREATE AS VIEW" come in?
As Patrick aluded, a view is a set of instructions for accessing/manipulating the data.
A view can be a SAS view, in which case the SAS instructions are passed to Oracle each time. You can also have an Oracle view, though I think that has to be created in Oracle as the creation involves write access to the Oracle database (which I never have, so I can't try it).
If you have a SAS view, then the only thing that SAS stores at the time of creating the view is the instructions. Whether, and how much, data move from Oracle to SAS are a function of the actual query. SAS will do as much as it can on Oracle, but some functions only exist on the SAS side, so the all the raw data might have to be pulled when the query is executed. In the SAS OPTIONS, there are some SQL specific options that will tell you EXACTLY what is being passed to Oracle. Those can be quite helpful in tuning your query.
Hello Doc@ Duke,
many thanks for your reply. That leads me to conclude that the
"create as View" functionality in the "phyiscal storage" tab of the SAS DataSet properties
has acutally no impact on the physical storage because the SAS DataSet, part of the SAS ACCESS/ORACLE
library is actually just a "link" to the Oracle Database where the data is phyiscally stored.
Does that make sense?
Yes, but with a caveat. When you actually USE the view it will end up copying some data to the SAS server to do the SAS end of the work. It may (or may not) be a storage issue at that time.
We have had occasions in which a view was created and everything looked great. Then when the PROC was run, SAS ended up pulling the entire set of Oracle tables to the SAS server and clobbered the WORK library. The view was syntactically correct, but used features that were only available on the SAS server so the query could not be optimized.
The "create as view" is a SAS METADATA definition. You are registering an existing physical table in your METADATA (which is only a DESCRIPTION of a physical object).
It's important that you understand the difference between a SAS metadata definition and a physical object.
You are using SAS OLAP Studio to define a SAS Cube SAS Metadata. SAS then uses these SAS metadata definitions to generate actual SAS code to be executed and build the physical cube (which is "real" physical data). Your Oracle table is the data source so the code generated (and executed) will only READ from source. "Create as view" is only of relevance when WRITING to a table and the table needs to be CREATED.
Not only SAS OLAP Studio uses SAS Metadata - and the Metadata within a Foundation repostitory in a SAS Metadata Server instance is shared by all SAS clients. So let's say there is also some ETL process populating this Oracle table and this ETL process is implemented using SAS DI Studio (which is the specialised SAS tool for ETL jobs).
You would then define a DI Job in SAS Metadata using SAS DI Studio for this ETL and when you generate actual SAS code out of this DI Job Metadata definition then depending on the table definition the SAS code generated will either be "create TABLE" or if "create as view" had been defined for the target table it will be "create VIEW".
Below just for illustrating what I'm talking about:
The SAS Users Group for Administrators (SUGA) is open to all SAS administrators and architects who install, update, manage or maintain a SAS deployment.
SAS technical trainer Erin Winters shows you how to explore assets, create new data discovery agents, schedule data discovery agents, and much more.
Find more tutorials on the SAS Users YouTube channel.