Hi,
Trying to understand a little about VA under the covers (haven't had my hands on it yet).
If I use VA to make a plot, is there old school SAS code executing in the background on the server (SAS/GRAPH or GTL or whatever)? When I view a table, is PROC REPORT running?
Knowing how to write and read SAS code is helpful if you will be developing stored processes (which are SAS code), or DI jobs (SAS code generated by DIS or user written), etc. At one point I heard someone reassure an old SAS programmer worried about SAS's shift toward marketing "solutions" by saying, "no worries, stored process web app, DI studio, it's all just SAS code underneath."
Is there SAS code underneath the covers of VA in the same way?
If I were a VA developer, could I create the equivalent of a "user-written module", i.e. write a parameterized PROC GPLOT step and make that available to analysts to use in their VA reports?
Obviously if you have VA you can use SAS for the back end ETL. But wondering if the reporting / visualizations are done with SAS code. And if it is possible to integrate user written SAS code with VA.
Just curious.
-Q.
Hi Quentin,
There is no old school SAS code behind VA, with the exception of VDB which generates SAS code to load/prepare data.
Cheers,
Justin
As I understand it, VA reports are not produced by Base SAS or SAS/GRAPH procedures; report generation is packaged into the application.
But you can still use your existing knowledge of SAS programming and reporting by using a LIBNAME that points to the SAS LASR server, e.g. in stored processes.
So if I have a stored process that uses a LIBNAME pointing toward a LASR server, how much would that buy me?
Assume the stored process itself would not be running on the LASR server?
Looks like if I run an HP procedure, maybe I get a lot of benefit.
But if I run PROC REPORT, assume all of the records would have to go from the LASR server to the stored process server. So maybe the savings would be instead of disk-to-disk transfer, it's a memory-to-disk transfer. And if I run a data step, it's also going to write it to disk on the stored process server.
Guess if I run a reporting proc on a small subset from a big dataset, I could get a big benefit (since the WHERE statement would be applied on the LASR server). But if I run a proc on a full dataset, maybe I won't get much benefit from the LASR server?
--Q.
I haven't been able to test the different scenarios you describe.
What you definitely will gain is getting rid of I/O bottlenecks, so reporting on large tables will probably benefit from it.
Be aware that it seems only PROC IMSTAT and some specifically designed data steps will run, at least in part, on the LASR server. For other PROCs and data steps, ALL data will be transferred back to the client session, e.g. the stored process server. My advice is to read the LASR reference guide carefully before you deploy any stored process on LASR data.
Applying WHERE clauses when appropriate is always best practice, so I think using that as a comparison isn't saying much. Even a large table on disk can be accessed rapidly if it's indexed properly. And the big thing with VA/LASR is the great performance on table scans. The impact on analyzing smaller subsets is smaller.
You are right that the stored process server is not executing inside a LASR process. Physically, the stored process server can execute on one of the nodes of a VA cluster (or on the server in a non-distributed deployment).
Does PROC MEANS not take advantage of the LASR server?
As stated, I haven't seen this with my own eyes, but here is a quote from the reference doc:
"When programming with SAS LASR Analytic Server, it is important to understand where the computation occurs and memory utilization.
•The IMSTAT procedure always performs the computation in the server, and the analysis is performed against the original in-memory table.
•Other procedures (for example, FREQ, UNIVARIATE, and RANK) transfer the in-memory table to the client machine. After the transfer, the session on the client machine performs the analysis on the copy of the data.
•Most DATA step programs operate by transferring the in-memory table to the client and then performing the computation. However, if a DATA step is written to use in-memory tables for input and output, the DATA step can run in-memory, with restrictions. The next section describes how to use this feature."
My interpretation of this is that the client session does all the work; there is no mention of WHERE in this context.
I note, however, that a data step that is to be executed within LASR cannot contain WHERE and BY, among various other statements...
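As a rough illustration of the last bullet in the quoted documentation, here is a minimal, untested sketch of a data step that reads from and writes to in-memory tables. The host, port, tag and the output table name are hypothetical placeholders; `prdsale`, `actual` and `predict` are used only for illustration. Whether a given step really runs in the server depends on the restrictions listed in the LASR reference guide.

```sas
/* Hypothetical connection values: replace with your site's host, port and tag */
libname lasr sasiola host="lasr.example.com" port=10010 tag="hps";

/* Input and output are both in-memory tables, which is the condition the   */
/* quoted documentation gives for a data step to run in the server.         */
/* No WHERE or BY statement is used, in line with the restrictions noted.   */
data lasr.prdsale_diff;            /* hypothetical output table name */
   set lasr.prdsale;
   diff = actual - predict;        /* simple row-wise calculation */
run;
```

If the output were written to WORK instead of the LASR libref, the quoted text suggests the rows would be transferred to the client session first.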
Peter,
PROC MEANS (and all of the traditional procedures) do NOT take advantage of the LASR server. In order to do the work in LASR, you need to use PROC IMSTAT.
If you use the SASIOLA library engine to access data resident in a LASR server from a SAS session, be careful. If you run something like PROC MEANS against LASR data, SAS will pull all of the data out of LASR back to the SAS server to do the calculations. That might work for a small data table, albeit somewhat inefficiently. Remember, however, that LASR is capable of holding tables in excess of several terabytes in memory. If you attempt a traditional SAS calculation, such as PROC MEANS, on a really large LASR table, you may swamp your SAS server.
But that is why we created LASR. SAS wasn't able to scale to perform fast calculations on billions of rows of data, so we needed a different approach.
Hope that clarifies things,
David.
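To make the contrast described above concrete, here is a minimal, untested sketch. The host, port and tag values are hypothetical placeholders, and `prdsale` with the variables `actual` and `predict` is used only for illustration. It assumes the table is already loaded into the LASR server.

```sas
/* Hypothetical connection values: replace with your site's host, port and tag */
libname lasr sasiola host="lasr.example.com" port=10010 tag="hps";

/* Traditional procedure: the rows are pulled out of LASR into the SAS  */
/* session, and the statistics are computed there on the copy.          */
proc means data=lasr.prdsale mean sum;
   var actual predict;
run;

/* PROC IMSTAT: the request is sent to the LASR server, which computes  */
/* the summary statistics against the in-memory table.                  */
proc imstat;
   table lasr.prdsale;
   summary actual predict;
run;
quit;
```

Both steps address the same libref; what differs is where the computation takes place, which is why the first form can swamp the SAS server on a very large table.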
Well, I'm confused.
I looked up IMSTAT (a rare, for me, look into the VA docs, specifically SAS LASR Analytic Server 2.2).
First I thought "IMSTAT is for dealing with tables, it's not a reporting PROC".
Then I stumbled across an example where they used the SUMMARY statement very much like PROC MEANS:
proc imstat;
   table lasr.prdsale;
   summary actual predict / partition;
run;
So I looked for documentation on the SUMMARY statement, among the statements documented for PROC IMSTAT, and couldn't find it.
I suppose SUMMARY is perhaps a statement that could apply to other procs (like WHERE statement in SAS), but is the SUMMARY statement documented somewhere I'm missing?
Thanks,
-Q.
The best entry that is open is: SAS Visual Analytics. You will notice that the related information is the Intelligence Platform.
That is what is happening:
- eliminating cube/DWH building by replacing it with in-memory analytics
- offering the presentation through a web interface and, if you like, on mobile devices.
Carefully reading the manual SAS(R) Visual Analytics 6.3: User's Guide (Preprocess and Postprocess Code) gives some hits and shows that it is SAS running there.
The technical installation manuals are not free. The system requirements are at: SAS Visual Analytics | SAS
SAS VA 6.3 runs on top of SAS 9.4. The OS environment (Linux) is needed because it builds on OS-specific features, such as the use of cgroups for load balancing.
The principal design is a central metadata server with possibly a lot of hubs.
Yes, it is still SAS; you can probably connect to it with AMO, EGuide, SMC or DI. It is not very clearly documented... but it is EIP.
SAS(R) Visual Analytics 6.3: User's Guide (Exporting Data Queries as Jobs) is another hit.
But remember DI does not have any meaning for building cubes anymore. That is history and should be forgotten.
Hi,
I would like to extract the VDB code from metadata. Is this possible via metadata query? Where is the code for the VDB-query stored?
Thanks,
Regards,
Berry
Hi @bheerschop, you're more likely to get a reply if you start a new topic on the VA forum than appending a question to an older discussion. Hope you'll consider introducing this as a new question.
Thanks Justin, David,
More questions are popping up than answers.
To recap, to understand it all:
- In-memory analytics processing has been rebuilt as new procedures: IMSTAT (and several more).
The reason is that converting the existing procedures, like MEANS (SQL?), would take too much time to develop.
- As for the "in-memory analytics processing", the executing in-memory process (service) is not isolated from the requestor (client).
No service-oriented architecture (SOA) has been implemented for this. There is a need to isolate the old SAS procs that are not running in this environment.
The questions popping up:
- As there is no isolation between processes, you are trying to solve the security concerns in a programmatic way.
IMO that is a dead end. Just follow the issues with virus scanners, running at root level, like PHP usage, SQL injection and more (SANS). When you are really concerned about security, it should be a fundamental part of the design.
- "Not able to solve it by changing the old procs": are you saying they are so badly programmed that it cannot be done?
This raises other questions about the reliability and future of it all. For now it is a very dedicated SAS approach.
I wonder how SAP HANA will fit into this, or whether that intended cooperation goes in a different direction, to be designed from scratch.
As for the promise of "in-database processing": I have seen some issues (incorrect behavior) there with the old procedures, so that could become another big challenge. There is still a lot of work to do on that side. Many big SQL vendors (Oracle, Microsoft, Teradata) are adding analytics options to their dialects. It is embarrassing when those options cannot easily be used (through procs) from SAS.
When I read the Gartner 2014 Magic Quadrant for Business Intelligence and Analytics Platforms, the Cautions they mention for SAS are the same as my concerns. At the same time, in the Strengths part VA is mentioned for its user experience and a strong vision "to go forward". Nothing wrong with that. We will see how it goes in the future.
You seem to have jumped to some incorrect conclusions...
"As for the 'in-memory analytics processing', the executing in-memory process (service) is not isolated from the requestor (client).
No service-oriented architecture (SOA) has been implemented for this."
The LASR Analytic Server is completely isolated from the web application server on the midtier. The midtier makes requests to the LASR server based on user actions, and the LASR server performs the analytics and returns the results. There is a well-defined SOA in play.
"There is a need to isolate the old SAS procs that are not running in this environment."
I am not sure what you mean by this statement. Old SAS procs run in a workspace server, not on the LASR server, so they are isolated. There is nothing preventing the old procs from connecting to LASR via the SASIOLA library engine.
"As there is no isolation between processes, you are trying to solve the security concerns in a programmatic way.
IMO that is a dead end. Just follow the issues with virus scanners, running at root level, like PHP usage, SQL injection and more (SANS). When you are really concerned about security, it should be a fundamental part of the design."
There is complete isolation of processes, allowing the separation of the various parts on separate machines. Security has been a primary architecture consideration from the beginning and I assure you SQL injection attacks aren't possible. HTTP connections can be protected via the use of HTTPS and various tiers can be isolated via firewalled zones.
In the case of SAP-Hana and other foreign databases with proprietary implementations and SQL dialects, integration is always difficult, but SAS supports explicit SQL pass-through to facilitate the usage of those third-party features. But most of those third-parties don't provide the analytic capabilities of SAS or support those operations against big data.