How to Query a Table in SAS VA (Viya 4) Without Loading to CAS?

Mahis · Posted 02-16-2025 04:39 AM

Hello everyone,

I'm working with SAS VA on SAS Viya 4, and I need to create a report that queries data directly from Hive without loading the entire table into CAS.

The table is very large grows daily and is partitioned by year, month, and day. I want users to be able to select parameters (year, month, day) in the report and the query should retrieve only the relevant partitions instead of scanning the whole table.

If anyone has worked on a similar case, could you please share the steps to achieve this? Thanks.

SASKiwi · Posted 02-16-2025 06:06 PM

This article gives an example of how to select data (which could be from an external database) that is then loaded into CAS. Your use case looks very similar if my understanding is correct.

Patrick · Posted 02-16-2025 09:47 PM

How large is "very large" in GB?

Report consumers normally expect "immediate" results when using interactive reports. The query and data transfer will require time. Hive queries have latency and are better suited for batch processing. Ideally use Impala for interactive processes.

Would it be an option to pre-load aggregated data for all years into CAS and create reports with links for drill through if required. The drill through then can query the hive table for the leaf where a user wants to see the detail.

Mahis · Posted 02-17-2025 04:03 PM

Currently the table is 1TB in size and approximately 500MB is added daily.

We need to create different reports using various tables as data sources. I believe CAS will not be able to load the historical data for each table which has accumulated over 5+ years and reached almost 1TB. That's why I am looking for a way to query the data directly from Hive.

Patrick · Posted 02-17-2025 11:53 PM

@Mahis

There are various articles in the SAS Community library that demonstrate how you can share a job with user selections (prompts) for retrieving the data and creating a report on-the-fly in VA.

Below the links to two of these articles:

Sharing Jobs Using SAS Visual Analytics

Using Jobs to Load Data in SAS Visual Analytics

The first link covers static reports that won't need any data in CAS, the 2nd link is for interactive reports where the job loads first data into CAS (I believe for your use case that would need to be session scope).

I'm still concerned about response times not meeting user expectation and would strongly suggest that you execute one of your queries against Hive out of SAS to determine how long this actually takes.
Is it really not possible to create pre-aggregated tables (eventually one per report) that are much smaller in size but will support your reports? It could be a specialized table per report (or a few similar reports) pre-loaded into CAS via daily batch job.

Whatever you do, ensure that you run Hive queries that minimize the data volumes before transferring to SAS.

Afaik CAS as such can deal with a 1TB table IF you've got an environment that's sized for it (which is unlikely). But with CAS you can certainly load big tables. There isn't only RAM but also virtual memory. Of course once SAS has to swap data between RAM and virtual memory performance will decrease.
You can also compress data in memory (again with a performance penalty) and there is also Duplicate Value Reduction (DVR) that depending on your data can reduce volumes significantly.

If it was me then I'd spend some time to analyse and test:
1. Which data elements from source do I need for my reports?
2. Which level of aggregation is possible to support my reports?
3. How much can I reduce storage requirements by using compression and/or DVR?

I'd likely run some tests for above with a sub-set of the data to get some metrics helping me to make a decision. The ideal outcome would be to reduce the data volumes to a level where a few pre-aggregated tables can support all reports - but of course if the reports require pre-aggregation by different categorical variables then more report specific CAS tables might be required.

I'd only consider using queries against external data sources during user interaction as a last resort because I'm pretty sure that response times won't be satisfying (but yeah, test with some hive queries to confirm).

...and as an afterthought:

Is this 1TB the actual Hive table or the size when you load the data to the SAS side. Hive got this data type of STRING without a length which SAS compute (sas7bdat) maps to CHAR with a length of 32KB (depending on the dbmax_text libname option). When loading into CAS the datatype normally becomes VARCHAR(*) which as such is good for STRING but CAS VARCHAR consumes at least 16 bytes so it's not that great for any character variable that doesn't need that many bytes to store the actual values or where the values are fixed length (use CHAR for such cases).
What this means: When loading high volume data into SAS compute (.sas7bdat) or directly into CAS (.sashdat) then explicitly define the mapping of the HIVE to the SAS/CAS variable types especially if there are STRING data types in Hive.

Mahis · Posted 02-18-2025 02:23 AM

Thank you @Patrick and @SASKiwi for all these details. I will review the articles and test the different solutions.

Dooloverity00 · Posted 06-26-2025 11:39 PM

To query a table in SAS Viya 4 (VA) without loading it into CAS, you can use the Proc FedSQL procedure. This allows you to query data directly from the source (e.g., a database) without first loading it into the in-memory CAS environment.

Dooloverity00 · Posted 06-28-2025 01:46 AM

To query a table in SAS Viya 4 (VA) without loading it into CAS, you can use the Proc FedSQL procedure. This allows you to query data directly from the source (e.g., a database) without first loading it into the in-memory CAS environment.

Palm Beach County Property Appraiser

How to Query a Table in SAS VA (Viya 4) Without Loading to CAS?

Re: How to Query a Table in SAS VA (Viya 4) Without Loading to CAS?

Re: How to Query a Table in SAS VA (Viya 4) Without Loading to CAS?

Re: How to Query a Table in SAS VA (Viya 4) Without Loading to CAS?

Re: How to Query a Table in SAS VA (Viya 4) Without Loading to CAS?

Re: How to Query a Table in SAS VA (Viya 4) Without Loading to CAS?

Re: How to Query a Table in SAS VA (Viya 4) Without Loading to CAS?

Re: How to Query a Table in SAS VA (Viya 4) Without Loading to CAS?

SAS Innovate 2026 Registration is Open