BookmarkSubscribeRSS Feed
Fuccio_Ciro
Fluorite | Level 6

Hi I would like to ask what is the best architectural configuration for SAS:
-> SAS reads tables from a dedicated database (MS sql .. oracle ..)
-> SAS reads its own tables (sas7bdat) on its hard disk

I specify that my company uses SAS guide and Campaing Manager

9 REPLIES 9
Nigel_Pain
Lapis Lazuli | Level 10

It's horses for courses really! Using native SAS tables is likely to be more efficient just because there's  nothing in between SAS and the data. But it depends what your data sources are. We have quite a few in separate databases (SQL Server/Oracle/PostgreSQL) but most users extract from them and save the results locally for analysis. This has the added advantage that, as the sources are operational databases, the source of a specific analysis is always available as a snapshot of the data.

Having said that, our analysts are moving towards having a separate platform for hosting all their analytical data, using SQL Server and Sharepoint. This has metadata built in to its front end, so that it is searchable on different topics. It means the analysts aren't restricted to any single analytical tool. SAS is the main one but we have an increasing number of R users, and there have always been a few people who prefer other platforms such as SPSS, Stata and so on. It's also become quite important because there are a wide range of areas for analysis (eg. health, education, transport, justice), which sometimes have cross-overs between them.

SASKiwi
PROC Star

There is no right answer for this. It all depends on on your own requirements and preferences. No one architecture is necessarily better than any other.

Kurt_Bremser
Super User

For pure SAS processing, I like native SAS dataset files more. Performance is not hindered by network connections to the DB server, and you need no additional maintenance of the DB client on the SAS server.

LinusH
Tourmaline | Level 20

I think that a majority here (including me) doesn't know how Campaign Manager consumes, so a brief description of query patterns might help.

Do you create your own data marts, or even a data warehouse, or just tap in to sources directly?

What data volumes do you have, update frequency etc?

Data never sleeps
Fuccio_Ciro
Fluorite | Level 6

Hi Guys Thanks for the reply. This is very useful information for me.
I can say that we use a data mart on an oracle database. Some tables have more than 15mln of records, others even 300mln.
The problem is that we have to discard the oracle machine to replace it with another technology. And I was thinking about whether it would be better to materialize the tables in SAS (also create indexes ...) or read the live tables on the (new) database below. My idea is to isolate SAS with your own machine resources as much as possible ...

AhmedAl_Attar
Ammonite | Level 13

@Fuccio_Ciro 

Few things to consider when moving off an RDBMS and trying to replace it with SAS -- Typically RDBMS are very good at  

  • Concurrent Read/Write operations
  • Parallel query execution
  • Bitmap & B-tree Indexes
  • Materialized Views
  • Audit trail and Roll-back functionality
  • Table/Column/Row level security
  • Data Encryption at Rest/Motion

Just to name few features and capabilities out of the box provided by the RDBMS. If any of the above capabilities are a must-have for your new platform, then you'll have to look for either

- Cheaper alternative RDBMS to Oracle (Commercial: look into SingleStore ) (Open Source: look into PostgreSQL).

Note: the alternative could be hosted on-Premise or in the Cloud 

OR

- SAS Scalable Performance Data Server (SPDS)

 

To manage your datamart tables and ETL/ELT/Pipelines operations 

 

Hope this helps

LinusH
Tourmaline | Level 20

If there is no clear strategy at your workplace what data store to use, and SAS is the only consumer of this information, why not? But who load thei data amrt with what technology?

 

Apart from @AhmedAl_Attar suggestion, SAS SPDE table is a light weight version of SPD Server, and is included in you Base SAS license. It handles bulk-load with indexes (bitmap-b/tree hybrid) in place good, and of course querying. It's not good at record deletions and mass update. If you need partitioning capabilities, SPD Server or another RDBMS is probably a better choice.

Data never sleeps
Fuccio_Ciro
Fluorite | Level 6

Thanks for the great ideas.

I did not know the SPDE tables and I am studying them.

We don't want to change the SAS server configuration so we can't install an SPD Server...


The need is to use SAS in the final part of the Data flow. 5 users use SAS guide only to read tables and create flows to feed PBI dashboards.

Then they read the tables on oracle and write to MS sql tables (always new).

I would like SAS to start from its own tables, and be efficient in its manipulations.

Other users use the Campaign Manager instead. 

 

Compared to the needs mentioned by  @AhmedAl_Attar  we need:

  • Concurrent Read / Write operations: Read only and write only on different databases (non concurrent writing)
  • Parallel query execution: Yes
  • Bitmap & B-tree Indexes: Probably yes (are they different from the usual indexes?)
  • Materialized Views: Yes
  • Audit trail and Roll-back functionality: No
  • Table / Column / Row level security: No
  • Data Encryption at Rest / Motion: No
 @LinusH Kindly give me some more details on the SPDE tables Thanks!

suga badge.PNGThe SAS Users Group for Administrators (SUGA) is open to all SAS administrators and architects who install, update, manage or maintain a SAS deployment. 

Join SUGA 

Get Started with SAS Information Catalog in SAS Viya

SAS technical trainer Erin Winters shows you how to explore assets, create new data discovery agents, schedule data discovery agents, and much more.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 9 replies
  • 3079 views
  • 7 likes
  • 6 in conversation