sushil
Calcite | Level 5

Hi

We are working on a project that uses SAS VA (7.1) in a distributed architecture on Hadoop. The stack also includes SAS DI. My understanding is that all data processing, transformation, and detailed table creation should happen in SAS DI, and that this is the recommended approach. The data is then loaded into the LASR server, where the dashboard developers consume it.

Another approach is to do the processing, transformation, and detailed table creation in the SAS VA data builder and load the result into the LASR server. Most of the DI jobs would then be done in the SAS VA data builder instead.

Queries -

1. Which approach is recommended, and what are the pros and cons?

2. Can DI jobs be created in the SAS VA data builder? If yes, are there additional procs specific to Hadoop that need to be used?

Regards

so

5 REPLIES
AhmedAl_Attar
Rhodochrosite | Level 12

Sushil,

The answer to your question should come from comparing the listed capabilities/features of each of these products.

SAS DI : SAS Data Integration Studio

SAS VA Data Builder: https://support.sas.com/documentation/cdl/en/vaug/67500/PDF/default/vaug.pdf

The list of features is too long to put in one answer, but I would recommend DI Studio when you have to:

- Collaborate among multiple developers (Change Management - Check in/out)

- Schedule jobs

- Develop custom user transformations

Just my 2 cents.

Ahmed

LinusH
Tourmaline | Level 20

I would need to know the nature of your SAS/DW environment to give you an adequate answer.

But if you have a multi-level DW, I would recommend using DIS all the way to the data marts (star schemas).

Then you could use the data builder to load selected parts of the data mart to LASR.

There are some LASR loaders in DI, but I haven't used them, so I can't really tell the pros and cons of this last part.

One pro could be that you have a single point of metadata (DIS).

Data never sleeps
Kaufmann
Fluorite | Level 6


Hi Sushil;

Data builder is a good tool for doing proof-of-concept work, but not the right tool for developing production processes. Specifically, there are two main drawbacks to using the VA data builder. First, the data builder is not a fully functioning ETL tool and is not nearly as powerful as DI. Second, the data builder produces a query that must be executed every time you want to load the table into memory. This puts load on your data warehouse and creates needless network traffic. You may wish to swap datasets in and out of memory as necessary, and in that case you don't want to have to rebuild the table every time you load it if the data in it hasn't changed (as with a month-end snapshot table).

For this reason I recommend doing all data preparation in DI, culminating with the creation of a single "Analytical Base Table" capable of supporting your analysis task.  This table would be in a "VA Ready State", would be registered in metadata and can be lifted directly into memory without any additional processing.  In this scenario you would use the "Administrator Load" approach outlined on page 14 of the SAS Analytics Server Administration Guide.

I hope this helps.

Shaun.

PS.  DI jobs can't be created in SAS VA data builder.

yhuang
Fluorite | Level 6

Hello Kaufmann,

 

I have exactly the same question. I use SAS Enterprise Guide to generate data in a VA-ready state, but I would like a way to keep it dynamic. I've already figured out how to make EG run automatically so the data gets updated. The question is: how do I get the data into VA automatically, either every time it has been updated or at a scheduled time?

 

Thanks

varsha_sas
SAS Employee

Hi,

 

I would try the SAS Enterprise Guide connection to the Windows scheduler. We tested this on a PC and it worked fine. Please let me know if this helps.


Thanks!

Varsha
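
For anyone scheduling a reload like this, the "only reload when the data has changed" idea raised earlier in the thread could be handled by a small check that the scheduler runs before it triggers the load. Below is a minimal Python sketch, not a SAS-provided utility. It assumes the VA-ready table is written to a file on disk. The names `file_digest`, `needs_reload`, `record_load` and the JSON state file are all hypothetical, and the actual load step (for example a scheduled SAS job) is deliberately left out.

```python
import hashlib
import json
from pathlib import Path


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, reading it in 1 MiB chunks."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def _read_state(state_file: Path) -> dict:
    """Load the recorded digests, or an empty dict if nothing was recorded yet."""
    if state_file.exists():
        return json.loads(state_file.read_text())
    return {}


def needs_reload(table: Path, state_file: Path) -> bool:
    """True if the table file differs from the version recorded at the last load."""
    previous = _read_state(state_file).get(table.name)
    return file_digest(table) != previous


def record_load(table: Path, state_file: Path) -> None:
    """Remember the digest of the table that was just loaded (hypothetical bookkeeping)."""
    state = _read_state(state_file)
    state[table.name] = file_digest(table)
    state_file.write_text(json.dumps(state))
```

A scheduled wrapper would call `needs_reload(...)` first, start the real load only when it returns `True`, and call `record_load(...)` after the load succeeds. Hashing the file contents rather than comparing timestamps avoids a reload when the table is rewritten with identical data, at the cost of reading the whole file on every check.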

