dkassis
Calcite | Level 5

I'm trying to understand best practices for data management and efficient programming with a Windows-based SAS server and Hadoop.

 

We have created SAS folders and also SAS libraries that have URI connections to our Hadoop tables and databases. When creating tables/views with SAS code, where is the best place to store them for others to access? An example would be a SAS program that extracts data from Hadoop, formats it in a certain way, and leaves the result easily accessible to another user through SAS EG or the Excel add-in for SAS.

 

Is it better to save the tables in a library or in a SAS folder? I would also like to automate the queries so that the data refreshes every day. Attached is how it is set up in VA. The SAS folders are empty right now, and my question is what kinds of things should be saved there: programs, data? The libraries are how we connect to and query the data in Hadoop.

 

Thanks,

Dan

5 REPLIES
Kurt_Bremser
Super User

SAS Folders are "virtual" places in the Metadata. Datasets are not stored there, only references to them. By using the Authorization tab of the SAS folders, you can control access to these metadata objects.

Library definitions can also be stored in metadata, but these are only parameters used for libname statements (executed when the library is either pre-assigned or on demand).

The physical location of datasets is determined by the path to which the libname of the library points.
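
As a rough sketch of that point (not taken from the original setup): the path, server name, schema, table, and column names below are all hypothetical, and the Hadoop library assumes SAS/ACCESS Interface to Hadoop is licensed and that any required authentication options are supplied separately.

```sas
/* Hypothetical path: datasets written to REPORTS are stored as files here. */
libname reports "D:\sasdata\reports";

/* Hypothetical Hadoop connection via SAS/ACCESS Interface to Hadoop. */
/* Server, port, and schema are placeholders; add credentials as your site requires. */
libname hdp hadoop server="hadoop.example.com" port=10000 schema=default;

/* Build a small summary from a (hypothetical) Hadoop table and */
/* store it physically in the directory behind REPORTS. */
proc sql;
   create table reports.daily_summary as
   select region, sum(amount) as total_amount
   from hdp.sales
   group by region;
quit;
```

Once a library like REPORTS is also registered in metadata, the table can be registered there and placed in a SAS folder, but the data itself stays in the directory the libname points to.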

dkassis
Calcite | Level 5

Thanks for the quick reply, this is helpful.

LinusH
Tourmaline | Level 20

Apart from the semantics of folders vs. libraries, it makes sense to use folders if you ETL the data with DI Studio, or if you have a variety of user groups with different needs/rights.

 

When it comes to storing data physically, the answer is usually "it depends".

I guess you are using Hadoop for some reason(s), which are..?

 

Whether to extract data to SAS, or have the user query the data directly might depend on data sizes, user competence, type of queries, data structure etc.

Data never sleeps
dkassis
Calcite | Level 5

Thanks for the reply. 

 

We currently have one set of users that use EG and VA to pull/analyze the data. Hadoop is our platform for holding our large data, but I would like to be able to build smaller tables to hook up to VA or the SAS add-in for easy reporting and automation.

LinusH
Tourmaline | Level 20

Thanks for the input. But giving viable design suggestions requires more comprehensive information than can be shared in a thread, so back to the last section of my previous post.


