09-27-2023
lisad_sas
SAS Employee
Member since
08-07-2012
- 12 Posts
- 3 Likes Given
- 1 Solution
- 2 Likes Received
Latest posts by lisad_sas
- Re: Global airports - connecting the world (4947 views, posted 11-15-2018 11:43 AM)
- Re: SAS DI Tool (4302 views, posted 04-25-2017 12:03 PM)
- Re: SAS DI Tool (4349 views, posted 04-25-2017 11:42 AM)
- Re: Failed to generate matchcode (1338 views, posted 01-17-2017 11:13 AM)
- Re: Proc dqmatch procedure (2816 views, posted 01-17-2017 10:32 AM)
- SAS/Access to Hadoop, not your parents' access engine (3056 views, posted 11-05-2015 03:12 PM)
Activity Feed for lisad_sas
- Liked A Giraffe Taught me Kubernetes! for Ali_Aiello. 10-29-2020 04:22 PM
- Posted Re: Global airports - connecting the world on SAS Visual Analytics Gallery. 11-15-2018 11:43 AM
- Posted Re: SAS DI Tool on SAS Data Management. 04-25-2017 12:03 PM
- Posted Re: SAS DI Tool on SAS Data Management. 04-25-2017 11:42 AM
- Posted Re: Failed to generate matchcode on SAS Data Management. 01-17-2017 11:13 AM
- Posted Re: Proc dqmatch procedure on SAS Data Management. 01-17-2017 10:32 AM
- Got a Like for SAS/Access to Hadoop, not your parents' access engine. 11-12-2015 09:30 AM
- Posted SAS/Access to Hadoop, not your parents' access engine on SAS Communities Library. 11-05-2015 03:12 PM
Posts I Liked
- A Giraffe Taught me Kubernetes! (13 likes, by Ali_Aiello)
My Liked Posts
- SAS/Access to Hadoop, not your parents' access engine (1 like, posted 11-05-2015 03:12 PM)
My Library Contributions
- SAS/Access to Hadoop, not your parents' access engine (1 like, by lisad_sas)
11-15-2018
11:43 AM
This is VERY cool!!! Thanks for sharing it! How are you coming along with version 2?
04-25-2017
12:03 PM
Unfortunately, SAS University Edition is the only such tool offered in that manner.
Lisa
04-25-2017
11:42 AM
Here's a link to the available training courses for SAS Data Integration Studio: https://support.sas.com/edu/prodcourses.html?code=DIS&ctry=US. There are several available. Additionally, this link shows the suggested learning paths: http://support.sas.com/training/us/paths/dmgt.html.
Hope this helps!
01-17-2017
11:13 AM
See the other response: https://communities.sas.com/t5/SAS-Data-Management/Proc-dqmatch-procedure/m-p/325299#U325299
01-17-2017
10:32 AM
Hi, I see that you are pointing to the sample Quality Knowledge Base.
E:\SAS\SASHome\SASFoundation\9.4\dquality\sasmisc\QltyKB\sample
You will need to point to the ENIND QKB that you installed.
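For reference, here is a minimal sketch of how to point a SAS session at the installed QKB; the install path below is only a placeholder for wherever your ENIND QKB actually lives.
/* %DQLOAD sets the DQSETUPLOC= and DQLOCALE= system options and loads the   */
/* locale into memory; substitute the real root folder of the installed QKB. */
%dqload(dqlocale=(ENIND), dqsetuploc='E:\SAS\QKB\CI');

/* PROC DQMATCH and the other data quality procedures will now resolve their */
/* match definitions from the installed ENIND QKB instead of the sample QKB  */
/* that ships under sasmisc.                                                  */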
11-05-2015
03:12 PM
1 Like
SAS has a long history of accessing data, from mainframe data to RDBMS data to PC file data. You name it and SAS can pretty much access it, so of course we have an access engine for Hadoop. If you are familiar with SAS/Access engines, don't let the name mislead you. SAS/Access to Hadoop is NOT your average SAS/Access engine, mostly because Hadoop is not your average data environment. To help you understand this a little better, I've pointed out a few things to consider below that tie into the SAS Data Management for Hadoop article series.
Hadoop is OPEN SOURCE
At it’s origination, Hadoop was an open source project from Apache. The main idea behind the project was to enable high speed searching of a wide variety of files to support search engines like Google and Yahoo. Through inexpensive storage and distributed processing, high-performance search was enabled. Since the boom of Big Data, several companies have come to the forefront and established their own distributions of Hadoop including, now familiar names, like Cloudera, Hortonworks, IBM BigInsights, MapR, and Pivotal. Companies like Teradata and Oracle have partnered with these organizations to also offer a Hadoop solution option for their customers.
Hadoop is a distributed FILE SYSTEM
Traditional RDBMSs and the like are focused on DATA, not files. SAS/Access technology has a long history of partnering with these technology vendors to understand data types, flavors of SQL, and database-specific utilities. In this way, SAS technology can take advantage of efficiencies like SQL pushdown, bulk loading, and even pushing SAS procedures to the database for processing, thus limiting the impact of data movement on performance.
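To make the pushdown idea concrete, here is a minimal sketch; the library name, connection details, and table names are placeholders, not taken from this article. A LIBNAME to the database plus an implicit pass-through query lets the access engine hand the join and WHERE clause to the database, and the SASTRACE= option shows in the log exactly what SQL was passed down.
/* Hypothetical Oracle connection; path, credentials, and schema are placeholders. */
libname ora oracle path=orcl user=scott password="XXXXXXXX" schema=sales;

/* Write to the log the SQL that the access engine hands off to the database, */
/* so you can confirm the join and WHERE ran in-database.                     */
options sastrace=',,,d' sastraceloc=saslog nostsuffix;

/* With implicit pass-through, the join and filter can execute inside the     */
/* database rather than after pulling all the rows back to SAS.               */
proc sql;
   create table work.big_orders as
   select o.order_id, o.amount, c.region
   from ora.orders o inner join ora.customers c
        on o.cust_id = c.cust_id
   where o.amount > 1000;
quit;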
Hadoop has JARs…lots and lots of JARs
JARs: get used to them, and they aren't for canning fruits and veggies like your mom or grandmother used. Unlike an RDBMS access engine, where SAS needs the database client installed in order to communicate with the database, Hadoop has .jar files and .xml files. SAS requires a set of .jar files and configuration .xml files in order to communicate with Hadoop, enabling things like HiveQL pushdown and Base SAS procedure pushdown. These files can change or move with each release of a distribution, and SAS/Access to Hadoop needs to stay in sync with them.
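As a rough sketch of how those files come into play (the directories, host name, and schema below are placeholders and depend entirely on your cluster), SAS finds the JARs and configuration XMLs through the SAS_HADOOP_JAR_PATH and SAS_HADOOP_CONFIG_PATH environment variables, and the LIBNAME statement then connects through Hive:
/* The directories below are placeholders; they must contain the .jar and    */
/* .xml files gathered from YOUR Hadoop cluster. These environment variables */
/* are normally set before SAS starts; OPTIONS SET= is shown here for brevity. */
options set=SAS_HADOOP_JAR_PATH    "/opt/sas/hadoop/jars";
options set=SAS_HADOOP_CONFIG_PATH "/opt/sas/hadoop/conf";

/* With the JARs and config files in place, the access engine can connect    */
/* to Hive and push HiveQL down to the cluster.                              */
libname hdp hadoop server="hive.example.com" port=10000 schema=default user=myuser;

/* Quick smoke test: list the Hive tables visible through the libref.        */
proc contents data=hdp._all_ nods;
run;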
Hadoop is YOUNG
If you look around, you can easily find seasoned, experienced DBAs as well as mature, stable database systems. RDBMSs have gone through their growing pains and have been supporting organizations for decades. Hadoop is YOUNG, making its debut in 2005. As such, things are changing fast.
One final note: before you upgrade your Hadoop environment, be sure to double-check any file location or content changes with your distribution!
Follow the Data Management section of the SAS Communities Library (Click Subscribe in the pink-shaded bar of the section) for more articles on how SAS Data Management works with Hadoop. Here are links to other posts in the series for reference:
How to persist native SAS data sets to Hadoop (Hive)
Give analysts 80% of their time back
How to leverage the Hadoop Distributed File System using the SAS Scalable Performance Data Engine
How to leverage the Hadoop Distributed File System using the SAS Scalable Performance Data Server
How to create SAS Scalable Performance Data Engine Tables on the Hadoop Distributed File System
Your data is in Hadoop. Now what?
SAS HADOOP procedure: Managing data in Hadoop is the first order
- Find more articles tagged with:
- SAS Data Management for Hadoop