11-06-2017 11:40 AM - edited 11-27-2017 08:36 AM
Hi Data Management Community,
If you missed the Ask the Expert session: Preparing Data in Hadoop for Analysis and Reporting, then you can still view it on-demand at any time.
This session reviews how business users, data analysts and data scientists can manage, manipulate and cleanse data stored in Hadoop through an intuitive browser interface, without transferring the data across the network and without any specialized coding skills. SAS In-Database solutions such as the SAS Data Quality Accelerator and SAS Code Accelerator provide the technology to process the data where it resides in Hadoop.
During the session, learn:
Here is a transcript of the Q&A segment at the end of the session, along with additional questions that came in afterwards.
Q: How is data preparation for analytics different from traditional data preparation methodologies?
A: Depending on the analytical method being used, you may need to transpose data into a one-row-per-subject table or join data into a one-to-many table.
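To make the one-row-per-subject idea concrete, here is a minimal sketch in plain Python. It is not SAS or Data Loader code. The column names (`customer_id`, `measure`, `value`) and the helper `to_one_row_per_subject` are hypothetical and chosen only for illustration.

```python
from collections import defaultdict

# Hypothetical long-format records: one row per (subject, measure) pair.
long_rows = [
    {"customer_id": 1, "measure": "purchases", "value": 3},
    {"customer_id": 1, "measure": "returns", "value": 1},
    {"customer_id": 2, "measure": "purchases", "value": 5},
]

def to_one_row_per_subject(rows, key, name_col, value_col):
    """Pivot long-format rows into one dict per subject."""
    wide = defaultdict(dict)
    for row in rows:
        wide[row[key]][row[name_col]] = row[value_col]
    # Collect every measure name so each output row has the same columns.
    columns = sorted({row[name_col] for row in rows})
    return [
        {key: subject, **{c: measures.get(c) for c in columns}}
        for subject, measures in sorted(wide.items())
    ]

wide_rows = to_one_row_per_subject(long_rows, "customer_id", "measure", "value")
for r in wide_rows:
    print(r)
```

Each subject ends up on a single row, and a measure that a subject never recorded is filled with `None`, which is the shape many analytical procedures expect as input.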
Q: Does SAS® Data Loader for Hadoop perform its data preparation in-database using MapReduce?
A: Absolutely. The SAS In-Database technologies leverage the power of MapReduce for distributed, parallel processing across the Hadoop cluster.
Q: Does SAS® Data Loader for Hadoop leverage the Hadoop Spark in-memory framework?
A: Yes, users can run data cleansing and transformation processes in Spark via Data Loader.
Q: What programming skills are required to use SAS® Data Loader for Hadoop to prepare data for analytics and reporting?
A: None. Data Loader is a user-friendly, wizard-driven application that requires no coding skills.
Q: Can I leverage existing SAS code and HiveQL code using SAS Data Loader for Hadoop?
A: Yes, using the Run a SAS program or Run a Hadoop SQL directive.
Q: Can I load Hadoop data into the SAS® LASR Server to drive SAS® Visual Analytics and Statistics?
A: Absolutely. You can load data, in parallel, directly from Hadoop into SAS LASR.
Q: Can I load relational database tables into HDFS using Hadoop SQOOP with SAS® Data Loader for Hadoop?
A: Yes, the Copy Data to and Copy Data From directives will leverage any database configured in SAS Metadata with a JDBC connection.
Q: Can other SAS solutions, like SAS® Data Integration Studio, leverage the SAS® Data Loader for Hadoop directives in its ETL flows?
A: Yes. Directives can be saved into SAS folders in metadata and leveraged in SAS Data Integration Studio via several directive transformations.
Q: How do I perform an update/insert in Hadoop when new data comes in, without recreating the whole table?
A: Hive 0.14 supports update/insert, but you would need to write custom code and use either the Run a SAS program directive or the Run a Hadoop SQL program directive.
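For readers unfamiliar with the term, the following plain-Python sketch illustrates what update/insert (upsert) semantics mean: rows whose key already exists are updated, and rows with a new key are appended, so the table is never rebuilt from scratch. This is not Hive syntax and not what Data Loader runs. The `upsert` helper and the `id` and `city` fields are hypothetical.

```python
def upsert(existing, incoming, key):
    """Return existing rows with incoming rows merged in by key."""
    merged = {row[key]: dict(row) for row in existing}
    for row in incoming:
        # Update matching keys with the new values; insert rows whose key is new.
        merged[row[key]] = {**merged.get(row[key], {}), **row}
    return [merged[k] for k in sorted(merged)]

# Hypothetical current table contents and a batch of newly arrived rows.
table = [{"id": 1, "city": "Cary"}, {"id": 2, "city": "Austin"}]
new_rows = [{"id": 2, "city": "Boston"}, {"id": 3, "city": "Denver"}]

result = upsert(table, new_rows, "id")
for r in result:
    print(r)
```

In custom code submitted through one of the directives, the same logic would be expressed in SQL or SAS rather than Python.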
Q: Can you send us the slides of this presentation?
A: Yes, they are attached.
Q: What is the difference between SAS/ACCESS to Hadoop and SAS Data Loader for Hadoop?
A: Data Loader leverages the capabilities of the SAS/ACCESS to Hadoop solution, but does not require any SAS coding.
Q: Will we get the slide deck?
A: Yes, attached.
Q: Can we use Data Loader on Amazon Web Services (AWS)?
A: Yes, Data Loader can be installed in AWS, just like SAS 9.4M4.
Want more tips? Be sure to subscribe to the Ask the Expert Community Library to receive follow-up Q&A, slides and recordings from other SAS Ask the Expert webinars. From the Ask the Expert Community Library, just click Subscribe in the orange bar underneath the list of recent articles.
NOTE: For best results when opening the attached slides, click on the “download” icon.