If you missed the Ask the Expert session on Preparing Data in Hadoop for Analysis and Reporting, then you can still view
it on-demand at any time.
I have also attached the slides and demonstration steps used during the presentation.
Here is a transcript of the Q&A segment held at the end of the session for ease of reference:
Q: How is data preparation for analytics different from traditional data preparation methodologies?
A: Depending on the analytical method being used, you may need to transpose data into a one-row-per-subject table or join data into a one-to-many table.
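To make the one-row-per-subject idea concrete, here is a minimal Python sketch of the transpose step. It only illustrates the reshaping and is not how Data Loader performs it; the field names (subject_id, measure, value) and the sample records are hypothetical.

```python
from collections import defaultdict

# Hypothetical long-format input: one row per (subject, measure) pair.
long_rows = [
    {"subject_id": 1, "measure": "height", "value": 170},
    {"subject_id": 1, "measure": "weight", "value": 65},
    {"subject_id": 2, "measure": "height", "value": 182},
    {"subject_id": 2, "measure": "weight", "value": 80},
]


def to_one_row_per_subject(rows):
    """Transpose long-format rows into one wide row per subject."""
    wide = defaultdict(dict)
    for row in rows:
        # Each measure name becomes a column on the subject's single row.
        wide[row["subject_id"]][row["measure"]] = row["value"]
    return [
        {"subject_id": subject_id, **measures}
        for subject_id, measures in sorted(wide.items())
    ]


wide_rows = to_one_row_per_subject(long_rows)
for row in wide_rows:
    print(row)
```

Many analytical procedures expect this wide layout, with one observation per subject and one column per measure, which is why the transpose is a common preparation step.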
Q: Does SAS® Data Loader for Hadoop perform its data preparation in-database using MapReduce?
A: Absolutely, the SAS In-Database technologies leverage the power of MapReduce for distributed, parallel processing across the Hadoop cluster.
Q: Does SAS® Data Loader for Hadoop leverage the Hadoop Spark in-memory framework?
A: Yes, users can run data cleansing and transformation processes in Spark via Data Loader.
Q: What programming skills are required to use SAS® Data Loader for Hadoop to prepare data for analytics and reporting?
A: None. Data Loader is a user-friendly, wizard-driven application that requires no coding skills.
Q: Can I leverage existing SAS code and HiveQL code using SAS Data Loader for Hadoop?
A: Yes, using the Run a SAS program or Run a Hadoop SQL directive.
Q: Can I load Hadoop data into the SAS® LASR Server to drive SAS® Visual Analytics and Statistics?
A: Absolutely. You can load data, in parallel, directly from Hadoop into SAS LASR.
Q: Can I load relational database tables into HDFS using Sqoop with SAS® Data Loader for Hadoop?
A: Yes, the Copy Data to and Copy Data From directives will leverage any database configured in SAS Metadata with a JDBC connection.
Q: Can other SAS solutions, like SAS® Data Integration Studio, leverage the SAS® Data Loader for Hadoop directives in its ETL flows?
A: Yes. Directives can be saved into SAS folders in metadata and leveraged in SAS Data Integration Studio via several directive transformations.
Q: Do you have a free version that we can use for learning?
A: No, there is no free version available that I am aware of. Contact your SAS account representative to see if they might be able to set something up for you.
Q: How do you do an update/insert in Hadoop when new data comes in, without recreating the whole table?
A: Hive 0.14 supports update/insert, but you would need to write custom code and use the Run a SAS program directive or the Run a Hadoop SQL program directive.
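For readers unfamiliar with the term, the sketch below shows the update/insert ("upsert") logic in plain Python: incoming rows replace existing rows that share a key, and rows with new keys are appended. It illustrates the semantics only. It is not Hive or Data Loader code, and the function name, key column, and sample rows are hypothetical.

```python
def upsert(existing_rows, incoming_rows, key="id"):
    """Return existing rows merged with incoming rows, matched on `key`.

    Rows whose key already exists are updated; new keys are inserted.
    The input lists are not modified.
    """
    merged = {row[key]: dict(row) for row in existing_rows}
    for row in incoming_rows:
        # Update the matching row in place, or start a new one.
        merged.setdefault(row[key], {}).update(row)
    return [merged[k] for k in sorted(merged)]


# Hypothetical current table contents and a new batch of changes.
current = [{"id": 1, "status": "open"}, {"id": 2, "status": "open"}]
delta = [{"id": 2, "status": "closed"}, {"id": 3, "status": "open"}]

result = upsert(current, delta)
for row in result:
    print(row)
```

Only the rows present in the new batch are touched, which is the behavior the question asks for in place of rebuilding the whole table.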
Q: Can you send us the slides of this presentation?
A: Yes, they are attached.
Q: What is the difference between SAS/ACCESS to Hadoop and SAS Data Loader for Hadoop?
A: Data Loader leverages the capabilities of the SAS/ACCESS to Hadoop solution, but does not require any SAS coding.
Q: How much will SAS Data Loader cost?
A: I do not know. Contact your SAS Account Rep for pricing.
Q: Will we get the slide deck?
A: Yes, attached.
Q: Can we use Data Loader on AWS?
A: Yes, Data Loader can be installed in AWS, just like SAS 9.4M4.
Want more tips? Be sure to subscribe to the Ask the Expert Community Library to receive follow-up Q&A, slides and recordings from other SAS Ask the Expert webinars. From the Ask the Expert Library, just click Subscribe in the orange bar underneath the list of recent articles.
NOTE: For best results when opening the attached slides, click on the “download” icon.