If you missed the Ask the Expert session on Preparing Data in Hadoop for Analysis and Reporting, you can still view it on demand at any time.
This session reviews how business users, data analysts and scientists can manage, manipulate and cleanse data stored in Hadoop using an intuitive browser interface without any specialized coding skills.
You’ll learn how to:
Here is a transcript of the Q&A segment held at the end of the session for ease of reference:
How is data preparation for analytics different from traditional data preparation methodologies?
Depending on the analytical method being used, you may need to transpose data into a one-row-per-subject table or join data into a one-to-many table.
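To make the one-row-per-subject idea concrete, here is a minimal Python sketch of the reshaping step. It only illustrates the concept and is not how SAS Data Loader for Hadoop performs it; the column names (subject_id, measure, value) and the sample rows are assumptions made up for this example.

```python
from collections import defaultdict

# Hypothetical long-format input: one row per (subject, measure) pair.
long_rows = [
    {"subject_id": 1, "measure": "height", "value": 180},
    {"subject_id": 1, "measure": "weight", "value": 75},
    {"subject_id": 2, "measure": "height", "value": 165},
]

def to_one_row_per_subject(rows):
    """Pivot long rows so that each subject ends up with exactly one row."""
    wide = defaultdict(dict)
    for row in rows:
        # Each distinct measure becomes a column on the subject's row.
        wide[row["subject_id"]][row["measure"]] = row["value"]
    # Flatten to a list of dicts, one per subject, ordered by subject id.
    return [{"subject_id": sid, **cols} for sid, cols in sorted(wide.items())]

wide_rows = to_one_row_per_subject(long_rows)
```

Subject 2 has no weight measurement in the sample, so its row simply lacks that column; a real preparation step would have to decide how to represent such missing values.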
Does SAS® Data Loader for Hadoop leverage the Hadoop Spark in-memory framework?
Yes, users can run data cleansing and transformation processes in Spark via Data Loader.
What programming skills are required to use SAS® Data Loader for Hadoop to prepare data for analytics and reporting?
None. Data Loader is a user-friendly, wizard-driven application that requires no coding skills.
Can I leverage existing SAS code and HiveQL code using SAS Data Loader for Hadoop?
Yes, using the Run a SAS program or Run a Hadoop SQL directive.
Can I load Hadoop data into the SAS® LASR Server to drive SAS® Visual Analytics and Statistics?
Absolutely. You can load data, in parallel, directly from Hadoop into SAS LASR.
Can I load relational database tables into HDFS using Hadoop SQOOP with SAS® Data Loader for Hadoop?
Yes, the Copy Data to Hadoop and Copy Data from Hadoop directives will leverage any database configured in SAS Metadata with a JDBC connection.
Can other SAS solutions, like SAS® Data Integration Studio, leverage the SAS® Data Loader for Hadoop directives in its ETL flows?
Yes. Directives can be saved into SAS folders in metadata and leveraged in SAS Data Integration Studio via several directive transformations.
Do you have a free version that we can use for learning?
No, there is no free version available that I am aware of. Contact your SAS account representative to see if they might be able to set something up for you.
How do I update or insert rows in Hadoop when new data comes in, without recreating the whole table?
Hive 0.14 supports update/insert, but you would need to write custom code and use the Run a SAS program or Run a Hadoop SQL program directive.
What is the difference between SAS/ACCESS to Hadoop and SAS Data Loader for Hadoop?
Data Loader leverages the capabilities of the SAS/ACCESS to Hadoop solution, but does not require any SAS coding.
How much will SAS Data Loader cost?
I do not know. Contact your SAS Account Rep for pricing.
Can we use Data Loader on AWS?
Yes, Data Loader can be installed in AWS, just like SAS 9.4M4.
Recommended Resources
Course: Introduction to SAS and Hadoop
Course: Working with SAS Data Loader for Hadoop
Course: Hadoop Data Management with Hive, Pig, and SAS
Want more tips? Be sure to subscribe to the Ask the Expert Community Library to receive follow-up Q&A, slides and recordings from other SAS Ask the Expert webinars. From the Ask the Expert Library, just click Subscribe in the orange bar underneath the list of recent articles.
NOTE: For best results when opening the attached slides, click on the “download” icon.
How do I convert a character column to a date or numeric column when the values look like 2019-11-05 (05-NOV-2019)? I have tried many of the tips on the website, but they do not work. There are 38,000 rows.
Juha Nyman
Hi Juha,
If you have a value like 2019-11-05 stored as a character data type in a Hive table in Hadoop, you can use the Transform Data directive in SAS Data Loader for Hadoop. In the Manage Columns section of that directive, add a new column, assign it a Type of DATE and add the following expression:
to_date(inputn(datevar, 'yymmdd10.'))
The INPUTN function converts character values to numeric values using a numeric informat. The yymmdd10. informat reads a character date of the form 2019-11-05 into a SAS numeric date. SAS dates are stored as numeric doubles that count days from January 1, 1960. The to_date function then converts the SAS numeric date to the ANSI DATE data type that is used in Hive tables.
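For readers who want to see the two conversion steps spelled out, the following Python sketch mirrors them conceptually. It is not SAS or Hive code: the function names char_to_sas_date and sas_date_to_date are hypothetical, and the parser only handles the hyphenated YYYY-MM-DD form from the question, which is narrower than what the yymmdd10. informat accepts.

```python
from datetime import date, datetime, timedelta

# SAS date values count days from January 1, 1960.
SAS_EPOCH = date(1960, 1, 1)

def char_to_sas_date(text):
    """Roughly what INPUTN(text, 'yymmdd10.') yields: a day count since 1960-01-01."""
    parsed = datetime.strptime(text, "%Y-%m-%d").date()
    return (parsed - SAS_EPOCH).days

def sas_date_to_date(days):
    """Roughly what the outer to_date step does: day count back to a calendar date."""
    return SAS_EPOCH + timedelta(days=days)

sas_value = char_to_sas_date("2019-11-05")    # 21858
calendar_date = sas_date_to_date(sas_value)   # date(2019, 11, 5)
```

The intermediate number is what makes the nested expression necessary: the character value first becomes a SAS numeric date, and only then can it be turned into a DATE column.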