What’s new in SAS Data Loader for Hadoop 2.4

SAS Data Loader for Hadoop 2.4, generally available on Monday, January 11, 2016, includes features that seek to achieve three goals:

Speed up data management processes with Spark
Improve productivity of data management professionals
Manage data where it lives

Below is a summary of what’s new in the 2.4 release. For more details, please see the SAS Data Loader for Hadoop 2.4 User’s Guide.

Speed up data management processes with Spark

Improved performance using Spark and Impala - New support for Spark brings massively parallel in-memory processing to the following directives: Cleanse Data, Transform Data, and Cluster-Survive. Impala can now be used in the following directives: Query or Join, Sort and De-Duplicate, and Run a Hadoop SQL Program (formerly called “Run a Hive Program”).
Increased performance of profiling jobs

Improve productivity of data management professionals

Improved syntax editing
Chain directives - Create a data flow from two or more saved directives, which can be executed serially or in parallel.
New “Match-Merge” directive - Use the new “Match-Merge” directive to append columns from multiple source tables into a single target table. Column data values can also be updated when rows match in two or more source tables.
New “Cluster-Survive” directive - The new “Cluster-Survive” directive leverages user-defined rules to create clusters of similar records. Additional user-defined rules can be created to construct a survivor record that will replace the cluster of rows in the target table.
New “Delete Rows” directive
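
To make the Cluster-Survive idea above more concrete, here is a minimal Python sketch of the general technique: group similar records into clusters with one rule, then build a single survivor record per cluster with a second rule. This only illustrates the concept and is not how SAS Data Loader implements the directive. The field names, the clustering rule (same normalized name and ZIP code), and the survivorship rule (newest non-empty value wins) are all assumptions made for the example.

```python
from collections import defaultdict


def cluster_key(record):
    # Hypothetical clustering rule: records with the same normalized
    # name and the same ZIP code are treated as the same entity.
    return (record["name"].strip().lower(), record["zip"])


def build_survivor(cluster):
    # Hypothetical survivorship rule: for each field, keep the value from
    # the most recently updated record that has a non-empty value.
    ordered = sorted(cluster, key=lambda r: r["updated"], reverse=True)
    survivor = {}
    for field in ordered[0]:
        survivor[field] = next(
            (r[field] for r in ordered if r[field] not in (None, "")), None
        )
    return survivor


def cluster_survive(records):
    # Group records into clusters, then replace each cluster with one survivor.
    clusters = defaultdict(list)
    for record in records:
        clusters[cluster_key(record)].append(record)
    return [build_survivor(cluster) for cluster in clusters.values()]


# Made-up sample rows: the first two describe the same person.
records = [
    {"name": "Ann Lee", "zip": "27513", "phone": "", "updated": "2015-03-01"},
    {"name": "ann lee ", "zip": "27513", "phone": "555-0100", "updated": "2014-11-20"},
    {"name": "Bob Ray", "zip": "10001", "phone": "555-0199", "updated": "2015-06-30"},
]
survivors = cluster_survive(records)
```

In this sketch the two "Ann Lee" rows collapse into one survivor that keeps the newer name and date but takes the phone number from the older row, because the newer row's phone value is empty.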

Manage data where it lives

Added support for IBM BigInsights and Pivotal HD
Expanded hypervisor support to include VirtualBox and VMware
Schedule jobs using a REST API - A REST API can now be used to schedule and execute saved directives. The API can also return a job’s state, results, log file, or error messages, and it can cancel running jobs and delete job information.
Apply and reload Hadoop configuration changes
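
As a rough illustration of the REST operations listed above (run a saved directive, check job state, fetch the log, cancel, delete), the Python sketch below builds one HTTP request per operation. The base URL, the paths, and the HTTP methods are hypothetical placeholders chosen for the example, not the documented SAS Data Loader API. The real endpoints are described in the User’s Guide.

```python
import urllib.request

# Hypothetical base URL; replace with the address of your own installation.
BASE_URL = "http://dataloader.example.com/api"


def build_request(action, directive_id=None, job_id=None):
    # Hypothetical routes: each operation named in the release notes is
    # mapped to an assumed HTTP method and path.
    routes = {
        "run": ("POST", f"/directives/{directive_id}/jobs"),
        "state": ("GET", f"/jobs/{job_id}/state"),
        "log": ("GET", f"/jobs/{job_id}/log"),
        "cancel": ("POST", f"/jobs/{job_id}/cancel"),
        "delete": ("DELETE", f"/jobs/{job_id}"),
    }
    method, path = routes[action]
    # The request is only constructed here; nothing is sent over the network.
    return urllib.request.Request(
        BASE_URL + path, method=method, headers={"Accept": "application/json"}
    )


state_request = build_request("state", job_id="42")
cancel_request = build_request("cancel", job_id="42")
```

Passing one of these objects to `urllib.request.urlopen` would send it. The sketch stops short of that because the endpoints are placeholders.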

New trial version

Download a free trial version of SAS Data Loader for Hadoop and install it on a production Hadoop cluster. The trial can be converted to a production license without reinstalling the software.

Jakub_Chovanec · 02-01-2016

Hi,

When I download the DL trial, I get the old version, DL 2.2. Am I doing something wrong? Is it possible to get things working by downloading DL 2.4 vApp Cloudera and Cloudera Quickstart VM 5.3?