
New Directives for SAS Data Loader 2.4 for Hadoop

by SAS Employee SusanJ516_sas on 02-16-2016 10:41 AM


SAS Data Loader 2.4 includes new directives to help you work more efficiently:

  • Chain Directives - runs two or more saved directives in series or in parallel. One chain directive can contain another: a serial chain can execute a parallel chain, and a parallel chain can execute a serial chain. An individual directive can appear more than once in a serial chain. Results for each directive in a chain can be viewed as soon as they become available.
  • Cluster-Survive Data directive - uses rules to create clusters of similar rows. Additional rules can be used to construct a survivor row that replaces the cluster of rows in the target. The survivor row combines the best values in the cluster. This directive requires the Apache Spark run-time environment.
  • Match-Merge Data directive - combines columns from multiple source tables into a single target table. You can also merge data in specified columns when rows match in two or more source tables. Columns can match across numeric or character data types.
  • Run a Hadoop SQL Program directive - replaces the former Run a Hive Program directive. The new directive can use either Impala SQL or HiveQL. In the directive, a Resources box lists the functions that are available in the selected SQL environment. Selecting a function displays its syntax help, and a click inserts the function into the SQL program.
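To make the Cluster-Survive idea concrete, here is a minimal conceptual sketch in Python, not the SAS implementation. It assumes an illustrative match rule (exact match on a normalized name) and an illustrative survivorship rule (keep the longest non-empty value per column); in the directive itself you would define your own rules, and the work runs in Spark rather than locally.

```python
# Conceptual sketch of cluster-survive logic (not the SAS implementation).
# Step 1: group similar rows into clusters using a match rule.
# Step 2: build one survivor row per cluster, keeping the "best"
#         value in each column according to a survivorship rule.
from collections import defaultdict

rows = [
    {"name": "J. Smith",     "phone": "",             "city": "Raleigh"},
    {"name": "J. Smith",     "phone": "919-555-0100", "city": ""},
    {"name": "Ada Lovelace", "phone": "555-0101",     "city": "London"},
]

def cluster_key(row):
    # Illustrative match rule: case-insensitive name equality.
    return row["name"].lower().strip()

def best(values):
    # Illustrative survivorship rule: longest non-empty value wins.
    return max(values, key=len)

clusters = defaultdict(list)
for row in rows:
    clusters[cluster_key(row)].append(row)

# One survivor row replaces each cluster in the target.
survivors = [
    {col: best([r[col] for r in members]) for col in members[0]}
    for members in clusters.values()
]
```

Here the two "J. Smith" rows collapse into a single survivor that takes the phone number from one source row and the city from the other, which is the sense in which the survivor "combines the best values in the cluster."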
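The Match-Merge behavior can be sketched the same way. This is a conceptual example, not the SAS implementation: the table names, the cust_id key, and the reading of "columns can match across numeric or character data types" as a cross-type key match are all illustrative assumptions.

```python
# Conceptual sketch of match-merge behavior (not the SAS implementation):
# combine columns from two source tables into one target table, merging
# column values where rows match on a key. The key is numeric in one
# table and character in the other; normalizing both sides to strings
# shows how a match can cross the two data types.
orders = [{"cust_id": 101, "amount": 250.0},
          {"cust_id": 102, "amount": 75.5}]
names  = [{"cust_id": "101", "name": "Acme Corp"}]

names_by_id = {str(r["cust_id"]): r for r in names}

target = []
for order in orders:
    key = str(order["cust_id"])          # cross-type key match
    merged = dict(order)                 # columns from the first source
    merged["name"] = names_by_id.get(key, {}).get("name")
    target.append(merged)
```

Rows with no match in the second source still reach the target, with the merged column left empty, much as in an outer join.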

See the SAS Data Loader 2.4 for Hadoop: User's Guide for additional information and examples.