About DWarner

DWarner · ‎09-01-2016

SAS stores calendar dates as numeric values: a date value is the number of days since January 1, 1960. For example, the value for January 1, 1960 is 0, and the value for January 1, 1961 is 366 (1960 was a leap year). SAS converts between date values and calendar dates with informats and formats. If you run PROC CONTENTS on your data set, you will probably see that the variable is numeric with an informat applied to it, so you should be able to perform calculations on the column.
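As a rough illustration of the arithmetic only (this is Python, not SAS code, and the helper names `to_sas_date` and `from_sas_date` are made up for this sketch), the day count described above can be reproduced like this:

```python
from datetime import date, timedelta

# January 1, 1960 corresponds to SAS date value 0.
SAS_EPOCH = date(1960, 1, 1)

def to_sas_date(d: date) -> int:
    """Return the number of days between January 1, 1960 and d."""
    return (d - SAS_EPOCH).days

def from_sas_date(value: int) -> date:
    """Return the calendar date for a day count relative to January 1, 1960."""
    return SAS_EPOCH + timedelta(days=value)

print(to_sas_date(date(1960, 1, 1)))  # 0
print(to_sas_date(date(1961, 1, 1)))  # 366, because 1960 was a leap year
print(from_sas_date(366))             # 1961-01-01
```

Because the stored value is just a number, subtracting two date values gives the number of days between them.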

DWarner · ‎04-22-2016

My apologies. You specifically said that you were running SAS 9.4 maintenance 2. WHERE processing optimization using MapReduce is available in maintenance 2. However, maintenance 3 expanded optimized WHERE processing to include more operators and compound expressions.

DWarner · ‎04-22-2016

For the SAS 9.4 SPD Engine (maintenance 3), to optimize WHERE processing, you can request that data subsetting be performed in the Hadoop cluster, which might improve performance by taking advantage of the filtering and ordering capabilities of MapReduce. See this documentation: https://support.sas.com/documentation/cdl/en/engspdehdfsug/67948/HTML/default/viewer.htm#n1bihxl1et1q3tn168txx8suqd4a.htm

Regarding I/O operation performance, consider setting a different SPD Engine I/O block size. See this brief topic with links to the IOBLOCKSIZE= options: https://support.sas.com/documentation/cdl/en/engspdehdfsug/67948/HTML/default/viewer.htm#p0qrdzr7ag5221n15n9d1qdb5f8k.htm

For the complete documentation for using the SPD Engine to store data in a Hadoop cluster through HDFS, see: https://support.sas.com/documentation/cdl/en/engspdehdfsug/67948/HTML/default/viewer.htm#titlepage.htm

DWarner · ‎02-28-2016

I had to run this question through development, and you are absolutely correct: the documentation is wrong. Thank you for pointing out the error. I will have the documentation corrected.

DWarner · ‎11-19-2015

With just Base SAS, you can use the FILENAME statement to access data in HDFS, PROC HADOOP to interact with Hadoop data by running Apache Hadoop code, and the SPD Engine to write data, retrieve data, perform administrative functions, and even update data in HDFS.

The version of SAS 9.4 determines which Hadoop distributions are supported. This site lists the SAS 9.4 supported Hadoop distributions for several SAS products, offerings, and technologies, including the Base SAS FILENAME statement, PROC HADOOP, and the SPD Engine: SAS 9.4 Supported Hadoop Distributions

If you would like overview information for SAS and Hadoop technologies, use this document. Each overview tells you what the product is, what's required, and where to go for more detailed documentation: SAS and Hadoop Technology: Overview

DWarner · ‎11-17-2015

In the third maintenance release for SAS 9.4, WHERE processing optimization is expanded. Using the Base SAS SPD Engine with Hadoop, you can request that data subsetting be performed in the Hadoop cluster, which takes advantage of the filtering and ordering capabilities of the MapReduce framework. As a result, only the subset of data is returned to the SAS client.

By default, data subsetting is performed by the SPD Engine on the SAS client. To request that data subsetting be performed in the Hadoop cluster, you must specify the ACCELWHERE= LIBNAME statement option or the ACCELWHERE= data set option.

WHERE processing optimization supports the following syntax:

- comparison operators such as EQ (=), NE (^=), GT (>), LT (<), GE (>=), LE (<=)
- IN operator
- full bounded range condition, such as where 500 <= empnum <= 1000;
- BETWEEN-AND operator, such as where empnum between 500 and 1000;
- compound expressions using the logical operators AND, OR, and NOT, such as where skill = 'java' or years = 4;
- parentheses to control the order of evaluation, such as where (product='GRAPH' or product='STAT') and country='Canada';

For the complete documentation about WHERE processing optimization and the data set and SAS code requirements, see WHERE Processing Optimization with MapReduce.
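To show why the parentheses in the last example matter, here is a small Python sketch (not SAS code; the sample rows and function names are invented for illustration). As in a SAS WHERE expression, AND is evaluated before OR, so dropping the parentheses changes which rows are selected:

```python
# Hypothetical sample rows; only the two columns used by the condition.
rows = [
    {"product": "GRAPH", "country": "Canada"},
    {"product": "GRAPH", "country": "Mexico"},
    {"product": "STAT", "country": "Canada"},
    {"product": "ETS", "country": "Canada"},
]

def with_parentheses(r):
    # (product='GRAPH' or product='STAT') and country='Canada'
    return (r["product"] == "GRAPH" or r["product"] == "STAT") and r["country"] == "Canada"

def without_parentheses(r):
    # product='GRAPH' or product='STAT' and country='Canada'
    # AND binds more tightly than OR, so this means:
    # product='GRAPH' or (product='STAT' and country='Canada')
    return r["product"] == "GRAPH" or r["product"] == "STAT" and r["country"] == "Canada"

subset_a = [r for r in rows if with_parentheses(r)]
subset_b = [r for r in rows if without_parentheses(r)]

print(len(subset_a))  # 2: GRAPH/Canada and STAT/Canada
print(len(subset_b))  # 3: the GRAPH/Mexico row is also selected
```

The sketch only demonstrates the logic of the condition; it says nothing about where the subsetting runs, which is what ACCELWHERE= controls.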

DWarner · ‎07-23-2015

Chris, I apologize for the delay in getting answers to your questions. Hopefully, I can clear up a few things here as well as do some rework in the "Understanding SAS Indexes" documentation.

Regarding the SQL view: Only a view created by SQL can be used to optimize an SQL WHERE clause. The example you provided does not seem to contradict this.

We will address the wording issues in the chart that lists the WHERE conditions that can be optimized. We also need to clarify a misunderstanding of compound optimization, which involves using more than one of the columns indexed in a composite index. All of the examples use only the first column in a composite index, which is not compound optimization.

I will also change the heading of this article, as well as the documentation, to include the BASE (V9) engine. Thank you for your input and questions.

DWarner · ‎07-15-2015

An index is an optional file that you can create for a SAS data set using the BASE (V9) engine in order to provide direct access to specific observations. The index stores values in ascending order for a specific variable or variables and records where those values are located within the observations in the data file. In other words, an index enables you to locate an observation by value.

For a complete description of SAS indexes (including benefits, types, guidance on deciding whether to create an index, and information about how to create and use indexes), see this article in the SAS 9.4 Language Reference: Concepts, Fifth Edition: https://support.sas.com/documentation/cdl/en/lrcon/68089/HTML/default/viewer.htm#n06cy7dznbx6gen1q9mat8de6rdq.htm
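The idea of locating an observation by value can be sketched conceptually in Python. This is only an illustration of the principle described above, not the actual structure of a SAS index file; the function names, the sample data, and the `empnum` column are assumptions made for the example:

```python
from bisect import bisect_left, bisect_right

def build_index(rows, key):
    """Pair each value with its observation number and sort by value (ascending)."""
    pairs = sorted((row[key], obs) for obs, row in enumerate(rows, start=1))
    values = [value for value, _ in pairs]
    locations = [obs for _, obs in pairs]
    return values, locations

def lookup(index, value):
    """Return the observation numbers that hold the given value, via binary search."""
    values, locations = index
    lo = bisect_left(values, value)
    hi = bisect_right(values, value)
    return locations[lo:hi]

# Hypothetical data file with four observations.
rows = [{"empnum": 730}, {"empnum": 512}, {"empnum": 990}, {"empnum": 512}]
index = build_index(rows, "empnum")

print(lookup(index, 512))  # [2, 4]
print(lookup(index, 600))  # []
```

Because the values are kept in sorted order, a lookup can jump to the matching entries instead of reading every observation in sequence.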

Online Status	Offline
Date Last Visited	‎11-18-2016 04:23 PM

Re: How to convert date to numeric

Re: SPDE on HDFS vs Hive: Performance.

Re: SPDE on HDFS vs Hive: Performance.

Re: Understanding SAS Indexes for the BASE (V9) Engine

Re: minimum requirement to run sas with hadoop

Re: SPD Engine - Hadoop

Re: Understanding SAS Indexes for the BASE (V9) Engine

Understanding SAS Indexes for the BASE (V9) Engine

Understanding SAS Indexes for the BASE (V9) Engine

Understanding SAS Indexes for the BASE (V9) Engine

Re: Understanding SAS Indexes for the BASE (V9) Engine
