Greetings! First time caller, long time listener. I was hoping you all could help me with some field length issues we're having and, as a follow-on, help determine why certain table properties won't pass on implicit SQL but do on explicit SQL.

My company uses Cloudera/Impala (ODBC version 2.5.29, ImpalaD 2.1.3-cdh5.3) to access Hadoop via SAS/ACCESS. We're currently running SAS 9.4 with the STAT 13.2 and ETS 13.2 analytical packs.

Consider a Hadoop view (VIEW_A) that has a field (FIELD_Z) defined as VARCHAR(10). I run the following code:

LIBNAME IMP SASIOIMP SERVER="server" PORT=21050 USER=&HAD_USER PASSWORD=&HAD_PW DATABASE=DEV_SL;

68   PROC SQL;
69   CREATE TABLE ORDER_HDR_IMP_1 AS
70   SELECT
71   FIELD_Z
72   FROM VIEW_A;
73   QUIT;
NOTE: Compressing data set WORK.ORDER_HDR_IMP_1 decreased size by 99.78 percent.
      Compressed is 47 pages; un-compressed would require 21144 pages.
NOTE: Table WORK.ORDER_HDR_IMP_1 created, with 148002 rows and 1 columns.
89   QUIT;
NOTE: PROCEDURE SQL used (Total process time):
      real time    11.09 seconds
      cpu time     8.48 seconds

When I check the length of FIELD_Z, it is $32767...even though the view is defined as VARCHAR(10). Now, I try the explicit SQL method:

90   PROC SQL;
91   CONNECT TO SASIOIMP (DATABASE=DEV_SL USER=&HAD_USER PASSWORD="&HAD_PW"
91 ! SERVER="server" SQL_FUNCTIONS=ALL);
92   CREATE TABLE ORDER_HDR_EXP_2 AS
93   SELECT * FROM CONNECTION TO SASIOIMP
94   (SELECT FIELD_Z
95   FROM VIEW_A);
96   QUIT;
NOTE: Compression was disabled for data set WORK.ORDER_HDR_EXP_2 because compression overhead would increase the size of the data set.
IMPALA_8: Executed: on connection 2
Prepared statement IMPALA_7
NOTE: Table WORK.ORDER_HDR_EXP_2 created, with 148002 rows and 1 columns.
114  QUIT;
NOTE: PROCEDURE SQL used (Total process time):
      real time    1.29 seconds
      cpu time     0.24 seconds

The above results in the proper length for FIELD_Z of $10. Strange.
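To frame question 3 below: the one LIBNAME-level mitigation we've come across so far (untested on our side, so treat this as a sketch based on our reading of the SAS/ACCESS docs) is capping string lengths with the DBMAX_TEXT= option:

```sas
/* Sketch of a LIBNAME-level cap on string lengths (values are our own;
   untested). DBMAX_TEXT= limits how many characters SAS reserves for
   Hadoop/Impala string columns that arrive without a usable length.
   Caveat: it applies to EVERY affected character column on the libref,
   so 10 is only safe if no string field is ever longer than that. */
LIBNAME IMP SASIOIMP SERVER="server" PORT=21050
        USER=&HAD_USER PASSWORD=&HAD_PW DATABASE=DEV_SL
        DBMAX_TEXT=10;
```

It's a blunt instrument compared to per-column metadata, which is part of why I'm asking about SASFMT.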
I can note that the explicit query returned the exact same records, but ran in roughly 90% less time and took up 99.99% less space (before compression).

We are currently not using the SASFMT metadata option, as our IT department is a little.....hesitant.....to add this metadata tag, and I have not been able to find much on the web regarding its usage. Our Hadoop install is pretty new (traditionally a DB2 shop, Teradata now as well), so we're still working out the kinks. Right now we've got a handful of tables, the largest having about 17M records and 25 columns, relatively small compared to our DB2/Teradata tables. We use ETL to load the Hadoop tables and SAS to perform analytics. However, pulling data from these tables is extremely inefficient due to these field length issues.

So, a few questions:

1. Why the difference in dataset field length between implicit SQL and explicit SQL?
2. Is anybody currently using the SASFMT metadata option? If so, any concerns/issues? Is it meeting your requirements?
3. For those not using SASFMT, how do you get around the $32767 default length for STRING fields?

Thanks in advance, I appreciate the help! Feel free to ask for any additional information and I'm happy to provide whatever I can.

Matt
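For context on question 2, here is my understanding of how SASFMT would be applied, pieced together from the SAS/ACCESS to Hadoop documentation. Treat the exact property syntax as an assumption on Impala, and note that the table name TABLE_A is hypothetical (my understanding is the property goes on a base table, not a view):

```sas
/* Sketch (unverified): attach a SASFMT table property so that implicit
   SQL can pick up a real length for FIELD_Z instead of defaulting to
   $32767. Issued through explicit pass-through so Impala runs the DDL. */
PROC SQL;
  CONNECT TO SASIOIMP (DATABASE=DEV_SL USER=&HAD_USER
                       PASSWORD="&HAD_PW" SERVER="server");
  EXECUTE (ALTER TABLE TABLE_A
           SET TBLPROPERTIES ('SASFMT:FIELD_Z'='CHAR(10)'))
    BY SASIOIMP;
  DISCONNECT FROM SASIOIMP;
QUIT;
```

If anyone has actually run with this metadata in production, that's exactly the experience I'm hoping to hear about.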