Error running script in Hadoop (Apache)

I need to run a query on historical bank transactions in Hadoop. I usually query in Netezza.


I received the following error, and was wondering if anyone had a tip to offer.


  • ERROR: Data from column 'mrcnt_dba_nm' in row 303856439 of the result set was not presented in Hadoop UTF-8 format. The length of
    this data is 25 bytes, and the first 3 characters are '011'. Adding -JREOPTIONS (-Dfile.encoding=UTF-8) to the SAS
    invocation may circumvent the issue. Otherwise the data should be corrected to UTF-8 format.
    ERROR: PROC SQL runtime error for operation=sqxsrc.
    ERROR: An error has occurred.
  • NOTE: PROC SQL set option NOEXEC and will continue to check the syntax of statements.
    ERROR: java.sql.SQLException: org.apache.thrift.transport.TTransportException: org.apache.http.NoHttpResponseException: failed to respond
    NOTE: The SAS System stopped processing this step because of errors.


NOTE: Updated analytical products:

SAS/ETS 14.1
SAS/OR 14.1
SAS/IML 14.1
SAS/QC 14.1

NOTE: Additional host information:

IBM AIX AIX 64 1 7 00C030A74C00


Thank you

Re: Error running script in Hadoop (Apache)

There seems to be corrupt/invalid data in the file.

Does the query work outside of SAS? Is the data in text files? Are the text files UTF-8? Can they be converted to UTF-8? Can you run a validation?
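
For the validation step, here is a minimal sketch in Python (not SAS). It assumes the underlying data is stored as plain text files that you can copy somewhere Python can read them. The function names and the sample rows are hypothetical, made up for illustration only. It reads each line as raw bytes and reports the ones that fail strict UTF-8 decoding.

```python
from typing import Iterable, Iterator, List, Tuple


def find_invalid_utf8(lines: Iterable[bytes]) -> Iterator[Tuple[int, bytes]]:
    """Yield (1-based line number, raw bytes) for each line that is not valid UTF-8."""
    for line_no, raw in enumerate(lines, start=1):
        try:
            raw.decode("utf-8")  # strict decoding raises on any invalid byte sequence
        except UnicodeDecodeError:
            yield line_no, raw


def scan_file(path: str) -> List[Tuple[int, bytes]]:
    """Scan a file in binary mode and collect the lines that fail UTF-8 decoding."""
    with open(path, "rb") as handle:  # binary mode, so Python does not decode for us
        return list(find_invalid_utf8(handle))


if __name__ == "__main__":
    # Hypothetical sample rows: the second one holds a Latin-1 encoded 'é' (byte 0xE9),
    # the third one holds the same character correctly encoded as UTF-8 (0xC3 0xA9).
    sample = [b"1,ACME STORE\n", b"2,CAF\xe9 CENTRAL\n", b"3,CAF\xc3\xa9 OK\n"]
    for line_no, raw in find_invalid_utf8(sample):
        print(line_no, raw)
```

If this flags rows in the same column the SAS log complains about, that would support the theory that the source data is in a different encoding (Latin-1, for example) rather than UTF-8.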

Re: Error running script in Hadoop (Apache)

Thanks for your reply. I ended up running the query but excluding one variable. The output was good, so I didn't look further into this. A colleague later gave me the following tip, which I'm listing here in case it can help anyone:

Please add this option to the LIBNAME statement


