Architecting, installing and maintaining your SAS environment

SAS Hadoop HDP Sandbox issues


Hi

I am trying to validate the SAS/ACCESS Interface to Hadoop configuration and, in general, to get things working. I'm not very experienced, and the error messages have finally become cryptic enough that I can't figure them out on my own. I tried using the SAS Deployment Manager to collect the JAR and XML files, but there were no Hadoop-related options when I ran it, so I came here for help. I read the other threads about Hadoop issues, but nothing there worked for me, so I'm posting a new thread in the hope that someone can help me out.

Our setup: SAS 9.4M3 is installed on a Windows 7 VM running on a Mac OS X El Capitan host. The same physical host is running the Hortonworks HDP 2.3 Sandbox VM.

Here's the log for the SAS/ACCESS method:

 

73   option set=SAS_HADOOP_CONFIG_PATH="C:\hadoop\conf";
74   option set=SAS_HADOOP_JAR_PATH="C:\hadoop\lib";
75
76
77   libname hdplib hadoop server="sandbox.hortonworks.com" user="root" password=XXXXXXXX
77 ! port=10000 subprotocol=hive2;
NOTE: Libref HDPLIB was successfully assigned as follows:
      Engine:        HADOOP
      Physical Name: jdbc:hive2://sandbox.hortonworks.com:10000/default
78
79   data hdplib.class1 ;
80   set sashelp.class(obs=10);
81   run;

ERROR: java.lang.NoClassDefFoundError: org/apache/htrace/SamplerBuilder
NOTE: Validate the contents of the Hadoop configuration file and ensure user permissions are
      correct.
ERROR: Unable to create stream service from /tmp/sasdata-2015-11-20-15-34-07-804-e-00001.dlv.
       Use the debug option for more information.
ERROR: Unable to create stream service from /tmp/sasdata-2015-11-20-15-34-07-804-e-00001.dlv.
       Use the debug option for more information.
NOTE: The DATA step has been abnormally terminated.
NOTE: The SAS System stopped processing this step because of errors.
NOTE: There were 1 observations read from the data set SASHELP.CLASS.
WARNING: The data set HDPLIB.CLASS1 may be incomplete.  When this step was stopped there were 0
         observations and 5 variables.
ERROR: org.apache.hive.service.cli.HiveSQLException: Error while compiling statement: FAILED:
       SemanticException Line 1:17 Invalid path
       ''/tmp/sasdata-2015-11-20-15-34-07-804-e-00001.dlv'': No files matching path
       hdfs://sandbox.hortonworks.com:8020/tmp/sasdata-2015-11-20-15-34-07-804-e-00001.dlv
ERROR: Unable to execute Hadoop query.
ERROR: Execute error.
ERROR: Error moving data from Hadoop to Hive (LOAD DATA failed).
NOTE: DATA statement used (Total process time):
      real time           5.81 seconds
      cpu time            0.15 seconds

82
83
84   data work.geoloc (where= (event ^= "normal"));
85   set hdplib.geolocation (firstobs = 2);
86   run;

NOTE: There were 8011 observations read from the data set HDPLIB.GEOLOCATION.
WARNING: SAS/ACCESS assigned these columns a length of 32767. If resulting SAS character
         variables remain this length, SAS performance is impacted. See SAS/ACCESS
         documentation for details. Columns followed by the maximum length observed were:
         truckid:7, driverid:11, event:25, city:15, state:10
NOTE: The data set WORK.GEOLOC has 458 observations and 10 variables.
NOTE: DATA statement used (Total process time):
      real time           14.35 seconds
      cpu time            8.43 seconds

 

The second DATA step is just to confirm that SAS can read those files. If I don't skip the header row with firstobs=2, though, the data gets all messed up and values end up in the wrong columns; I'm not at all sure why that is.
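One thing I plan to check (just a sketch, assuming explicit SQL pass-through works against the sandbox and that SHOW TBLPROPERTIES returns a result set over Hive JDBC; geolocation is the table from the HDP tutorial data) is whether the Hive table was defined with skip.header.line.count, since its absence would explain the header row coming through as data:

proc sql;
   /* Inspect the Hive table's properties via explicit pass-through.       */
   /* If "skip.header.line.count"="1" is missing, the CSV header line is   */
   /* returned as an ordinary data row, which would match what I'm seeing. */
   connect to hadoop (server="sandbox.hortonworks.com" user="root"
                      password=XXXXXXXX subprotocol=hive2);
   select * from connection to hadoop
      (show tblproperties geolocation);
   disconnect from hadoop;
quit;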

 

Here's the log for the PROC HADOOP and FILENAME hadoop methods:

108
109
110  options set=SAS_HADOOP_CONFIG_PATH="C:\hadoop\conf";
111  options set=SAS_HADOOP_JAR_PATH="C:\hadoop\lib";
112
113  proc hadoop username='root' password=XXXXXXXX verbose cfg="C:\hadoop\conf\core-site.xml"
114       ;
115     hdfs mkdir='/user/guest/new_directory';
ERROR: java.lang.NoClassDefFoundError: org/apache/htrace/SamplerBuilder
ERROR:  at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:635)
ERROR:  at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:619)
ERROR:  at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:149)
ERROR:  at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2653)
ERROR:  at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:368)
ERROR:  at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:170)
ERROR:  at com.dataflux.hadoop.DFConfiguration.getFileSystem(DFConfiguration.java:458)
ERROR:  at com.dataflux.hadoop.DFHDFS$16.run(DFHDFS.java:483)
ERROR:  at com.dataflux.hadoop.DFHDFS$16.run(DFHDFS.java:479)
ERROR:  at java.security.AccessController.doPrivileged(Native Method)
ERROR:  at javax.security.auth.Subject.doAs(Subject.java:415)
ERROR:  at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
ERROR:  at com.dataflux.hadoop.DFHDFS.mkdir(DFHDFS.java:479)
116     hdfs copytolocal='/user/guest/Batting.csv'
117          out='C:\hadoop\test.csv' overwrite;
118  run;
NOTE: The SAS System stopped processing this step because of errors.
NOTE: PROCEDURE HADOOP used (Total process time):
      real time           0.17 seconds
      cpu time            0.03 seconds
119
120
121   filename out hadoop "/tmp/sas-test-file" dir
122    user="root" pass=XXXXXXXX;

123    data _null_;
124    file out;
125    put "here is a line in myfile";
126    run;

ERROR: java.lang.NullPointerException
ERROR:  at java.lang.ProcessBuilder.start(ProcessBuilder.java:1010)
ERROR:  at org.apache.hadoop.util.Shell.runCommand(Shell.java:483)
ERROR:  at org.apache.hadoop.util.Shell.run(Shell.java:456)
ERROR:  at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:722)
ERROR:  at org.apache.hadoop.util.Shell.execCommand(Shell.java:815)
ERROR:  at org.apache.hadoop.util.Shell.execCommand(Shell.java:798)
ERROR:  at org.apache.hadoop.fs.RawLocalFileSystem.setPermission(RawLocalFileSystem.java:728)
ERROR:  at org.apache.hadoop.fs.RawLocalFileSystem$LocalFSFileOutputStream.<init>(RawLocalFileSystem.java:225)
ERROR:  at org.apache.hadoop.fs.RawLocalFileSystem$LocalFSFileOutputStream.<init>(RawLocalFileSystem.java:209)
ERROR:  at org.apache.hadoop.fs.RawLocalFileSystem.createOutputStreamWithMode(RawLocalFileSystem.java:305)
ERROR:  at org.apache.hadoop.fs.RawLocalFileSystem.create(RawLocalFileSystem.java:293)
ERROR:  at org.apache.hadoop.fs.RawLocalFileSystem.create(RawLocalFileSystem.java:326)
ERROR:  at org.apache.hadoop.fs.ChecksumFileSystem$ChecksumFSOutputSummer.<init>(ChecksumFileSystem.java:393)
ERROR:  at org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:456)
ERROR:  at org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:435)
ERROR:  at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:909)
ERROR:  at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:890)
ERROR:  at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:852)
ERROR:  at com.dataflux.hadoop.DFHDFS$6.run(DFHDFS.java:226)
ERROR:  at com.dataflux.hadoop.DFHDFS$6.run(DFHDFS.java:222)
ERROR:  at java.security.AccessController.doPrivileged(Native Method)
ERROR:  at javax.security.auth.Subject.doAs(Subject.java:415)
ERROR:  at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
ERROR:  at com.dataflux.hadoop.DFHDFS.createFile(DFHDFS.java:222)
NOTE: Validate the contents of the Hadoop configuration file and ensure user permissions are
      correct.
ERROR: Unable to create stream service from /tmp/sas-test-file/. Use the debug option for more
       information.
NOTE: The SAS System stopped processing this step because of errors.
NOTE: DATA statement used (Total process time):
      real time           0.12 seconds
      cpu time            0.07 seconds

 

The picklist output is: 

 

87   proc javainfo picklist 'hadoop/hdoopsasjars.txt';
88   run;

Picklist URLs:
file:/C:/Program%20Files/SASHome/SASVersionedJarRepository/eclipse/plugins/sas.hadoop.hivehelper_904101.1.0.20140828120000_v940m1f/sas.hadoop.hivehelper.jar
file:/C:/Program%20Files/SASHome/SASVersionedJarRepository/eclipse/plugins/Log4J_1.2.15.0_SAS_20121211183158/log4j.jar
file:/C:/Program%20Files/SASHome/SASVersionedJarRepository/eclipse/plugins/commons_beanutils_1.8.2.0_SAS_20121211183319/commons-beanutils.jar
file:/C:/Program%20Files/SASHome/SASVersionedJarRepository/eclipse/plugins/commons_collections_3.2.1.0_SAS_20121211183225/commons-collections.jar
file:/C:/Program%20Files/SASHome/SASVersionedJarRepository/eclipse/plugins/commons_logging_1.1.1.0_SAS_20121211183202/commons-logging.jar
file:/C:/Program%20Files/SASHome/SASVersionedJarRepository/eclipse/plugins/jackson_1.9.7.0_SAS_20121211183158/jackson.jar
file:/C:/Program%20Files/SASHome/SASVersionedJarRepository/eclipse/plugins/slf4j_1.5.10.0_SAS_20121211183229/slf4j-api.jar
file:/C:/Program%20Files/SASHome/SASVersionedJarRepository/eclipse/plugins/slf4j_1.5.10.0_SAS_20121211183229/slf4j-log4j12.jar
Total URLs: 8

 

The SAS_HADOOP_JAR_PATH is C:\hadoop\lib on the Win7 machine. 

The JARs I have there are:

automaton-1.11-8
guava-11.0.2
hadoop-auth-2.7.1.2.3.0.0-2557
hadoop-common-2.7.1.2.3.0.0-2557
hadoop-hdfs-2.7.1.2.3.0.0-2557
hive-exec-1.2.1.2.3.0.0-2557
hive-jdbc-1.2.1.2.3.0.0-2557
hive-metastore-1.2.1.2.3.0.0-2557
hive-service-1.2.1.2.3.0.0-2557
httpclient-4.4
httpcore-4.4
jline-2.12
libfb303-0.9.2
pig-0.15.0.2.3.0.0-2557-withouthadoop-h2
protobuf-java-2.5.0

 

SAS_HADOOP_CONFIG_PATH is set to C:\hadoop\conf, which contains core-site.xml, hdfs-site.xml, hive-site.xml, mapred-site.xml, and yarn-site.xml.
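As a quick sanity check (a sketch; %sysget reads environment variables, including ones set with OPTIONS SET=), I can echo both paths to the log to confirm the session actually sees them:

/* Echo the two environment options to the SAS log */
%put JAR path:  %sysget(SAS_HADOOP_JAR_PATH);
%put Conf path: %sysget(SAS_HADOOP_CONFIG_PATH);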

 

My gut feeling is that something is wrong with my JAR files and/or the path to them, and possibly with the root user apparently not having root-level access on the sandbox, but I'm honestly not sure.
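To rule out a missing JAR, here's the check I was going to run (a sketch using base SAS directory functions; the "htrace" filter is my own guess, since the NoClassDefFoundError above names org/apache/htrace/SamplerBuilder and I don't see any htrace-core JAR in the list):

/* List every file in the JAR directory and flag names containing "htrace" */
data jarlist;
   length jar $256;
   rc  = filename("jardir", "C:\hadoop\lib");   /* same path as SAS_HADOOP_JAR_PATH */
   did = dopen("jardir");
   do i = 1 to dnum(did);
      jar = dread(did, i);
      has_htrace = (index(lowcase(jar), "htrace") > 0);
      output;
   end;
   rc = dclose(did);
   keep jar has_htrace;
run;

proc print data=jarlist; run;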
