
SAS Hadoop Pig Code Submission Error

Occasional Contributor
Posts: 17

Hi.

 

I am trying to submit the following Pig code through PROC HADOOP in SAS 9.4, but I am getting the error shown below.

Can you please help me fix the issue?

 

Steps to reproduce the scenario:

1. Start a Cloudera VM 5.8 and configure a bridged network adapter and a shared folder.

2. Create a custom config file for SAS-Hadoop interaction (merge all required XML files, e.g. core-site.xml).

3. Copy the Cloudera VM JAR files to a local directory accessible by SAS.

4. Set the SAS-Hadoop environment variables: SAS_HADOOP_JAR_PATH, SAS_CONFIG_PATH, SAS_HADOOP_RESTFUL.

5. Create a local text file containing the Pig statements.

6. Use the SAS program below to reproduce the scenario:

 

filename W2A8SBAK 'C:\Users\abcd\Desktop\pigcommand.txt' ;
data _null_;
file W2A8SBAK;

/* Create a HDFS directory SAS_demo to load the txt file */
put "A = load '/user/cloudera/sas_demo/wordcount.txt'; ";       
put "B = foreach A generate flatten(TOKENIZE((chararray)$0)) as word; ";
put "C = Filter B by (word matches 'SURE);";
put "D = group C by word;";
put "E = foreach D generate COUNT(C), group;";
put "F = store E into '/user/cloudera/pig_theCount';";

run;

 

proc hadoop cfg='C:\Users\abcd\Desktop\sample_sashadoopconfig.xml'
verbose username='cloudera' password='cloudera' ;
pig code=W2A8SBAK WORKINGDIR= '/user/cloudera/';
run;
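
Separately from the traceback, the Pig statements written by the DATA step above are worth a syntax check of their own: the FILTER line as posted opens a quote before SURE and never closes it, and, as far as I know, Pig's STORE statement is not written as an assignment to an alias. A minimal Python sketch of such a check follows. The function name is hypothetical, and it only counts quotes and looks for one pattern, so it is not a Pig parser and says nothing about the access violation itself.

```python
import re


def check_pig_statements(lines):
    """Return (statement_number, message) pairs for two simple problems.

    Hypothetical helper, not a Pig parser: it flags an odd number of
    single quotes and a STORE written as an alias assignment.
    """
    problems = []
    for number, line in enumerate(lines, start=1):
        if line.count("'") % 2 != 0:
            problems.append((number, "unbalanced single quote"))
        if re.match(r"\s*\w+\s*=\s*store\b", line, flags=re.IGNORECASE):
            problems.append((number, "STORE written as an assignment"))
    return problems


# The statements exactly as the DATA step in the post writes them.
pig_lines = [
    "A = load '/user/cloudera/sas_demo/wordcount.txt';",
    "B = foreach A generate flatten(TOKENIZE((chararray)$0)) as word;",
    "C = Filter B by (word matches 'SURE);",
    "D = group C by word;",
    "E = foreach D generate COUNT(C), group;",
    "F = store E into '/user/cloudera/pig_theCount';",
]

for number, message in check_pig_statements(pig_lines):
    print(f"statement {number}: {message}")
```

Running the same statements directly in the Pig shell on the VM would be a more reliable test, since it takes SAS out of the picture.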

 

I am getting the following error:

ERROR: An exception has been encountered.
Please contact technical support and provide them with the following traceback information:

The SAS task name is [HADOOP]
ERROR: Read Access Violation HADOOP
Exception occurred at (23D91AED)
Task Traceback
Address Frame (DBGHELP API Version 4.0 rev 5)
0000000023D91AED 0000000027B6F790 0001:0000000000000AED tkepigr.dll
0000000023AF90D0 0000000027B6F798 sashadoo:tkvercn1+0x8090
0000000023AF14F0 0000000027B6F9B0 sashadoo:tkvercn1+0x4B0
0000000004DEC6A7 0000000027B6F9B8 sasxshel:tkvercn1+0x4B667
0000000023AFAED2 0000000027B6FAD0 sashadoo:tkvercn1+0x9E92
0000000023AF1792 0000000027B6FBF0 sashadoo:tkvercn1+0x752
00000000031689DB 0000000027B6FBF8 sashost:Main+0x10EBB
000000000316E62D 0000000027B6FF50 sashost:Main+0x16B0D
00007FFF562F8102 0000000027B6FF58 KERNEL32:BaseThreadInitThunk+0x22
00007FFF56CFC5B4 0000000027B6FF88 ntdll:RtlUserThreadStart+0x34

NOTE: PROCEDURE HADOOP used (Total process time):
real time 0.22 seconds
cpu time 0.14 seconds

 

 

Let me know if you need any other information.

Super User
Posts: 3,108

Have you opened a track with SAS Tech Support on this? I think they would be in the best position to help. 

Occasional Contributor
Posts: 17

Not yet. I am asking here because many others have posted similar errors when SAS interacts with Hadoop through LIBNAME statements, FILENAME statements, or PROC HADOOP.

I am trying to debug why Pig commands do not work through PROC HADOOP.

However, submitting an HDFS command through PROC HADOOP works fine.

 

 

Regards,

Ashish Jain

SAS Employee
Posts: 203

Hi @ajain59

 

This is most likely a problem with the JAR files, although with a Hadoop VM it could be a lot of things.

 

It appears that you have SAS_HADOOP_RESTFUL=1 set. This means that SAS uses a REST interface instead of JARs for HDFS interaction. The HDFS connectivity may be working because SAS is not using JARs for that interaction.
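
To see which of these settings are actually in effect at the operating-system level, the environment can be inspected before SAS starts. The Python sketch below is only an illustration: the function name is hypothetical, SAS_HADOOP_JAR_PATH and SAS_HADOOP_RESTFUL are the names used in this thread, and SAS_HADOOP_CONFIG_PATH is the documented name of the configuration variable as far as I know (the original post lists SAS_CONFIG_PATH). It assumes each path variable holds a single directory.

```python
import os

# Variable names: two from the thread, SAS_HADOOP_CONFIG_PATH assumed
# from the SAS documentation.
SAS_HADOOP_VARS = (
    "SAS_HADOOP_JAR_PATH",
    "SAS_HADOOP_CONFIG_PATH",
    "SAS_HADOOP_RESTFUL",
)


def describe_sas_hadoop_env(env):
    """Map each variable name to a short status string.

    Hypothetical helper; `env` is any mapping such as os.environ.
    """
    report = {}
    for name in SAS_HADOOP_VARS:
        value = env.get(name)
        if value is None:
            report[name] = "not set"
        elif name.endswith("_PATH") and not os.path.isdir(value):
            # Assumes the variable holds exactly one directory.
            report[name] = f"set to {value!r}, but that directory does not exist"
        else:
            report[name] = f"set to {value!r}"
    return report


for name, status in describe_sas_hadoop_env(os.environ).items():
    print(f"{name}: {status}")
```

This only sees variables defined outside SAS; values set inside the session with OPTIONS SET= will not show up here.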

 

Which release of SAS 9.4 are you using?

 

I haven't used the Cloudera VM in quite a while. If I find some time I will give it a shot. Unfortunately, the steps required to connect to it change with each release of the VM.

 

Best wishes,

Jeff

Occasional Contributor
Posts: 17

Hi,

 

I have tested both scenarios, with SAS_HADOOP_RESTFUL set to 1 and to 0.

Does it matter how the JARs are copied to the local shared path that both the VM and the SAS machine can access?

I copied the JARs directly from the VM to a shared location that the SAS machine on my local laptop can access.

I tried two methods:

sudo - u cp -r /lib <sharedlocation path>

and, alternatively, copying the VM's whole lib directory to the shared location with FileZilla.

 

Regards,

Ashish Jain

 

 

SAS Employee
Posts: 203

Hi @ajain59

 

Did you use hadooptracer.py to gather the JAR files and XML files?

Occasional Contributor
Posts: 17

No, I have not used hadooptracer.py to copy the JAR files. I am copying the JARs manually to the shared location.

I am also not using SAS Deployment Manager to configure SAS for Hadoop.

SAS Employee
Posts: 203

Hi @ajain59

 

Chances are there is a JAR file problem (missing or the wrong version of a JAR).
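
One rough way to look for that is to list the JARs in the directory SAS points at, flag any artifact present in more than one version, and flag name prefixes with no JAR at all. The Python sketch below assumes file names of the form name-version.jar. The function name is hypothetical, and any prefixes passed in (for example pig or hadoop-common) are illustrative guesses, not an official list of what SAS requires.

```python
import os
import re
from collections import defaultdict
from pathlib import Path

# Assumed naming scheme: <artifact>-<version>.jar, version starts with a digit.
JAR_NAME = re.compile(r"^(?P<artifact>.+?)-(?P<version>\d[\w.\-]*)\.jar$")


def inventory_jars(jar_dir, expected_prefixes=()):
    """Return (duplicates, missing) for the JARs directly inside jar_dir.

    duplicates: artifact -> sorted versions, for artifacts with 2+ versions.
    missing: expected prefixes that no JAR file name starts with.
    """
    versions = defaultdict(set)
    names = []
    for path in Path(jar_dir).glob("*.jar"):
        names.append(path.name)
        match = JAR_NAME.match(path.name)
        if match:
            versions[match["artifact"]].add(match["version"])
    duplicates = {a: sorted(v) for a, v in versions.items() if len(v) > 1}
    missing = [p for p in expected_prefixes
               if not any(n.startswith(p) for n in names)]
    return duplicates, missing


if __name__ == "__main__":
    # Falls back to the current directory if the variable is not set.
    jar_dir = os.environ.get("SAS_HADOOP_JAR_PATH", ".")
    duplicates, missing = inventory_jars(jar_dir, ("pig", "hadoop-common"))
    print("multiple versions:", duplicates)
    print("no JAR found for:", missing)
```

A clean result does not prove the set is complete, which is why gathering the files with hadooptracer.py is the safer route.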

Occasional Contributor
Posts: 17

There is no problem with the JAR files; the error is in the configuration.

 

 

Regards,

Ashish Jain

SAS Employee
Posts: 203

Hi @ajain59

 

If that is the case... have you tried adding the IP address and Hostname to the hosts file on your machine?

Occasional Contributor
Posts: 17

Yes, my hosts file contains the IP address and hostname of the CDH 5.8 VM (quickstart.cloudera).

Also, I have verified that I can ping the CDH 5.8 VM from my local machine and vice versa.

The network connection has been established successfully, and there are no IP conflicts either.
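
For anyone repeating the hosts-file check, here is a minimal sketch in Python. The function name is hypothetical and the address in the sample text is a documentation placeholder, not the VM's real IP.

```python
def find_host_entries(hosts_text, hostname):
    """Return the IP addresses that a hosts-file text maps to hostname."""
    addresses = []
    for raw_line in hosts_text.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop comments
        if not line:
            continue
        fields = line.split()
        ip, names = fields[0], fields[1:]
        if hostname.lower() in (name.lower() for name in names):
            addresses.append(ip)
    return addresses


# Placeholder content for illustration only.
sample = """
# example entries
192.0.2.10   quickstart.cloudera quickstart
127.0.0.1    localhost
"""

print(find_host_entries(sample, "quickstart.cloudera"))
```

On Windows the text would come from C:\Windows\System32\drivers\etc\hosts. An empty result, or more than one address for the same name, would be worth a second look.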

 

 

Regards,

Ashish Jain 
