ajain59
Calcite | Level 5

Hi.

 

I am trying to submit the following SAS-Hadoop Pig code using SAS 9.4, but I am getting the error shown below.

Can you please help me fix the issue?

 

To reproduce the scenario, here are the steps:

1. Start a Cloudera VM 5.8 and create a bridged network adapter and a shared folder.

2. Create a custom config file for SAS-Hadoop interaction. (Merge all required XML files, e.g. core-site.xml.)

3. Copy the Cloudera VM JAR files to a local directory accessible by SAS.

4. Set up the SAS-Hadoop environment variables: SAS_HADOOP_JAR_PATH, SAS_CONFIG_PATH, SAS_HADOOP_RESTFUL.

5. Create a local txt file containing the Pig statements.

6. Use the SAS program below to reproduce the scenario:
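For anyone repeating step 2, here is a minimal Python sketch of one way to merge several Hadoop `*-site.xml` files into a single configuration document. It is not a SAS-supplied tool. The function name `merge_hadoop_configs` is made up for illustration, and the sketch assumes each file uses the usual `<configuration><property><name>…</name><value>…</value></property></configuration>` layout.

```python
import xml.etree.ElementTree as ET


def merge_hadoop_configs(xml_texts):
    """Merge <property> entries from several Hadoop config documents.

    xml_texts is a list of XML strings. When the same property name
    appears more than once, the entry from the later document wins.
    (Hypothetical helper, not part of SAS or Hadoop.)
    """
    merged = {}  # property name -> <property> element
    for text in xml_texts:
        root = ET.fromstring(text)
        for prop in root.iter("property"):
            name = prop.findtext("name")
            if name:
                merged[name.strip()] = prop

    # Build a fresh <configuration> root holding one entry per name.
    out = ET.Element("configuration")
    for name in sorted(merged):
        out.append(merged[name])
    return ET.tostring(out, encoding="unicode")
```

To use it, read each XML file copied from the VM into a string, pass the strings in as a list, and write the returned text to the file that `cfg=` points at. Which files are actually required depends on the cluster, so treat the input list as something to verify rather than a fixed set.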

 

filename W2A8SBAK 'C:\Users\abcd\Desktop\pigcommand.txt' ;
data _null_;
file W2A8SBAK;

/* Prerequisite: create the HDFS directory sas_demo and load wordcount.txt into it */
put "A = load '/user/cloudera/sas_demo/wordcount.txt'; ";
put "B = foreach A generate flatten(TOKENIZE((chararray)$0)) as word; ";
/* Closing quote added after SURE so the Pig string literal is terminated */
put "C = filter B by (word matches 'SURE');";
put "D = group C by word;";
put "E = foreach D generate COUNT(C), group;";
/* STORE is a statement in Pig, not an expression, so it is not assigned to an alias */
put "store E into '/user/cloudera/pig_theCount';";

run;

 

proc hadoop cfg='C:\Users\abcd\Desktop\sample_sashadoopconfig.xml'
verbose username='cloudera' password='cloudera' ;
pig code=W2A8SBAK WORKINGDIR= '/user/cloudera/';
run;

 

I am getting the following error:

ERROR: An exception has been encountered.
Please contact technical support and provide them with the following traceback information:

The SAS task name is [HADOOP]
ERROR: Read Access Violation HADOOP
Exception occurred at (23D91AED)
Task Traceback
Address Frame (DBGHELP API Version 4.0 rev 5)
0000000023D91AED 0000000027B6F790 0001:0000000000000AED tkepigr.dll
0000000023AF90D0 0000000027B6F798 sashadoo:tkvercn1+0x8090
0000000023AF14F0 0000000027B6F9B0 sashadoo:tkvercn1+0x4B0
0000000004DEC6A7 0000000027B6F9B8 sasxshel:tkvercn1+0x4B667
0000000023AFAED2 0000000027B6FAD0 sashadoo:tkvercn1+0x9E92
0000000023AF1792 0000000027B6FBF0 sashadoo:tkvercn1+0x752
00000000031689DB 0000000027B6FBF8 sashost:Main+0x10EBB
000000000316E62D 0000000027B6FF50 sashost:Main+0x16B0D
00007FFF562F8102 0000000027B6FF58 KERNEL32:BaseThreadInitThunk+0x22
00007FFF56CFC5B4 0000000027B6FF88 ntdll:RtlUserThreadStart+0x34

NOTE: PROCEDURE HADOOP used (Total process time):
real time 0.22 seconds
cpu time 0.14 seconds

 

 

Let me know if you need any other information.

10 REPLIES
SASKiwi
PROC Star

Have you opened a track with SAS Tech Support on this? I think they would be in the best position to help. 

ajain59
Calcite | Level 5

Not yet. I am asking here because many others have posted similar errors when SAS interacts with Hadoop through a LIBNAME statement, a FILENAME statement, or PROC HADOOP.

 

I am trying to debug why Pig commands are not working through PROC HADOOP.

However, submitting an HDFS command through PROC HADOOP works fine.

 

 

Regards,

Ashish Jain

JBailey
Barite | Level 11

Hi @ajain59

 

This is most likely a problem with the JAR files, although with a Hadoop VM it could be many things.

 

It appears that you have SAS_HADOOP_RESTFUL=1 set. This means that SAS is using a REST interface instead of JARs for HDFS interaction. The HDFS connectivity may be working because SAS is not using JARs for that interaction.

 

Which release of SAS 9.4 are you using?

 

I haven't used the Cloudera VM in quite a while. If I find some time I will give it a shot. Unfortunately, the steps required to connect to it change with each release of the VM.

 

Best wishes,

Jeff

ajain59
Calcite | Level 5

Hi,

 

I have tested both scenarios, with SAS_HADOOP_RESTFUL set to 1 and to 0.

Does it matter how we copy the JARs to the local (shared) path that is accessible by both the VM and the SAS machine?

I tried to copy the JARs directly from the VM to a shared location that is accessible by the SAS machine in my local laptop environment.

 

I tried copying with two methods:

sudo - u cp -r /lib <sharedlocation path>

and, as the other method, copying the whole lib directory of the VM to the shared location path using FileZilla.

 

Regards,

Ashish Jain

 

 

JBailey
Barite | Level 11

Hi @ajain59

 

Did you use hadooptracer.py to gather the JAR files and XML files?

ajain59
Calcite | Level 5

No. I haven't used hadooptracer.py to copy the JAR files. I am copying the JARs manually to the shared location.

I am also not using SAS Deployment Manager to configure SAS-Hadoop.

JBailey
Barite | Level 11

Hi @ajain59

 

Chances are there is a JAR file problem (missing or the wrong version of a JAR).
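One rough way to look for a wrong-version JAR is to group the file names in the SAS_HADOOP_JAR_PATH directory by base name and flag any base that appears with more than one version. The Python sketch below is only an illustration: the helper name `group_jar_versions` is made up, and it assumes the common `name-version.jar` naming pattern, so a JAR named differently is simply skipped.

```python
import re
from collections import defaultdict

# Assumed naming pattern: <base>-<version>.jar, where the version starts with a digit.
_JAR_NAME = re.compile(r"^(?P<base>.+?)-(?P<ver>\d[\w.\-]*)\.jar$")


def group_jar_versions(filenames):
    """Return {base: [versions]} for JARs present in more than one version.

    filenames is an iterable of bare file names (for example from os.listdir).
    Names that do not match the assumed pattern are ignored.
    (Hypothetical helper, not part of SAS or Hadoop.)
    """
    groups = defaultdict(set)
    for filename in filenames:
        match = _JAR_NAME.match(filename)
        if match:
            groups[match.group("base")].add(match.group("ver"))
    return {base: sorted(vers) for base, vers in groups.items() if len(vers) > 1}
```

Calling `group_jar_versions(os.listdir(jar_dir))` on the JAR directory returns an empty dict when no base name is duplicated. It cannot detect a JAR that is missing entirely; that still has to be checked against the cluster's own lib directories.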

ajain59
Calcite | Level 5

There is no problem with the JAR files; the error is in the configuration.

 

 

Regards,

Ashish Jain

JBailey
Barite | Level 11

Hi @ajain59

 

If that is the case... have you tried adding the IP address and Hostname to the hosts file on your machine?
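A quick way to confirm the entry is to parse the hosts file and look the hostname up. On Windows the file is normally `C:\Windows\System32\drivers\etc\hosts`. The Python sketch below is a minimal illustration; the helper name `hosts_entries` is made up, and it takes the file's text as a string so it can be tried without touching the real file.

```python
def hosts_entries(hosts_text):
    """Map each hostname or alias in hosts-file text to its IP address.

    Comments (anything after '#') and blank lines are ignored. If a name
    appears more than once, the first entry seen is kept.
    (Hypothetical helper for illustration only.)
    """
    mapping = {}
    for line in hosts_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop trailing comments
        if not line:
            continue
        parts = line.split()
        ip, names = parts[0], parts[1:]
        for name in names:
            mapping.setdefault(name.lower(), ip)
    return mapping
```

Read the hosts file into a string, call `hosts_entries(text)`, and check that the VM's hostname maps to the IP address the VM actually has on the bridged network. The name should match the hostname used in the Hadoop config XML files.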

ajain59
Calcite | Level 5

Yes, my hosts file contains the IP address and hostname of the CDH 5.8 VM, quickstart.cloudera.

 

I have also verified that I can ping the CDH 5.8 VM from my local machine and vice versa.

The network connection has been established successfully, and there are no IP conflicts either.

 

 

Regards,

Ashish Jain 
