Architecting, installing and maintaining your SAS environment

Load job for HDFS is throwing error when run by backup-admin

Contributor
Posts: 73


I have several jobs set up that run nightly to load data into HDFS (another set of jobs then loads from HDFS to LASR).  Everything works fine for me...and if something occasionally fails, I can re-run the jobs without any issues.  The problem comes in when my backup admin needs to re-run one of these jobs.  When she attempts to re-run one of the HDFS load jobs (due to data changes, we re-load the entire table), she receives the following error:

 

ERROR: oursasserver.name (xxx.xxx.xxx.xxx)
       setting: user=usernamehere, inode=tablename.sashdat
       org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkStickyBit(FSPermissionChecker.java:366)
       org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:173)
       org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:5497)
ERROR: The HDFS concatenation of the file blocks failed.
ERROR: An error occurred during the transmission of data to file '/vapublic/tablename' on the namenode.

 

I know why she receives this error...it is because she has "read only" permissions to the file in /vapublic.  The permissions show up as follows:

 

Permission   Owner      Group        ... Name
-rw-r--r--   rgreen33   supergroup       tablename

 

So, since I am the owner and I am the only one with write access, the job fails when it attempts to delete/recreate the file.

 

So, what is the proper way to get around this?  I have thought of the following, but I'm not sure which is the correct way.

 

  1. Add all SAS admins to the supergroup and elevate the permissions of the supergroup on the files in the /vapublic directory.
  2. Create a SAS account (sharing credentials with the other SAS admins) and set up all jobs to run as this "special" account.
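As an aside, option 1 boils down to group ownership plus group write. A minimal local sketch of the idea (HDFS follows the same POSIX-style mode bits; in HDFS the equivalent commands are `hadoop fs -chgrp` and `hadoop fs -chmod`):

```shell
# Local analogue of option 1 (HDFS uses the same POSIX-style permission model).
# The idea: files stay owned by one admin, but a shared admin group gets the
# write bit, so any member of that group can delete/recreate them.
f=$(mktemp)
chmod 644 "$f"            # today's state: -rw-r--r--, only the owner can write
stat -c %a "$f"           # prints 644
chmod 664 "$f"            # elevated: -rw-rw-r--, group members can write too
stat -c %a "$f"           # prints 664
rm -f "$f"
```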

Ideas/suggestions?

 

Thanks,

Ricky

 

 


Accepted Solutions
Solution
06-23-2017 07:51 AM
Contributor
Posts: 73

Re: Load job for HDFS is throwing error when run by backup-admin

Posted in reply to JuanS_OCS

I just wanted to follow up on this post with the solution that we found.  Essentially, we had a few issues.  The real issue was caused by the fact that our hdfs-site.xml had the following:

 

<property>
  <name>dfs.permissions.superusergroup</name>
  <value>supergroup</value>
</property>
<property>
  <name>fs.permissions.umask-mode</name>
  <value>022</value>
</property>
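For context (my note, not from the original post): `fs.permissions.umask-mode` masks bits out of the base mode when new files are created, so 022 strips the group write bit while 002 keeps it. The same arithmetic can be demonstrated locally:

```shell
# umask arithmetic: new-file mode = 666 & ~umask (directories start from 777).
# With umask 022, group and other lose the write bit; with 002, the group keeps it.
umask 022
f1=/tmp/umask_demo_022.$$
: > "$f1"
stat -c %a "$f1"          # prints 644 -> only the owner can rewrite the file
umask 002
f2=/tmp/umask_demo_002.$$
: > "$f2"
stat -c %a "$f2"          # prints 664 -> the owning group can rewrite it too
rm -f "$f1" "$f2"
```

This is exactly what the 022 -> 002 change in step 2 below addresses: files created by one admin come out group-writable for the rest of the group.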

 

To fix the issue, we did the following:

 

  1. Created a new group in HDFS (adding our Hadoop admins to the group - everyone who will be creating files in Hadoop).
  2. Modified the above lines in the hdfs-site.xml file to the following:

    <property>
      <name>dfs.permissions.superusergroup</name>
      <value>newgroup_here</value>
    </property>
    <property>
      <name>fs.permissions.umask-mode</name>
      <value>002</value>
    </property>

  3. Ran the following commands on /vapublic:

    ./hadoop fs -chmod -t /vapublic
    ./hadoop fs -chmod -R 664 /vapublic
    ./hadoop fs -chgrp -R hadoopadmin /vapublic
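One thing worth double-checking after a recursive chmod like the one above (my aside, not from the original post): mode 664 has no execute bit, and a directory that loses its execute (traverse) bit can no longer be entered. A quick local illustration:

```shell
# Caveat of a recursive 664: it applies to directories as well as files,
# and a directory without the execute (traverse) bit cannot be entered.
d=$(mktemp -d)
chmod 664 "$d"
stat -c %a "$d"            # prints 664 -> no 'x', so entering the directory fails
chmod 775 "$d"             # directories generally need their execute bits back
stat -c %a "$d"            # prints 775
rmdir "$d"
```

So if /vapublic ever grows subdirectories, those would need 775 rather than 664.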

 

Once the above steps were completed, we restarted HDFS and everything worked as expected.

 

Kudos to Blake with SAS Tech Support for helping us find and fix this issue.

 

Thanks,

Ricky

 

 



All Replies
Contributor
Posts: 73

Re: Load job for HDFS is throwing error when run by backup-admin

One additional piece of information...

If I look at the parent folder "vapublic", I see that the permissions are set as follows:

Permission Owner Group
drwxrwxrwt hdfs supergroup

So, I suppose another option would be to drop the sticky bit from the vapublic folder. But what problems would this cause? Or what holes could it open up?
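The sticky bit's effect is easy to reproduce locally (HDFS borrows the same semantics): in a world-writable directory with `t` set, only a file's owner, the directory's owner, or the superuser may delete or rename entries, which is exactly why the backup admin's delete fails despite `drwxrwxrwt` on the parent.

```shell
# The 't' in drwxrwxrwt is the sticky bit. In a sticky, world-writable
# directory, write permission on the directory is NOT enough to delete
# someone else's file -- you must own the file (or the directory).
d=$(mktemp -d)
chmod 1777 "$d"           # rwxrwxrwt, like /vapublic (and like /tmp)
stat -c %a "$d"           # prints 1777
chmod -t "$d"             # drop the sticky bit (hadoop fs -chmod -t is the HDFS analogue)
stat -c %a "$d"           # prints 777
rmdir "$d"
```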

Thanks,
Ricky
Trusted Advisor
Posts: 1,307

Re: Load job for HDFS is throwing error when run by backup-admin

Hello @rgreen33, Ricky,

 

I would make it fit your company's conventions, in a way that minimizes exceptions in your maintenance and procedures.

Generally speaking, I would go for the first option. PRO: you can always trace back who did what. CON: you may end up granting too many permissions.

Also, if you go for the second option, you will get the same CON, without the PRO of traceability.


☑ This topic is solved.

