Hadoop Security : Service Level Authorization
Hadoop provides a service-level authorization mechanism to make sure that any client connecting to a Hadoop service such as the Hadoop distributed file system (HDFS) has the necessary permission to access it. This post assumes that you have a basic (single-node) Hadoop setup configured and running. If not, you can read the Hadoop quick start guide mentioned here.
The benefit of service-level authorization is that it is performed well before other access control checks, such as file-permission checks. By default, service-level authorization is disabled in Hadoop; to enable it, set hadoop.security.authorization to true in ${HADOOP_CONF_DIR}/core-site.xml. Rules are defined in a very simple property format, which looks like the following snippet.
<property>
<name>hadoop.security.authorization</name>
<value>true</value>
<description>Service level authorization params.</description>
</property>
Once this is set to true, Hadoop performs a permission check before performing any action. Hadoop defines all of its access control lists (ACLs) in an XML file, ${HADOOP_CONF_DIR}/hadoop-policy.xml. The following services can be found in hadoop-policy.xml:
| Property | Description |
| security.client.protocol.acl | ACL for ClientProtocol, which is used by user code via the DistributedFileSystem |
| security.client.datanode.protocol.acl | ACL for ClientDatanodeProtocol, the client-to-datanode protocol for block recovery. |
| security.datanode.protocol.acl | ACL for DatanodeProtocol, which is used by datanodes to communicate with the namenode |
| security.inter.datanode.protocol.acl | ACL for InterDatanodeProtocol, the inter-datanode protocol for updating generation timestamp |
| security.namenode.protocol.acl | ACL for NamenodeProtocol, the protocol used by the secondary namenode to communicate with the namenode |
| security.inter.tracker.protocol.acl | ACL for InterTrackerProtocol, used by the tasktrackers to communicate with the jobtracker |
| security.job.submission.protocol.acl | ACL for JobSubmissionProtocol, used by job clients to communicate with the jobtracker for job submission, querying job status, etc. |
| security.task.umbilical.protocol.acl | ACL for TaskUmbilicalProtocol, used by the map and reduce tasks to communicate with the parent tasktracker |
| security.refresh.policy.protocol.acl | ACL for RefreshAuthorizationPolicyProtocol, used by the dfsadmin and mradmin commands to refresh the security policy in effect |
hadoop-policy.xml defines its rules in the same property format shown above for core-site.xml. Access can be granted to both users and groups: each value is a comma-separated list of users, followed by a single blank space, followed by a comma-separated list of groups. For example, to grant access to the users xyz and abc along with the group hadoop, the configuration should look like the following:
<property>
<name>security.client.protocol.acl</name>
<value>xyz,abc hadoop</value>
</property>
To grant access to groups only, begin the value with a blank space before the group list. Conversely, a comma-separated list of users followed by a space, or by nothing at all, grants access to the given users only.
A special value of * implies that all users are allowed to access the service.
<property>
<name>security.client.protocol.acl</name>
<value>*</value>
</property>
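The groups-only and users-only forms described above can be sketched in the same file format. The group name hadoop and the users xyz and abc are reused from the earlier example; pairing each ACL with these particular services is only an illustrative assumption, not a recommended policy.

```xml
<!-- Groups only: the value begins with a blank, so the user list is empty -->
<property>
<name>security.client.protocol.acl</name>
<value> hadoop</value>
</property>

<!-- Users only: a comma-separated user list with no group list after it -->
<property>
<name>security.job.submission.protocol.acl</name>
<value>xyz,abc</value>
</property>
```

Because the leading blank in the first value is significant, it is worth checking that an editor or formatter has not trimmed it.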
After making these changes, restart the Hadoop daemons by executing the following commands:
hadoop@dochadoop2:$ /hadoop/bin/stop-all.sh
hadoop@dochadoop2:$ /hadoop/bin/start-all.sh