Native Kerberos (MIT)
This document describes how to set up an Alluxio cluster with MIT native Kerberos, using an AWS EC2 Linux cluster as the example. It uses a hostname-associated principal setup. If you would like to use a unified principal across all the Alluxio service nodes, set alluxio.security.kerberos.unified.instance.name and make sure the principal instance name matches this configuration property.
The default Java GSS implementation relies on the JAAS KerberosLoginModule for initial credential acquisition. In contrast, when native platform Kerberos integration is enabled, the initial credential acquisition should happen before any JGSS APIs are called, e.g. through kinit. When enabled, Java GSS looks for the native GSS library using the operating-system-specific name, e.g. libgss.so on Solaris and libgssapi.so on Linux. If the desired GSS library has a different name or is not located under a directory for system libraries, then its full path should be specified using the system property sun.security.jgss.lib.
Prerequisites
- There is an existing MIT KDC (Key Distribution Center). 
- Krb5 client library is installed on the machines running Alluxio servers and clients. 
- ALLUXIO.COM is the example realm name in this doc.
- alluxio is the example Alluxio service name in this doc.
- Each Alluxio service node (Masters and Workers) uses service principal named alluxio/<HOSTNAME>@ALLUXIO.COM.
- In all Alluxio service nodes, Kerberos credentials for alluxio/<HOSTNAME>@ALLUXIO.COM already exist (e.g. through kinit).
- The service name alluxio is a valid user in Linux operating system.
- <HOSTNAME> can be set to user-qualified hostname, such as user.full.machine.host.name
Alluxio Configuration
When installing Alluxio, you can enable Kerberos security for Alluxio by setting up configuration properties in alluxio-site.properties.
alluxio.security.authentication.type=KERBEROS
alluxio.master.hostname=<MASTER_HOSTNAME>
You also need to set the worker’s hostname on the worker nodes. Please make sure the WORKER_HOSTNAME set in Alluxio site properties matches with the <HOSTNAME> part of the service principal. Otherwise, Kerberos authentication will fail. You can set the worker hostname in alluxio-site.properties.
alluxio.worker.hostname=<WORKER_HOSTNAME>
In Alluxio versions before 2.1.0, you also need to set alluxio.security.kerberos.service.name and this is a required parameter.
alluxio.security.kerberos.service.name=alluxio
Note:
- There is an optional property called alluxio.security.kerberos.unified.instance.name. If specified, all the Alluxio servers will share the same principal. For example, if the unified instance name is set to alluxio.security.kerberos.unified.instance.name=cluster, then the master and worker principals will be the same, and will be alluxio/cluster@ALLUXIO.COM.
- For Alluxio versions before 2.1.0, alluxio.security.kerberos.service.name is a required parameter. This parameter is used to specify the Alluxio Service Principal service name. It is assumed that there is a present Kerberos ticket with the principal <primary>/<instance>@REALM.COM whose <primary> part matches with alluxio.security.kerberos.service.name. If you have set parameter alluxio.security.kerberos.server.principal, the <primary> part must match with alluxio.security.kerberos.service.name.
- In version 2.1.0, parameter alluxio.security.kerberos.service.name has been removed from the configuration. alluxio.security.kerberos.service.name is not necessary anymore because it can be extracted from alluxio.security.kerberos.server.principal.
- In JGSS native environment, alluxio.security.kerberos.server.principal is an optional parameter. If alluxio.security.kerberos.server.principal is not set, it will be inferred from the Kerberos ticket cache when Alluxio starts. The server principal is then propagated to clients via cluster defaults. If cluster defaults are disabled by alluxio.user.conf.cluster.default.enabled=false, the clients must be configured with alluxio.security.kerberos.server.principal because the cluster defaults are not propagated to the clients.
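As a rough illustration of the properties described in this section, the sketch below prints the entries for one node so they can be reviewed before being added to alluxio-site.properties. The helper name render_site_properties and the example hostnames are assumptions made for this sketch and are not part of Alluxio; only the property names come from this doc.

```shell
#!/usr/bin/env bash
# Sketch only: render_site_properties is a hypothetical helper, not an Alluxio tool.
# It prints the Kerberos-related site properties named in this doc for one node.
render_site_properties() {
  master_host="$1"        # value for <MASTER_HOSTNAME>
  worker_host="${2:-}"    # optional; must match the <HOSTNAME> part of the worker principal

  printf 'alluxio.security.authentication.type=KERBEROS\n'
  printf 'alluxio.master.hostname=%s\n' "${master_host}"
  # Only worker nodes need the worker hostname entry.
  if [ -n "${worker_host}" ]; then
    printf 'alluxio.worker.hostname=%s\n' "${worker_host}"
  fi
}
```

On a worker node this could be used as render_site_properties master.example.com worker1.example.com, with the output reviewed and then appended to alluxio-site.properties. The hostnames here are placeholders.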
JGSS native Kerberos integration requires an environment variable (i.e. KRB5_KTNAME) to find the keytab file for Alluxio processes. If this environment variable is not set globally, you can add the following line to ${ALLUXIO_HOME}/conf/alluxio-env.sh so that KRB5_KTNAME refers to the keytab file for alluxio/<HOSTNAME>@ALLUXIO.COM.
# Replace </path/to/keytab> with the path to the keytab of Alluxio service principal, e.g. alluxio/<HOSTNAME>@ALLUXIO.COM
export KRB5_KTNAME=</path/to/keytab>
Furthermore, to use JGSS native Kerberos integration, the following Java system properties must be set. If they are not already set system-wide, you can add them to ALLUXIO_JAVA_OPTS in ${ALLUXIO_HOME}/conf/alluxio-env.sh:
ALLUXIO_JAVA_OPTS+=" -Dsun.security.jgss.native=true -Djavax.security.auth.useSubjectCredsOnly=false "
To access a Kerberized Alluxio cluster, Alluxio clients require the same configuration.
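A missing keytab path or JVM flag tends to show up only as an authentication failure at startup, so a small pre-flight check can save time. The sketch below is one possible way to verify the settings described above from the shell that will start Alluxio. The function name check_jgss_native_env is hypothetical, and the check assumes KRB5_KTNAME and ALLUXIO_JAVA_OPTS are visible in the current environment (for example after sourcing alluxio-env.sh).

```shell
#!/usr/bin/env bash
# Sketch only: check_jgss_native_env is a hypothetical helper, not part of Alluxio.
# It returns 0 when the keytab variable and both JVM flags from this doc are in place.
check_jgss_native_env() {
  status=0

  # KRB5_KTNAME may carry an optional "FILE:" prefix; strip it before testing the path.
  keytab="${KRB5_KTNAME:-}"
  keytab="${keytab#FILE:}"
  if [ -z "${keytab}" ]; then
    echo "KRB5_KTNAME is not set" >&2
    status=1
  elif [ ! -r "${keytab}" ]; then
    echo "keytab is not readable: ${keytab}" >&2
    status=1
  fi

  # Both flags must appear as whole words in ALLUXIO_JAVA_OPTS.
  for flag in "-Dsun.security.jgss.native=true" \
              "-Djavax.security.auth.useSubjectCredsOnly=false"; do
    case " ${ALLUXIO_JAVA_OPTS:-} " in
      *" ${flag} "*) ;;
      *) echo "missing JVM flag: ${flag}" >&2; status=1 ;;
    esac
  done

  return "${status}"
}
```

Calling check_jgss_native_env before starting a master or worker prints one line per problem to stderr and returns a non-zero status if anything is missing.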
Using the JGSS native Kerberos implementation requires populating the Kerberos ticket cache before starting Alluxio processes. Therefore, before you start any Alluxio server process (master or worker), you must kinit with the appropriate service principal, as the OS user that starts the Alluxio process. For example, if the Alluxio master will be started by OS user alluxioadmin, then alluxioadmin can start Alluxio by:
$ kinit -kt </path/to/keytab> alluxio/<MASTER_HOSTNAME>@ALLUXIO.COM
$ ./bin/alluxio-start.sh master
Similarly, the worker can be started by:
$ kinit -kt </path/to/keytab> alluxio/<WORKER_HOSTNAME>@ALLUXIO.COM
$ ./bin/alluxio-start.sh worker SudoMount
Kerberos-enabled Alluxio Integration with Secure-HDFS as UFS
Running Spark with Alluxio Kerberized with native integration
Follow the Running-Spark-on-Alluxio guide to set up SPARK_CLASSPATH. In addition, the following items should be added to make Spark aware of Kerberos configurations:
- Copy Alluxio site configuration {ALLUXIO_HOME}/conf/alluxio-site.properties to {SPARK_HOME}/conf for Spark to pick up Alluxio configurations such as Kerberos related flags.
- When launching Spark shell or jobs, please add -Dsun.security.jgss.native=true -Djavax.security.auth.useSubjectCredsOnly=false in spark.executor.extraJavaOptions and spark.driver.extraJavaOptions
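One way to pass these flags is through --conf on the Spark command line. The wrapper below is a minimal sketch: the function name run_spark_with_native_kerberos and the SPARK_LAUNCHER override are assumptions introduced here, and spark-shell is assumed to be on the PATH. The two spark.*.extraJavaOptions keys and the JVM flags are the ones named in this doc.

```shell
#!/usr/bin/env bash
# JVM flags from this doc that switch Java GSS to the native Kerberos library.
JGSS_NATIVE_OPTS="-Dsun.security.jgss.native=true -Djavax.security.auth.useSubjectCredsOnly=false"

# Sketch only: launches Spark with the same flags on the driver and the executors.
# SPARK_LAUNCHER is a hypothetical override (e.g. spark-submit, or echo for a dry run);
# it defaults to spark-shell. Extra arguments are passed through unchanged.
run_spark_with_native_kerberos() {
  "${SPARK_LAUNCHER:-spark-shell}" \
    --conf "spark.driver.extraJavaOptions=${JGSS_NATIVE_OPTS}" \
    --conf "spark.executor.extraJavaOptions=${JGSS_NATIVE_OPTS}" \
    "$@"
}
```

Setting SPARK_LAUNCHER=echo before calling run_spark_with_native_kerberos prints the arguments instead of starting Spark, which is a quick way to confirm that each option reaches Spark as a single quoted value.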
FAQ
When using the native libraries, you can set an environment variable KRB5_TRACE=/tmp/path/to/log. Additionally, set the Kerberos debug level with:
-Dsun.security.krb5.debug=true
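When troubleshooting, it can help to enable both debug switches in one step and to confirm that the ticket cache actually holds a valid ticket. The helpers below are a sketch under stated assumptions: the function names and the default trace path are hypothetical, and the approach assumes alluxio-env.sh appends to ALLUXIO_JAVA_OPTS rather than overwriting it. The klist -s option is standard MIT Kerberos behavior: it prints nothing and reports the cache state through its exit status.

```shell
#!/usr/bin/env bash
# Sketch only: hypothetical troubleshooting helpers, not part of Alluxio.

# Turn on MIT krb5 tracing and JDK Kerberos debugging for processes started from this shell.
# The trace path is an assumption; any writable file works.
enable_krb5_debug() {
  export KRB5_TRACE="${1:-/tmp/krb5_trace.log}"
  export ALLUXIO_JAVA_OPTS="${ALLUXIO_JAVA_OPTS:-} -Dsun.security.krb5.debug=true"
}

# Succeed only if the current user's ticket cache is readable and not expired.
ticket_cache_ok() {
  if klist -s; then
    return 0
  fi
  echo "no valid Kerberos ticket in the cache; run kinit first" >&2
  return 1
}
```

A typical sequence would be to call enable_krb5_debug, run kinit, check ticket_cache_ok, and then start the Alluxio process from the same shell so that it inherits the environment.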