# Native Kerberos (MIT)

This document describes how to set up an Alluxio cluster with native MIT [Kerberos](http://web.mit.edu/kerberos/), using an AWS EC2 Linux cluster as the example environment. The examples use hostname-associated principals. To use a single unified principal across all Alluxio service nodes instead, set `alluxio.security.kerberos.unified.instance.name` and make sure the instance name of the principal matches the value of this property.

The default Java GSS implementation relies on the JAAS KerberosLoginModule for initial credential acquisition. In contrast, when native platform Kerberos integration is enabled, the initial credentials must be acquired before any JGSS APIs are called, e.g. through `kinit`. When native integration is enabled, Java GSS looks for the native GSS library using the operating-system-specific name, e.g. `libgss.so` on Solaris and `libgssapi.so` on Linux. If the desired GSS library has a different name or is not located in a system library directory, specify its full path using the system property `sun.security.jgss.lib`.

## Prerequisites

* There is an existing MIT KDC (Key Distribution Center).
* The Krb5 client library is installed on the machines running Alluxio servers and clients.
* `ALLUXIO.COM` is the example realm name in this doc.
* `alluxio` is the example Alluxio service name in this doc.
* Each Alluxio service node (masters and workers) uses a service principal named `alluxio/<HOSTNAME>@ALLUXIO.COM`.
* On all Alluxio service nodes, Kerberos credentials for `alluxio/<HOSTNAME>@ALLUXIO.COM` already exist (e.g. through `kinit`).
* The service name `alluxio` is a valid user in the Linux operating system.
* `<HOSTNAME>` can be set to a fully qualified hostname, such as `user.full.machine.host.name`.
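
The naming convention above can be sketched as a small helper that builds the principal expected for a given node. This is only an illustration: the function name `expected_principal` is hypothetical, and `SERVICE` and `REALM` simply mirror the example values used in this doc.

```bash
# Example values from this doc; adjust to your own service name and realm.
SERVICE="alluxio"
REALM="ALLUXIO.COM"

# Prints <service>/<hostname>@<realm> for the hostname passed as $1.
expected_principal() {
  local host="$1"
  printf '%s/%s@%s\n' "$SERVICE" "$host" "$REALM"
}
```

On a node, running `expected_principal "$(hostname -f)"` and comparing the result with the default principal reported by `klist` is one way to confirm that the ticket cache holds the principal this guide assumes.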

## Alluxio Configuration

When installing Alluxio, you can enable Kerberos security for Alluxio by setting up configuration properties in `alluxio-site.properties`.

```properties
alluxio.security.authentication.type=KERBEROS
alluxio.master.hostname=<MASTER_HOSTNAME>
```

You also need to set the worker’s hostname on the worker nodes. Make sure the `WORKER_HOSTNAME` set in the Alluxio site properties matches the `<HOSTNAME>` part of the service principal; otherwise, Kerberos authentication will fail. You can set the worker hostname in `alluxio-site.properties`.

```properties
alluxio.worker.hostname=<WORKER_HOSTNAME>
```

In Alluxio versions before 2.1.0, you must also set the required property `alluxio.security.kerberos.service.name`.

```properties
alluxio.security.kerberos.service.name=alluxio
```

Note:

* There is an optional property called `alluxio.security.kerberos.unified.instance.name`. If specified, all the Alluxio servers will share the same principal. For example, if the unified instance name is set to `alluxio.security.kerberos.unified.instance.name=cluster`, then the master and worker principals will be the same, and will be `alluxio/cluster@ALLUXIO.COM`.
* For Alluxio versions before 2.1.0, `alluxio.security.kerberos.service.name` is a required parameter. It specifies the service name of the Alluxio service principal. Alluxio assumes that a Kerberos ticket is already present for a principal of the form `<primary>/<instance>@REALM.COM` whose `<primary>` part matches `alluxio.security.kerberos.service.name`. If you have set `alluxio.security.kerberos.server.principal`, its `<primary>` part must also match `alluxio.security.kerberos.service.name`.
* In version 2.1.0, parameter `alluxio.security.kerberos.service.name` has been removed from the configuration. `alluxio.security.kerberos.service.name` is not necessary anymore because it can be extracted from `alluxio.security.kerberos.server.principal`.
* In the JGSS native environment, `alluxio.security.kerberos.server.principal` is an optional parameter. If it is not set, it is inferred from the Kerberos ticket cache when Alluxio starts. The server principal is then propagated to clients via the cluster defaults. If cluster defaults are disabled with `alluxio.user.conf.cluster.default.enabled=false`, the clients must be configured with `alluxio.security.kerberos.server.principal` explicitly, because the cluster defaults are not propagated to them.
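
The `<primary>` matching rule in the notes above can be checked before starting Alluxio. The following is a minimal sketch using only shell parameter expansion; the function names `principal_primary` and `primary_matches_service` are hypothetical helpers, not part of Alluxio.

```bash
# Prints the <primary> part of a principal shaped like <primary>/<instance>@REALM.
principal_primary() {
  local principal="$1"
  # Drop the realm (everything from the first '@').
  local without_realm="${principal%%@*}"
  # Drop the instance (everything from the first '/').
  printf '%s\n' "${without_realm%%/*}"
}

# Succeeds (exit status 0) when the primary of principal $1 equals service name $2.
primary_matches_service() {
  [ "$(principal_primary "$1")" = "$2" ]
}
```

For example, `primary_matches_service alluxio/cluster@ALLUXIO.COM alluxio` succeeds, while a principal with a different primary does not.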

JGSS native Kerberos integration requires an environment variable (i.e. `KRB5_KTNAME`) to find the keytab file for Alluxio processes. If this environment variable is not set globally, you can add the following line to `${ALLUXIO_HOME}/conf/alluxio-env.sh` so that `KRB5_KTNAME` refers to the keytab file for `alluxio/localhost@ALLUXIO.COM`.

```bash
# Replace </path/to/keytab> with the path to the keytab of Alluxio service principal, e.g. alluxio/localhost@ALLUXIO.COM
export KRB5_KTNAME=</path/to/keytab>
```

Furthermore, to use JGSS native Kerberos integration, the following Java system properties must be set. If they are not already set system-wide, you can add them to `ALLUXIO_JAVA_OPTS` in `${ALLUXIO_HOME}/conf/alluxio-env.sh`:

```bash
ALLUXIO_JAVA_OPTS+=" -Dsun.security.jgss.native=true -Djavax.security.auth.useSubjectCredsOnly=false "
```
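
If the native GSS library is not found under the default name described in the introduction, the `sun.security.jgss.lib` property can be appended in the same file. The sketch below assumes a library path that is common on Linux distributions shipping MIT Kerberos; the path is an assumption and should be replaced with the actual location on your system.

```bash
# Assumed location of the MIT Kerberos GSS library; verify it on your system.
GSS_LIB="/usr/lib64/libgssapi_krb5.so.2"

# Append the property to whatever options are already set.
ALLUXIO_JAVA_OPTS="${ALLUXIO_JAVA_OPTS:-} -Dsun.security.jgss.lib=${GSS_LIB}"
```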

To access a Kerberized Alluxio cluster, Alluxio clients require the same configuration.

Using the JGSS native Kerberos implementation requires populating the Kerberos ticket cache before starting Alluxio processes. Therefore, before you start any Alluxio server process (master or worker), you must run `kinit` with the appropriate service principal as the OS user that starts the Alluxio process. For example, if the Alluxio master is started by the OS user `alluxioadmin`, then `alluxioadmin` can start Alluxio with:

```console
$ kinit -kt </path/to/keytab> alluxio/localhost@ALLUXIO.COM
$ ./bin/alluxio-start.sh master
```

Similarly, the worker can be started by:

```console
$ kinit -kt </path/to/keytab> alluxio/localhost@ALLUXIO.COM
$ ./bin/alluxio-start.sh worker SudoMount
```
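
Because a missing or expired ticket only surfaces as an authentication failure after startup, it can help to check the ticket cache first. The wrapper below is a sketch, not part of Alluxio: it relies on `klist -s`, which in MIT Kerberos exits non-zero when the credential cache cannot be read or has expired. The variables `KLIST_CMD` and `ALLUXIO_START` and the function name are hypothetical, introduced here so the commands can be overridden.

```bash
# Commands used by the wrapper; override them if they live elsewhere.
KLIST_CMD="${KLIST_CMD:-klist}"
ALLUXIO_START="${ALLUXIO_START:-./bin/alluxio-start.sh}"

# Starts the given Alluxio role only if the ticket cache holds a valid ticket.
# Usage: start_with_ticket_check <role> [extra start arguments...]
start_with_ticket_check() {
  local role="$1"
  shift
  if ! "$KLIST_CMD" -s; then
    echo "No valid Kerberos ticket found; run kinit first." >&2
    return 1
  fi
  "$ALLUXIO_START" "$role" "$@"
}
```

With this in place, `start_with_ticket_check master` or `start_with_ticket_check worker SudoMount` would run the same start commands as above, but only after the ticket check passes.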

## Kerberos-enabled Alluxio Integration with Secure-HDFS as UFS

## Running Spark with Alluxio Kerberized with native integration

Follow the [Running-Spark-on-Alluxio](/ee-da-en/da-2.7/compute-integrations/spark.md) guide to set up `SPARK_CLASSPATH`. In addition, the following items should be added to make Spark aware of Kerberos configurations:

* Copy the Alluxio site configuration `${ALLUXIO_HOME}/conf/alluxio-site.properties` to `${SPARK_HOME}/conf` so that Spark picks up Alluxio configuration such as the Kerberos-related flags.
* When launching the Spark shell or Spark jobs, add `-Dsun.security.jgss.native=true -Djavax.security.auth.useSubjectCredsOnly=false` to both `spark.executor.extraJavaOptions` and `spark.driver.extraJavaOptions`.
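
One way to pass these options is with `--conf` on the command line. The following is a sketch that assumes `spark-shell` is on the `PATH`; the `SPARK_SHELL` variable and the `launch_spark_shell` function are hypothetical names used only for illustration.

```bash
# JVM flags required for JGSS native Kerberos, as listed above.
KRB_JAVA_OPTS="-Dsun.security.jgss.native=true -Djavax.security.auth.useSubjectCredsOnly=false"

# Launcher to invoke; override to use spark-submit or an absolute path.
SPARK_SHELL="${SPARK_SHELL:-spark-shell}"

# Launches Spark with the Kerberos flags on both driver and executors.
# Any extra arguments are passed through unchanged.
launch_spark_shell() {
  "$SPARK_SHELL" \
    --conf "spark.driver.extraJavaOptions=${KRB_JAVA_OPTS}" \
    --conf "spark.executor.extraJavaOptions=${KRB_JAVA_OPTS}" \
    "$@"
}
```

The same two properties could instead be set in `${SPARK_HOME}/conf/spark-defaults.conf` if they should apply to every job.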

## FAQ

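When using the native libraries, you can set the environment variable `KRB5_TRACE=/tmp/path/to/log` to have the MIT Kerberos library write trace output to that file. Additionally, enable Kerberos debug output in the JVM with the following Java system property: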

```properties
-Dsun.security.krb5.debug=true
```
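
For a debugging session, both settings could be placed in `${ALLUXIO_HOME}/conf/alluxio-env.sh`, as sketched below. The trace file path is a placeholder, and putting the settings in `alluxio-env.sh` is an assumption based on how the other options in this guide are configured.

```bash
# Placeholder path for the MIT Kerberos trace output.
export KRB5_TRACE=/tmp/krb5_trace.log

# Enable the JVM's Kerberos debug output for Alluxio processes.
ALLUXIO_JAVA_OPTS="${ALLUXIO_JAVA_OPTS:-} -Dsun.security.krb5.debug=true"
```

Remove both settings once the issue is resolved, since the trace and debug output can be verbose.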

<details>

<summary>Cannot add private credential to subject with JGSS</summary>

This is typically because the required pre-existing Kerberos credential is not valid. Run `klist` to double-check.

</details>

<details>

<summary>Unable to Obtain Password from User</summary>

You will see this only if the JGSS system property is not set up correctly. Alluxio falls back to the JAAS Kerberos login with ticket cache and keytab file if `sun.security.jgss.native` is not enabled.

</details>

