# CephFS

This guide describes how to configure Alluxio with CephFS as the under storage system. Alluxio supports two under storage implementations for [CephFS](https://docs.ceph.com/en/latest/cephfs/):

* [cephfs](https://docs.ceph.com/en/latest/cephfs/api/libcephfs-java/)
* [cephfs-hadoop](https://docs.ceph.com/en/nautilus/cephfs/hadoop/)

## Prerequisites

### Deploy Alluxio binary package

The Alluxio binaries must be on your machine.

### Install Dependencies

Follow the [Ceph package installation guide](https://docs.ceph.com/en/latest/install/get-packages/) to install the following packages:

```
cephfs-java
libcephfs_jni
libcephfs2
```

### Make symbolic links

```console
$ ln -s /usr/lib64/libcephfs_jni.so.1.0.0 /usr/lib64/libcephfs_jni.so
$ ln -s /usr/lib64/libcephfs.so.2.0.0 /usr/lib64/libcephfs.so
$ java_path=`which java | xargs readlink | sed 's#/bin/java##g'`
$ ln -s /usr/share/java/libcephfs.jar $java_path/jre/lib/ext/libcephfs.jar
```
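The links created above can be checked before starting Alluxio. The following is a minimal sketch, not part of the Alluxio or Ceph tooling: `check_link` is a hypothetical helper name, and the last call assumes `java_path` is still set from the commands above.

```shell
# Report whether a path exists. A dangling symlink counts as missing,
# because `[ -e ]` follows the link to its target.
check_link() {
  if [ -e "$1" ]; then
    echo "ok: $1"
  else
    echo "missing: $1"
  fi
}

check_link /usr/lib64/libcephfs_jni.so
check_link /usr/lib64/libcephfs.so
# Assumes java_path was set in the same shell session as the ln commands.
check_link "${java_path:-}/jre/lib/ext/libcephfs.jar"
```

Any `missing:` line suggests that the corresponding package is not installed or that the versioned library file has a different name on your system.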

### Download CephFS Hadoop jar

```console
$ curl -o $java_path/jre/lib/ext/hadoop-cephfs.jar -s https://download.ceph.com/tarballs/hadoop-cephfs.jar
```

## Basic Setup

Configure Alluxio to use the under storage system by modifying `conf/alluxio-site.properties` and `conf/core-site.xml`. If they do not exist, create the configuration files from the templates:

```console
$ cp conf/alluxio-site.properties.template conf/alluxio-site.properties
$ cp conf/core-site.xml.template conf/core-site.xml
```

<details>

<summary>cephfs</summary>

Modify `conf/alluxio-site.properties` to include:

```properties
alluxio.underfs.cephfs.conf.file=<ceph-conf-file>
alluxio.underfs.cephfs.mds.namespace=<ceph-fs-name>
alluxio.underfs.cephfs.mount.point=<ceph-fs-dir>
alluxio.underfs.cephfs.auth.id=<client-id>
alluxio.underfs.cephfs.auth.keyring=<client-keyring-file>
```
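As an illustration, the placeholders could be filled in as follows. This is only a sketch: `append_cephfs_props` is a hypothetical helper, and every value (the conventional Ceph paths under `/etc/ceph`, a file system named `cephfs`, the `admin` client, and `/` as the mount point) is an assumption that must be replaced with the values of your own cluster.

```shell
# Append example CephFS under-storage settings to the given properties file.
# All values are assumptions for illustration only; replace them with your own.
append_cephfs_props() {
  cat >> "$1" <<'EOF'
alluxio.underfs.cephfs.conf.file=/etc/ceph/ceph.conf
alluxio.underfs.cephfs.mds.namespace=cephfs
alluxio.underfs.cephfs.mount.point=/
alluxio.underfs.cephfs.auth.id=admin
alluxio.underfs.cephfs.auth.keyring=/etc/ceph/ceph.client.admin.keyring
EOF
}
```

Calling `append_cephfs_props conf/alluxio-site.properties` from the Alluxio home directory appends the five properties to the file created from the template.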

</details>

<details>

<summary>cephfs-hadoop</summary>

Modify `conf/alluxio-site.properties` to include:

```properties
alluxio.underfs.hdfs.configuration=${ALLUXIO_HOME}/conf/core-site.xml
```

Modify `conf/core-site.xml` to include:

```xml
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>ceph://mon1,mon2,mon3/</value>
  </property>
  <property>
    <name>fs.defaultFS</name>
    <value>ceph://mon1,mon2,mon3/</value>
  </property>
  <property>
    <name>ceph.data.pools</name>
    <value>${data-pools}</value>
  </property>
  <property>
    <name>ceph.auth.id</name>
    <value>${client-id}</value>
  </property>
  <property>
    <name>ceph.conf.options</name>
    <value>client_mount_gid=${gid},client_mount_uid=${uid},client_mds_namespace=${ceph-fs-name}</value>
  </property>
  <property>
    <name>ceph.root.dir</name>
    <value>${ceph-fs-dir}</value>
  </property>
  <property>
    <name>ceph.mon.address</name>
    <value>mon1,mon2,mon3</value>
  </property>
  <property>
    <name>fs.AbstractFileSystem.ceph.impl</name>
    <value>org.apache.hadoop.fs.ceph.CephFs</value>
  </property>
  <property>
    <name>fs.ceph.impl</name>
    <value>org.apache.hadoop.fs.ceph.CephFileSystem</value>
  </property>
  <property>
    <name>ceph.auth.keyring</name>
    <value>${client-keyring-file}</value>
  </property>
</configuration>
```
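The `${...}` values in the XML above are placeholders that need to be replaced with real values. A small sketch for spotting any that were left behind is shown below; `find_placeholders` is a hypothetical helper name, not an Alluxio or Hadoop command.

```shell
# Print every line (with its line number) that still contains a ${...} token.
# Exit status follows grep: 0 if at least one placeholder remains, 1 if none.
find_placeholders() {
  grep -n '\${[^}]*}' "$1"
}
```

For example, `find_placeholders conf/core-site.xml || echo "no placeholders left"` lists the lines that still need editing. The check is meant for `conf/core-site.xml` only, since `conf/alluxio-site.properties` intentionally contains `${ALLUXIO_HOME}`.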

</details>

## Running Alluxio Locally with CephFS

Start up Alluxio locally to see that everything works.

```console
$ ./bin/alluxio format
$ ./bin/alluxio-start.sh local
```

This should start an Alluxio master and Alluxio worker. You can see the master UI at <http://localhost:19999>.
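On a machine without a browser, the master UI can be probed from the command line instead. The sketch below assumes `curl` is installed; `master_ui_up` is a hypothetical helper name.

```shell
# Return success if an HTTP GET to the given URL succeeds within 5 seconds.
master_ui_up() {
  curl --silent --fail --output /dev/null --max-time 5 "$1"
}

if master_ui_up http://localhost:19999; then
  echo "master UI is reachable"
else
  echo "master UI is not reachable yet"
fi
```

The master may need a few seconds after `alluxio-start.sh` returns before the UI responds, so a negative result right after startup is not necessarily an error.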

<details>

<summary>cephfs</summary>

A CephFS location can be mounted at a nested directory in the Alluxio namespace to provide unified access to multiple under storage systems. Alluxio's [Command Line Interface](/ee-da-en/da-2.10/operations/user-cli.md) can be used for this purpose.

Issue the following commands to mount CephFS through the cephfs implementation:

```console
$ ./bin/alluxio fs mkdir /mnt/cephfs
$ ./bin/alluxio fs mount /mnt/cephfs cephfs://mon1\;mon2\;mon3/
```
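The backslashes in `mon1\;mon2\;mon3` are shell escaping: an unescaped `;` would end the command. Quoting the whole URI works equally well. The sketch below builds the URI from a list of monitor hosts; `cephfs_uri` is a hypothetical helper, and `mon1`, `mon2`, `mon3` stand in for your monitor addresses.

```shell
# Build a cephfs:// URI from monitor hosts, joined with ';'.
# The function body runs in a subshell, so the IFS change does not leak out.
cephfs_uri() (
  IFS=';'
  printf 'cephfs://%s/\n' "$*"
)

cephfs_uri mon1 mon2 mon3
```

The result can be passed to the mount command as a quoted argument, for example `./bin/alluxio fs mount /mnt/cephfs "$(cephfs_uri mon1 mon2 mon3)"`.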

Run a simple example program:

```console
$ ./bin/alluxio runTests --path cephfs://mon1\;mon2\;mon3/
```

Visit your CephFS file system to verify that the files and directories created by Alluxio exist.

In CephFS, you can browse the file system with ceph-fuse or through a POSIX mount (see [Mounting CephFS](https://docs.ceph.com/en/latest/cephfs/#mounting-cephfs)). You should see files named like:

```
${ceph-fs-dir}/default_tests_files/Basic_CACHE_THROUGH
```

In Alluxio, you can browse the nested directory with Alluxio's [Command Line Interface](/ee-da-en/da-2.10/operations/user-cli.md). The same files appear under the mount point:

```
/mnt/cephfs/default_tests_files/Basic_CACHE_THROUGH
```

</details>

<details>

<summary>cephfs-hadoop</summary>

A CephFS location can be mounted at a nested directory in the Alluxio namespace to provide unified access to multiple under storage systems. Alluxio's [Command Line Interface](/ee-da-en/da-2.10/operations/user-cli.md) can be used for this purpose.

Issue the following commands to mount CephFS through the cephfs-hadoop implementation:

```console
$ ./bin/alluxio fs mkdir /mnt/cephfs-hadoop
$ ./bin/alluxio fs mount /mnt/cephfs-hadoop ceph://mon1\;mon2\;mon3/
```

Run a simple example program:

```console
$ ./bin/alluxio runTests --path cephfs://mon1\;mon2\;mon3/
```

Visit your CephFS file system to verify that the files and directories created by Alluxio exist.

In CephFS, you can browse the file system with ceph-fuse or through a POSIX mount (see [Mounting CephFS](https://docs.ceph.com/en/latest/cephfs/#mounting-cephfs)). You should see files named like:

```
${ceph-fs-dir}/default_tests_files/Basic_CACHE_THROUGH
```

In Alluxio, you can browse the nested directory with Alluxio's [Command Line Interface](/ee-da-en/da-2.10/operations/user-cli.md). The same files appear under the mount point:

```
/mnt/cephfs-hadoop/default_tests_files/Basic_CACHE_THROUGH
```

</details>


