
CephFS

Last updated 6 months ago

This guide describes how to configure Alluxio with CephFS as the under storage system. Alluxio supports two different implementations of the under storage system for CephFS: cephfs and cephfs-hadoop.

Prerequisites

Deploy Alluxio binary package

The Alluxio binaries must be on your machine. You can either compile Alluxio or download the binaries locally.

Install Dependencies

Follow the ceph packages install instructions to install the packages below:

cephfs-java
libcephfs_jni
libcephfs2

Make symbolic links

$ ln -s /usr/lib64/libcephfs_jni.so.1.0.0 /usr/lib64/libcephfs_jni.so
$ ln -s /usr/lib64/libcephfs.so.2.0.0 /usr/lib64/libcephfs.so
$ java_path=`which java | xargs readlink | sed 's#/bin/java##g'`
$ ln -s /usr/share/java/libcephfs.jar $java_path/jre/lib/ext/libcephfs.jar
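A dangling link here tends to surface later as a library loading error, so it may be worth checking the links right after creating them. The following is a minimal sketch, not part of Alluxio or Ceph: check_paths is a hypothetical helper name, and the paths in the commented usage line are simply the link targets from the commands above.

```shell
# Hypothetical helper (not shipped with Alluxio or Ceph): report which of
# the given paths resolve to an existing file. A dangling symbolic link
# counts as missing because [ -e ] follows links.
check_paths() {
  missing=0
  for p in "$@"; do
    if [ -e "$p" ]; then
      echo "ok       $p"
    else
      echo "missing  $p"
      missing=1
    fi
  done
  # Succeed only if every path resolved.
  [ "$missing" -eq 0 ]
}

# Example call, assuming $java_path is still set from the step above:
# check_paths /usr/lib64/libcephfs_jni.so /usr/lib64/libcephfs.so \
#   "$java_path/jre/lib/ext/libcephfs.jar"
```

The function returns a non-zero status when anything is missing, so it can gate a later step in a setup script.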

Download CephFS Hadoop jar

$ curl -o $java_path/jre/lib/ext/hadoop-cephfs.jar -s https://download.ceph.com/tarballs/hadoop-cephfs.jar

Basic Setup

Configure Alluxio to use under storage systems by modifying conf/alluxio-site.properties and conf/core-site.xml. If they do not exist, create the configuration files from the templates:

$ cp conf/alluxio-site.properties.template conf/alluxio-site.properties
$ cp conf/core-site.xml.template conf/core-site.xml
cephfs

Modify conf/alluxio-site.properties to include:

alluxio.underfs.cephfs.conf.file=<ceph-conf-file>
alluxio.underfs.cephfs.mds.namespace=<ceph-fs-name>
alluxio.underfs.cephfs.mount.point=<ceph-fs-dir>
alluxio.underfs.cephfs.auth.id=<client-id>
alluxio.underfs.cephfs.auth.keyring=<client-keyring-file>
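
Before starting Alluxio, it can help to confirm that none of the five properties above was left out or still holds a placeholder such as <client-id>. The following is a minimal sketch under that assumption: check_cephfs_props is a hypothetical helper, not an Alluxio command, and it only looks at uncommented key=value lines.

```shell
# Hypothetical helper (not shipped with Alluxio): verify that a properties
# file sets every cephfs under storage property listed above and that no
# value is still a <placeholder>.
check_cephfs_props() {
  file=$1
  status=0
  for key in conf.file mds.namespace mount.point auth.id auth.keyring; do
    # Keep the last uncommented assignment of the key, if there is one.
    value=$(sed -n "s/^alluxio\.underfs\.cephfs\.${key}=//p" "$file" | tail -n 1)
    case $value in
      '')   echo "unset:       alluxio.underfs.cephfs.${key}"; status=1 ;;
      '<'*) echo "placeholder: alluxio.underfs.cephfs.${key}"; status=1 ;;
    esac
  done
  return "$status"
}

# Example call:
# check_cephfs_props conf/alluxio-site.properties
```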
cephfs-hadoop

Modify conf/alluxio-site.properties to include:

alluxio.underfs.hdfs.configuration=${ALLUXIO_HOME}/conf/core-site.xml

Modify conf/core-site.xml to include:

<configuration>
  <property>
    <name>fs.default.name</name>
    <value>ceph://mon1,mon2,mon3/</value>
  </property>
  <property>
    <name>fs.defaultFS</name>
    <value>ceph://mon1,mon2,mon3/</value>
  </property>
  <property>
    <name>ceph.data.pools</name>
    <value>${data-pools}</value>
  </property>
  <property>
    <name>ceph.auth.id</name>
    <value>${client-id}</value>
  </property>
  <property>
    <name>ceph.conf.options</name>
    <value>client_mount_gid=${gid},client_mount_uid=${uid},client_mds_namespace=${ceph-fs-name}</value>
  </property>
  <property>
    <name>ceph.root.dir</name>
    <value>${ceph-fs-dir}</value>
  </property>
  <property>
    <name>ceph.mon.address</name>
    <value>mon1,mon2,mon3</value>
  </property>
  <property>
    <name>fs.AbstractFileSystem.ceph.impl</name>
    <value>org.apache.hadoop.fs.ceph.CephFs</value>
  </property>
  <property>
    <name>fs.ceph.impl</name>
    <value>org.apache.hadoop.fs.ceph.CephFileSystem</value>
  </property>
  <property>
    <name>ceph.auth.keyring</name>
    <value>${client-keyring-file}</value>
  </property>
</configuration>
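
The listing above uses ${...} placeholders that must be replaced with real values. One way to review what the file currently sets is to flatten it into name=value lines. The sketch below is an illustration only: list_hadoop_props is a hypothetical helper, and it assumes each <name> and <value> tag sits on its own line, as in the listing. It is not a general XML parser.

```shell
# Hypothetical helper: print each <name>/<value> pair of a Hadoop-style
# configuration file as name=value. Assumes one tag per line.
list_hadoop_props() {
  awk '
    /<name>/  { gsub(/^.*<name>|<\/name>.*$/, "");   name = $0 }
    /<value>/ { gsub(/^.*<value>|<\/value>.*$/, ""); print name "=" $0 }
  ' "$1"
}

# Example: show entries that still contain an unreplaced placeholder.
# list_hadoop_props conf/core-site.xml | grep -F '${'
```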

Running Alluxio Locally with CephFS

Start up Alluxio locally to see that everything works.

$ ./bin/alluxio format
$ ./bin/alluxio-start.sh local
cephfs

Issue the following commands to mount the cephfs UFS:

$ ./bin/alluxio fs mkdir /mnt/cephfs
$ ./bin/alluxio fs mount /mnt/cephfs cephfs://mon1\;mon2\;mon3/
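
The backslashes in the mount command above stop the shell from treating each ';' as a command separator. Quoting the URI has the same effect. As a small sketch, the hypothetical helper below (cephfs_uri is not an Alluxio command) builds the URI from a comma-separated monitor list like the one used in core-site.xml; the monitor names are the placeholders from this guide.

```shell
# Hypothetical helper: turn a comma-separated monitor list into the
# semicolon-separated URI form used in the mount command above.
cephfs_uri() {
  scheme=$1    # cephfs for the cephfs UFS, ceph for cephfs-hadoop
  monitors=$2  # for example mon1,mon2,mon3
  printf '%s://%s/\n' "$scheme" "$(printf '%s' "$monitors" | tr ',' ';')"
}

# Example: the double quotes keep ';' from splitting the command.
# ./bin/alluxio fs mount /mnt/cephfs "$(cephfs_uri cephfs mon1,mon2,mon3)"
```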

Run a simple example program:

$ ./bin/alluxio runTests --path cephfs://mon1\;mon2\;mon3/

Visit your cephfs to verify that the files and directories created by Alluxio exist. The first path below is the location in CephFS and the second is the location in Alluxio:

${ceph-fs-dir}/default_tests_files/Basic_CACHE_THROUGH
/mnt/cephfs/default_tests_files/Basic_CACHE_THROUGH
cephfs-hadoop

Issue the following commands to mount the cephfs-hadoop UFS:

$ ./bin/alluxio fs mkdir /mnt/cephfs-hadoop
$ ./bin/alluxio fs mount /mnt/cephfs-hadoop ceph://mon1\;mon2\;mon3/

Run a simple example program:

$ ./bin/alluxio runTests --path cephfs://mon1\;mon2\;mon3/

Visit your cephfs to verify that the files and directories created by Alluxio exist. The first path below is the location in CephFS and the second is the location in Alluxio:

${ceph-fs-dir}/default_tests_files/Basic_CACHE_THROUGH
/mnt/cephfs-hadoop/default_tests_files/Basic_CACHE_THROUGH

The alluxio-start.sh local command should start an Alluxio master and an Alluxio worker. You can see the master UI at http://localhost:19999.

A CephFS location can be mounted at a nested directory in the Alluxio namespace to have unified access to multiple under storage systems. Alluxio's Command Line Interface can be used for this purpose.

In CephFS, you can browse the test files through ceph-fuse or a POSIX mount (see Mounting CephFS). In Alluxio, you can browse the nested directory with Alluxio's Command Line Interface.