This guide describes how to configure Alluxio with CephFS as the under storage system. Alluxio supports two implementations of the CephFS under storage system: cephfs and cephfs-hadoop.
Prerequisites
Deploy Alluxio binary package
The Alluxio binaries must be on your machine. You can either compile Alluxio or download the binaries locally.
Install Dependencies
Follow the Ceph package installation instructions to install the following packages:
cephfs-java
libcephfs_jni
libcephfs2
Make symbolic links
$ ln -s /usr/lib64/libcephfs_jni.so.1.0.0 /usr/lib64/libcephfs_jni.so
$ ln -s /usr/lib64/libcephfs.so.2.0.0 /usr/lib64/libcephfs.so
$ java_path=`which java | xargs readlink | sed 's#/bin/java##g'`
$ ln -s /usr/share/java/libcephfs.jar $java_path/jre/lib/ext/libcephfs.jar
Download CephFS Hadoop jar
$ curl -o $java_path/jre/lib/ext/hadoop-cephfs.jar -s https://download.ceph.com/tarballs/hadoop-cephfs.jar
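Before moving on, it can help to confirm that the links and jars created above actually resolve. The following is a minimal sketch, not something shipped with Alluxio or Ceph: check_cephfs_deps is a hypothetical helper name, and the example paths assume the same /usr/lib64 layout and $java_path variable used in the commands above.

```shell
#!/bin/sh
# Hypothetical helper (not part of Alluxio or Ceph): report every path that
# does not resolve to an existing file. A dangling symlink counts as missing
# because `[ -e ]` follows links.
check_cephfs_deps() {
  missing=0
  for dep in "$@"; do
    if [ -e "$dep" ]; then
      echo "ok       $dep"
    else
      echo "missing  $dep"
      missing=1
    fi
  done
  # Non-zero exit status if at least one path was missing.
  return "$missing"
}

# Example (paths taken from the steps above):
#   check_cephfs_deps /usr/lib64/libcephfs_jni.so /usr/lib64/libcephfs.so \
#     "$java_path/jre/lib/ext/libcephfs.jar" "$java_path/jre/lib/ext/hadoop-cephfs.jar"
```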
Basic Setup
Configure Alluxio to use CephFS as the under storage system by modifying conf/alluxio-site.properties and conf/core-site.xml. If they do not exist, create the configuration files from the templates:
$ cp conf/alluxio-site.properties.template conf/alluxio-site.properties
$ cp conf/core-site.xml.template conf/core-site.xml
cephfs
Modify conf/alluxio-site.properties to include:
alluxio.underfs.cephfs.conf.file=<ceph-conf-file>
alluxio.underfs.cephfs.mds.namespace=<ceph-fs-name>
alluxio.underfs.cephfs.mount.point=<ceph-fs-dir>
alluxio.underfs.cephfs.auth.id=<client-id>
alluxio.underfs.cephfs.auth.keyring=<client-keyring-file>
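A quick check that each of these keys is present in the properties file may save a restart cycle. The sketch below is a hypothetical helper (missing_cephfs_keys is not an Alluxio command). It only tests that each of the five keys listed above has a line of the form key=value with no spaces around the equals sign, and it does not validate the values. Treating all five keys as mandatory is an assumption of this sketch.

```shell
#!/bin/sh
# Hypothetical helper: print each CephFS key from the list above that has no
# "key=value" line in the given properties file. Returns non-zero if any key
# is absent. Treating all five as mandatory is an assumption of this sketch.
missing_cephfs_keys() {
  props_file=$1
  rc=0
  for key in \
    alluxio.underfs.cephfs.conf.file \
    alluxio.underfs.cephfs.mds.namespace \
    alluxio.underfs.cephfs.mount.point \
    alluxio.underfs.cephfs.auth.id \
    alluxio.underfs.cephfs.auth.keyring
  do
    # Compare the text before the first "=" exactly, so dots stay literal
    # and commented-out lines (starting with "#") do not match.
    if ! awk -F= -v k="$key" '$1 == k { found = 1 } END { exit !found }' "$props_file"; then
      echo "$key"
      rc=1
    fi
  done
  return "$rc"
}

# Example: missing_cephfs_keys conf/alluxio-site.properties
```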
cephfs-hadoop
Modify conf/alluxio-site.properties to include:
alluxio.underfs.hdfs.configuration=${ALLUXIO_HOME}/conf/core-site.xml
Modify conf/core-site.xml to include:
<configuration>
<property>
<name>fs.default.name</name>
<value>ceph://mon1,mon2,mon3/</value>
</property>
<property>
<name>fs.defaultFS</name>
<value>ceph://mon1,mon2,mon3/</value>
</property>
<property>
<name>ceph.data.pools</name>
<value>${data-pools}</value>
</property>
<property>
<name>ceph.auth.id</name>
<value>${client-id}</value>
</property>
<property>
<name>ceph.conf.options</name>
<value>client_mount_gid=${gid},client_mount_uid=${uid},client_mds_namespace=${ceph-fs-name}</value>
</property>
<property>
<name>ceph.root.dir</name>
<value>${ceph-fs-dir}</value>
</property>
<property>
<name>ceph.mon.address</name>
<value>mon1,mon2,mon3</value>
</property>
<property>
<name>fs.AbstractFileSystem.ceph.impl</name>
<value>org.apache.hadoop.fs.ceph.CephFs</value>
</property>
<property>
<name>fs.ceph.impl</name>
<value>org.apache.hadoop.fs.ceph.CephFileSystem</value>
</property>
<property>
<name>ceph.auth.keyring</name>
<value>${client-keyring-file}</value>
</property>
</configuration>
Running Alluxio Locally with CephFS
Start up Alluxio locally to see that everything works.
$ ./bin/alluxio format
$ ./bin/alluxio-start.sh local
This should start an Alluxio master and an Alluxio worker. You can see the master UI at http://localhost:19999.
cephfs
A CephFS location can be mounted at a nested directory in the Alluxio namespace to provide unified access to multiple under storage systems. Alluxio's Command Line Interface can be used for this purpose.
Issue the following commands to mount the cephfs under storage:
$ ./bin/alluxio fs mkdir /mnt/cephfs
$ ./bin/alluxio fs mount /mnt/cephfs cephfs://mon1\;mon2\;mon3/
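The backslashes in the mount URI are shell escaping: an unquoted semicolon would end the command, so each one is escaped and Alluxio receives a plain semicolon-separated monitor list. As a rough sketch, the URI can also be assembled from a host list and passed in quotes. Here cephfs_uri is a hypothetical helper name, and mon1, mon2, mon3 are the placeholder hosts from the command above.

```shell
#!/bin/sh
# Hypothetical helper: join monitor hosts with ";" and print a UFS URI.
# Usage: cephfs_uri <scheme> <mon-host>...
cephfs_uri() {
  scheme=$1
  shift
  # Inside the subshell, "$*" joins the arguments with the first IFS character.
  mons=$(IFS=';'; printf '%s' "$*")
  printf '%s://%s/\n' "$scheme" "$mons"
}

# Example (quoting the result replaces the backslash escapes):
#   ./bin/alluxio fs mount /mnt/cephfs "$(cephfs_uri cephfs mon1 mon2 mon3)"
```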
Run a simple example program:
$ ./bin/alluxio runTests --path cephfs://mon1\;mon2\;mon3/
Visit your CephFS to verify that the files and directories created by Alluxio exist. You can access CephFS with ceph-fuse or by mounting it for access through POSIX APIs (see "Mounting CephFS" in the Ceph documentation). You should see files named like:
${ceph-fs-dir}/default_tests_files/Basic_CACHE_THROUGH
In Alluxio, you can browse the nested directory with Alluxio's Command Line Interface:
/mnt/cephfs/default_tests_files/Basic_CACHE_THROUGH
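To check this from a script rather than by eye, one option is to list the test directory with the Alluxio CLI and look for the expected file name. The following is a sketch only: has_test_file is a hypothetical helper, the path to the alluxio launcher is passed in as an argument, and the match is on the file name anywhere in the listing output so that it does not depend on the exact column layout of alluxio fs ls.

```shell
#!/bin/sh
# Hypothetical helper: succeed if `alluxio fs ls` on the test directory under
# the given mount point mentions the expected file name.
# Usage: has_test_file <alluxio-launcher> <alluxio-mount-dir> [file-name]
has_test_file() {
  alluxio_bin=$1
  mount_dir=$2
  name=${3:-Basic_CACHE_THROUGH}
  # Fail early if the listing command itself fails (e.g. nothing is mounted).
  listing=$("$alluxio_bin" fs ls "$mount_dir/default_tests_files") || return 1
  case "$listing" in
    *"$name"*) return 0 ;;
    *) return 1 ;;
  esac
}

# Example: has_test_file ./bin/alluxio /mnt/cephfs && echo "test files found"
```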
cephfs-hadoop
A CephFS location can be mounted at a nested directory in the Alluxio namespace to provide unified access to multiple under storage systems. Alluxio's Command Line Interface can be used for this purpose.
Issue the following commands to mount the cephfs-hadoop under storage:
$ ./bin/alluxio fs mkdir /mnt/cephfs-hadoop
$ ./bin/alluxio fs mount /mnt/cephfs-hadoop ceph://mon1\;mon2\;mon3/
Run a simple example program:
$ ./bin/alluxio runTests --path ceph://mon1\;mon2\;mon3/
Visit your CephFS to verify that the files and directories created by Alluxio exist. You can access CephFS with ceph-fuse or by mounting it for access through POSIX APIs (see "Mounting CephFS" in the Ceph documentation). You should see files named like:
${ceph-fs-dir}/default_tests_files/Basic_CACHE_THROUGH
In Alluxio, you can browse the nested directory with Alluxio's Command Line Interface:
/mnt/cephfs-hadoop/default_tests_files/Basic_CACHE_THROUGH