
CephObjectStorage

Last updated 6 months ago

This guide describes how to configure Alluxio with Ceph Object Storage as the under storage system. Alluxio supports two different client APIs to connect to Ceph Object Storage using Rados Gateway:

  • S3 (preferred)
  • Swift

Prerequisites

The Alluxio binaries must be on your machine. You can either compile Alluxio, or download the binaries locally.

Basic Setup

A Ceph bucket can be mounted to Alluxio either at the root of the namespace, or at a nested directory.

Root Mount Point

Configure Alluxio to use under storage systems by modifying conf/alluxio-site.properties. If it does not exist, create the configuration file from the template.

$ cp conf/alluxio-site.properties.template conf/alluxio-site.properties

Option 1: S3 Interface (preferred)

Modify conf/alluxio-site.properties to include:

alluxio.master.mount.table.root.ufs=s3://<bucket>/<folder>
alluxio.master.mount.table.root.option.s3a.accessKeyId=<access-key>
alluxio.master.mount.table.root.option.s3a.secretKey=<secret-key>
alluxio.master.mount.table.root.option.alluxio.underfs.s3.endpoint=http://<rgw-hostname>:<rgw-port>
alluxio.master.mount.table.root.option.alluxio.underfs.s3.disable.dns.buckets=true
alluxio.master.mount.table.root.option.alluxio.underfs.s3.inherit.acl=<inherit-acl>

If using a Ceph release such as hammer (or older), specify alluxio.underfs.s3.signer.algorithm=S3SignerType to use v2 S3 signatures. To use GET Bucket (List Objects) Version 1, specify alluxio.underfs.s3.list.objects.v1=true.
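
As a rough illustration of the S3 root mount settings above, the following sketch prints the six properties from a handful of arguments so they can be appended to conf/alluxio-site.properties. The function name render_ceph_s3_root_props and its argument order are hypothetical helpers for this example, not part of Alluxio.

```shell
# Sketch: print the Ceph S3 root-mount properties listed above.
# render_ceph_s3_root_props is a hypothetical helper, not an Alluxio command.
render_ceph_s3_root_props() {
  # $1 = UFS URI (s3://<bucket>/<folder>), $2 = access key, $3 = secret key,
  # $4 = Rados Gateway endpoint, $5 = inherit ACL (true or false)
  prefix="alluxio.master.mount.table.root"
  printf '%s.ufs=%s\n' "$prefix" "$1"
  printf '%s.option.s3a.accessKeyId=%s\n' "$prefix" "$2"
  printf '%s.option.s3a.secretKey=%s\n' "$prefix" "$3"
  printf '%s.option.alluxio.underfs.s3.endpoint=%s\n' "$prefix" "$4"
  printf '%s.option.alluxio.underfs.s3.disable.dns.buckets=true\n' "$prefix"
  printf '%s.option.alluxio.underfs.s3.inherit.acl=%s\n' "$prefix" "$5"
}
```

One possible use, with placeholder values left for you to fill in, is render_ceph_s3_root_props "s3://<bucket>/<folder>" "<access-key>" "<secret-key>" "http://<rgw-hostname>:<rgw-port>" false >> conf/alluxio-site.properties.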

Option 2: Swift Interface

Modify conf/alluxio-site.properties to include:

alluxio.master.mount.table.root.ufs=swift://<bucket>/<folder>
alluxio.master.mount.table.root.option.fs.swift.user=<swift-user>
alluxio.master.mount.table.root.option.fs.swift.tenant=<swift-tenant>
alluxio.master.mount.table.root.option.fs.swift.password=<swift-user-password>
alluxio.master.mount.table.root.option.fs.swift.auth.url=<swift-auth-url>
alluxio.master.mount.table.root.option.fs.swift.auth.method=<swift-auth-method>

Replace <bucket>/<folder> with an existing Swift container location. Possible values of <swift-use-public> are true and false. Specify <swift-auth-method> as swiftauth if using native Ceph RGW authentication, and <swift-auth-url> as http://<rgw-hostname>:<rgw-port>/auth/1.0.
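
For the native Ceph RGW authentication case described above, a minimal sketch of how the two authentication properties fit together might look like this. The helper name render_rgw_swift_auth_props is hypothetical; only the property names, the swiftauth value, and the /auth/1.0 path come from this guide.

```shell
# Sketch: print the Swift auth properties for native Ceph RGW authentication.
# render_rgw_swift_auth_props is a hypothetical helper, not an Alluxio command.
render_rgw_swift_auth_props() {
  # $1 = Rados Gateway hostname, $2 = Rados Gateway port
  prefix="alluxio.master.mount.table.root.option"
  printf '%s.fs.swift.auth.method=swiftauth\n' "$prefix"
  printf '%s.fs.swift.auth.url=http://%s:%s/auth/1.0\n' "$prefix" "$1" "$2"
}
```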

Nested Mount Point

Issue the following command to use the S3 interface:

$ ./bin/alluxio fs mount \
  --option s3a.accessKeyId=<CEPH_ACCESS_KEY_ID> \
  --option s3a.secretKey=<CEPH_SECRET_ACCESS_KEY> \
  --option alluxio.underfs.s3.endpoint=<HTTP_ENDPOINT> \
  --option alluxio.underfs.s3.disable.dns.buckets=true \
  --option alluxio.underfs.s3.inherit.acl=false \
  /mnt/ceph s3://<BUCKET>/<FOLDER>

Similarly, to use the Swift interface:

$ ./bin/alluxio fs mount \
  --option fs.swift.user=<SWIFT_USER> \
  --option fs.swift.tenant=<SWIFT_TENANT> \
  --option fs.swift.password=<SWIFT_PASSWORD> \
  --option fs.swift.auth.url=<AUTH_URL> \
  --option fs.swift.auth.method=<AUTH_METHOD> \
  /mnt/ceph swift://<BUCKET>/<FOLDER>
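
If the S3 mount command is run often, it can be wrapped in a small function that reads the credentials from the environment instead of the command line. This is only a sketch: mount_ceph_s3 and the ALLUXIO_BIN override are hypothetical names introduced here, and the environment variable names simply mirror the placeholders used above.

```shell
# Sketch: mount a Ceph bucket through the S3 interface at a nested Alluxio path.
# mount_ceph_s3 and ALLUXIO_BIN are hypothetical names used only by this example.
mount_ceph_s3() {
  # $1 = Alluxio path, $2 = UFS URI (s3://<BUCKET>/<FOLDER>), $3 = RGW HTTP endpoint
  # The ${VAR:?message} form aborts with an error if a credential is not set.
  "${ALLUXIO_BIN:-./bin/alluxio}" fs mount \
    --option "s3a.accessKeyId=${CEPH_ACCESS_KEY_ID:?set CEPH_ACCESS_KEY_ID}" \
    --option "s3a.secretKey=${CEPH_SECRET_ACCESS_KEY:?set CEPH_SECRET_ACCESS_KEY}" \
    --option "alluxio.underfs.s3.endpoint=$3" \
    --option alluxio.underfs.s3.disable.dns.buckets=true \
    --option alluxio.underfs.s3.inherit.acl=false \
    "$1" "$2"
}
```

After exporting CEPH_ACCESS_KEY_ID and CEPH_SECRET_ACCESS_KEY, a call such as mount_ceph_s3 /mnt/ceph "s3://<BUCKET>/<FOLDER>" "<HTTP_ENDPOINT>" issues the same mount command as above, and ./bin/alluxio fs ls /mnt/ceph can then be used to check the result.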

Running Alluxio Locally with Ceph

Start up Alluxio locally to see that everything works.

$ ./bin/alluxio format
$ ./bin/alluxio-start.sh local

Run a simple example program:

$ ./bin/alluxio runTests

Visit your bucket to verify the files and directories created by Alluxio exist.

You should see files named like:

<bucket>/<folder>/default_tests_files/Basic_CACHE_THROUGH
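
One way to check the bucket from the command line, assuming the S3 interface is in use and the AWS CLI is installed and configured with the same Ceph credentials, is sketched below. The function list_alluxio_test_files and the AWS_CMD override are hypothetical names for this example; the AWS CLI itself is an assumption and is not required by Alluxio.

```shell
# Sketch: list the objects written by runTests, using the AWS CLI against
# the Rados Gateway endpoint. list_alluxio_test_files and AWS_CMD are
# hypothetical names; the AWS CLI is assumed to be installed.
list_alluxio_test_files() {
  # $1 = bucket, $2 = folder, $3 = RGW HTTP endpoint
  "${AWS_CMD:-aws}" s3 ls "s3://$1/$2/default_tests_files/" --endpoint-url "$3"
}
```

Calling list_alluxio_test_files "<bucket>" "<folder>" "http://<rgw-hostname>:<rgw-port>" should list entries such as Basic_CACHE_THROUGH if the tests wrote through to Ceph.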

To stop Alluxio, run:

$ ./bin/alluxio-stop.sh local

Advanced Setup

Access Control

A Ceph location can be mounted at a nested directory in the Alluxio namespace to have unified access to multiple under storage systems. Alluxio's Command Line Interface can be used for this purpose.

Starting Alluxio locally with ./bin/alluxio-start.sh local should start an Alluxio master and an Alluxio worker. You can see the master UI at http://localhost:19999.

If Alluxio security is enabled, Alluxio enforces the access control inherited from the underlying Ceph Object Storage. Depending on the interface used, refer to S3 Access Control or Swift Access Control for more information.
