Baidu Object Storage

This guide describes how to configure Baidu Object Storage (BOS) as Alluxio's under storage system. BOS provides stable, secure, efficient, and highly scalable storage services.

Prerequisites

Before using BOS with Alluxio, follow the BOS Process for Getting Started to sign up for BOS and create a BOS bucket.

Before you get started, please ensure you have the required information listed below:

<BOS_BUCKET>

The BOS bucket to mount, either a newly created bucket or an existing one

<BOS_DIRECTORY>

The directory you want to use in the bucket, either by creating a new directory or using an existing one

<BOS_ACCESS_KEY_ID>

The Access Key ID for BOS, which is created and managed in the BOS AccessKey management console

<BOS_ACCESS_KEY_SECRET>

The Secret Access Key for BOS, which is created and managed in the BOS AccessKey management console

<BOS_ENDPOINT>

The internet endpoint of the bucket, shown on the bucket overview page, for example bj.bcebos.com or gz.bcebos.com. Available endpoints are listed in Region与Endpoint (Regions and Endpoints).

<BOS_REGION>

The region where the bucket is located, such as cn-beijing or cn-guangzhou. Available regions are listed in Region与Endpoint (Regions and Endpoints).

Basic Setup

For the general mount mechanism and UnderFileSystem CR field reference, see Underlying Storage.

An example ufs.yaml to create a BOS mount point with the operator:

apiVersion: k8s-operator.alluxio.com/v1
kind: UnderFileSystem
metadata:
  name: alluxio-bos
  namespace: alx-ns
spec:
  alluxioCluster: alluxio-cluster
  path: bos://<BOS_BUCKET>/<BOS_DIRECTORY>
  mountPath: /bos
  mountOptions:
    fs.bos.accessKeyId: <BOS_ACCESS_KEY_ID>
    fs.bos.accessKeySecret: <BOS_ACCESS_KEY_SECRET>
    fs.bos.endpoint: <BOS_ENDPOINT>

Advanced Setup

Note that configuration options can be specified as mount options or as configuration properties in conf/alluxio-site.properties. The following sections will describe how to set configurations as properties, but they can also be set as mount options via --option <key>=<value>.

Enabling HTTPS

To use HTTPS for communication with BOS, which adds a layer of security to data transfers, configure the following setting in conf/alluxio-site.properties:

BOS multipart upload

Alluxio uses multipart upload to split a file into multiple parts, with each part uploaded by one thread. No temporary files are generated during the upload.

There are other parameters you can specify in conf/alluxio-site.properties to potentially speed up the upload.

To disable BOS multipart upload, so that each file is uploaded from start to end in a single pass, modify conf/alluxio-site.properties to include:

Setting Request Retry Policy

For retry configuration that applies to all object store UFS types, see Request Retry Policy in the UFS overview.

High Concurrency Tuning

When integrating Alluxio with BOS, you can optimize performance by adjusting the following configurations:

  • alluxio.underfs.bos.connection.max: Controls the maximum number of connections to BOS. Default value is 1024.

  • alluxio.underfs.bos.io.threads.num: Controls the number of I/O threads used with BOS. Default value is 256.

  • alluxio.underfs.bos.socket.timeout: Controls the socket timeout for BOS connections. Default value is 50 seconds.

  • alluxio.underfs.bos.connect.timeout: Controls the timeout for establishing a connection to BOS. Default value is 50 seconds.
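
As a minimal sketch, the four properties listed above could be set together in conf/alluxio-site.properties as shown here. The values only restate the documented defaults and are a starting point for tuning, not recommendations. The `50sec` duration notation is an assumption about how time values are written and should be checked against your Alluxio version.

```properties
# Maximum number of connections to BOS (documented default: 1024)
alluxio.underfs.bos.connection.max=1024

# Number of I/O threads used with BOS (documented default: 256)
alluxio.underfs.bos.io.threads.num=256

# Socket timeout for BOS connections (documented default: 50 seconds;
# the "50sec" notation is an assumption, verify for your version)
alluxio.underfs.bos.socket.timeout=50sec

# Timeout for establishing a connection to BOS (documented default: 50 seconds;
# the "50sec" notation is an assumption, verify for your version)
alluxio.underfs.bos.connect.timeout=50sec
```

When raising the connection limit or thread count for highly concurrent workloads, it is usually worth changing one value at a time and measuring the effect before adjusting the next.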
