COS
This guide describes how to configure Alluxio with Tencent Cloud Object Storage (COS) as the under storage system. COS is a distributed storage service offered by Tencent Cloud for massive amounts of data, accessible over HTTP/HTTPS. Bandwidth and capacity scale without user intervention, which makes COS well suited as a data pool for big data computation and analytics.
Prerequisites
In cluster mode Alluxio runs on multiple machines, so its binary package must be deployed on each of them.
Before using COS with Alluxio, either create a new bucket or use an existing one. Additionally, identify the directory you wish to use within that bucket, whether by creating a new directory or selecting an existing one. For this guide, the COS bucket name is COS_ALLUXIO_BUCKET, the directory within the bucket is COS_DATA, and the bucket region is COS_REGION.
Basic Setup
Alluxio unifies access to different storage systems through the unified namespace feature. The COS under file system (UFS) is used to access Tencent Cloud object storage, and a COS location can be mounted either at the root of the Alluxio namespace or as a top-level directory.
To use COS as under storage, modify the configuration file conf/alluxio-site.properties. If this is your first time modifying the configuration, create the file from the template located at conf/alluxio-site.properties.template.
Edit conf/alluxio-site.properties to set the under storage address to the COS bucket and the COS directory you want to mount to Alluxio. For example, the under storage address can be cos://COS_ALLUXIO_BUCKET/ if you want to mount the whole bucket to Alluxio, or cos://COS_ALLUXIO_BUCKET/COS_DATA if only the directory /COS_DATA inside the COS bucket COS_ALLUXIO_BUCKET is mapped to Alluxio.
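As a minimal sketch, the under storage address might be set as follows. The property key is an assumption: `alluxio.dora.client.ufs.root` is the key used by Alluxio 3.x, while 2.x releases use `alluxio.master.mount.table.root.ufs` instead, so check the configuration reference for your release before copying it.

```properties
# Assumed key for Alluxio 3.x; 2.x releases use
# alluxio.master.mount.table.root.ufs instead.
# Mount only the COS_DATA directory of the bucket:
alluxio.dora.client.ufs.root=cos://COS_ALLUXIO_BUCKET/COS_DATA

# To mount the whole bucket instead, use:
# alluxio.dora.client.ufs.root=cos://COS_ALLUXIO_BUCKET/
```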
Specify credentials for COS access by adding the following properties in conf/alluxio-site.properties:
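A hedged sketch of what the credentials section might look like is given here. The keys `fs.cos.access.key`, `fs.cos.secret.key`, `fs.cos.region`, and `fs.cos.app.id` are assumed from Alluxio's COS connector and should be verified against your release. The angle-bracket values are placeholders for your own Tencent Cloud account details, except COS_REGION, which is the bucket region named in the prerequisites.

```properties
# Tencent Cloud API credentials (placeholders, not real values)
fs.cos.access.key=<COS_SECRET_ID>
fs.cos.secret.key=<COS_SECRET_KEY>

# Region of the bucket (COS_REGION in this guide)
fs.cos.region=<COS_REGION>

# Tencent Cloud account APPID that owns the bucket
fs.cos.app.id=<COS_APP_ID>
```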
Advanced Setup
COS multipart upload
By default, a file is uploaded from start to end in a single pass. With multipart upload, a file is split into multiple parts and each part is uploaded in its own thread. No temporary files are generated during the upload.
To enable COS multipart upload, you need to modify conf/alluxio-site.properties to include:
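A minimal sketch of the setting, assuming the switch is named `alluxio.underfs.cos.multipart.upload.enabled` as in recent Alluxio releases (verify the exact key in the configuration reference for your version):

```properties
# Assumed key: turns on multipart upload for the COS under storage
alluxio.underfs.cos.multipart.upload.enabled=true
```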
There are other parameters you can specify in conf/alluxio-site.properties to tune multipart upload performance.