Benchmarking S3 API Performance

COSBench (Cloud Object Storage Benchmark) is an open-source tool developed by Intel for stress testing object storage systems. Since Alluxio exposes an S3-compatible REST API, you can use COSBench to measure its read and write performance for S3-based workloads.

This guide explains how to set up and run COSBench to perform end-to-end performance tests against an Alluxio cluster.

Performance Highlights

The following results were achieved using a 4-driver COSBench cluster testing a 4-worker Alluxio cluster. The tests measured read throughput against data fully cached in Alluxio.

Large File Read Throughput (1GB files)

| Concurrency per Driver | Total Throughput |
| --- | --- |
| 1 Thread | 2.35 GB/s |
| 16 Threads | 20.44 GB/s |
| 128 Threads | 36.94 GB/s |

Small File Read IOPS (100KB files)

| Concurrency per Driver | Total Throughput | Total Operations/sec |
| --- | --- | --- |
| 1 Thread | 50.26 MB/s | 502 op/s |
| 16 Threads | 1.10 GB/s | 11,302 op/s |
| 128 Threads | 4.69 GB/s | 46,757 op/s |
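
As a rough consistency check on the small-file results above, throughput should be close to operations per second multiplied by object size. The sketch below is illustrative only: the function name `estimated_throughput_mb_s` is hypothetical, and it assumes decimal units (1 MB = 1000 KB), which the source does not state.

```python
def estimated_throughput_mb_s(ops_per_sec: float, object_size_kb: float) -> float:
    """Estimate throughput in MB/s from op/s and object size.

    Assumes decimal units (1 MB = 1000 KB); this is an assumption,
    not something the benchmark report specifies.
    """
    return ops_per_sec * object_size_kb / 1000.0


# Operations/sec figures copied from the small-file table (100KB objects).
reported_ops = [("1 Thread", 502), ("16 Threads", 11_302), ("128 Threads", 46_757)]

for label, ops in reported_ops:
    print(f"{label}: ~{estimated_throughput_mb_s(ops, 100):.1f} MB/s")
```

The estimates land within a few percent of the reported throughput column. The remaining gap may come from rounding or from the unit convention used in the report, which is not specified here.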

Test Environment

The benchmark results were generated using the following environment, with all instances deployed in the same AWS availability zone.

COSBench Cluster:
- 1 Controller Node (c5n.metal)
- 4 Driver Nodes (c5n.metal)
Alluxio Cluster:
- 1 Coordinator Node
- 4 Worker Nodes (i3en.metal with 8 NVMe SSDs each)
Load Balancer:
- An AWS Elastic Load Balancer (ELB) was configured to distribute S3 requests evenly across the four Alluxio workers.

Setup and Configuration

Prerequisites

- Operating System: Running COSBench on CentOS 7 or later is recommended.
- Pre-loaded Data: For read benchmarks, ensure your test dataset is fully loaded into the Alluxio cache from the underlying UFS.

COSBench Installation

On all COSBench controller and driver nodes:

Download COSBench: Download COSBench version 0.4.2.c4 and unzip it.

Install Dependencies:

sudo yum install nmap-ncat curl java-1.8.0-openjdk-devel -y

Disable MD5 Validation: Edit the cosbench-start.sh script and add the following Java property to the java launch command. It disables MD5 validation for S3 GET requests, which is necessary for compatibility with Alluxio's S3 API.
```
-Dcom.amazonaws.services.s3.disableGetObjectMD5Validation=true
```
Start COSBench Services: From the COSBench root directory, start the controller and all drivers.
```
sudo bash start-all.sh
```
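
To confirm that the services came up, one option is to check that their web ports accept connections. This is a minimal sketch, not part of COSBench: `port_open` is a hypothetical helper, port 19088 is taken from the controller URL used later in this guide, and port 18088 for the driver is an assumption based on COSBench's usual default. Adjust both if your configuration differs.

```python
import socket


def port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


# Assumed ports: 19088 (controller, from this guide) and 18088 (driver, assumed default).
SERVICES = {"controller": 19088, "driver": 18088}

for name, port in SERVICES.items():
    state = "listening" if port_open("localhost", port) else "not reachable"
    print(f"{name} on port {port}: {state}")
```

Run it on each node, or replace `localhost` with the node's address to check remotely.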

Running the Benchmark

1. Configure the Workload

Create an XML file (e.g., s3-benchmark.xml) to define the test workload. A COSBench workload consists of several stages:

- init: Creates the test buckets.
- prepare: Writes the initial data that will be used for testing.
- main: The primary testing stage. Runs the specified mix of read/write operations for a set duration.
- cleanup: Deletes the objects created during the test.
- dispose: Deletes the buckets.

Note: If you are using Alluxio mount points, you cannot create new buckets via the S3 API. You must skip the init and dispose stages and use pre-existing buckets that match your mount configuration.
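
If you start from a workload file that includes bucket creation and deletion, the init and dispose stages can be removed programmatically. The following is a minimal sketch using Python's standard `xml.etree.ElementTree`. The helper name `drop_bucket_stages` and the shortened `SAMPLE` workload are illustrative and not part of COSBench.

```python
import xml.etree.ElementTree as ET

# Shortened, illustrative workload; a real file would also carry a <storage> element.
SAMPLE = """<workload name="s3-sample" description="sample benchmark for s3">
  <workflow>
    <workstage name="init">
      <work type="init" workers="1" config="cprefix=s3testqwer;containers=r(1,2)" />
    </workstage>
    <workstage name="main">
      <work name="main" workers="8" runtime="30" />
    </workstage>
    <workstage name="dispose">
      <work type="dispose" workers="1" config="cprefix=s3testqwer;containers=r(1,2)" />
    </workstage>
  </workflow>
</workload>"""


def drop_bucket_stages(workload_xml: str) -> ET.Element:
    """Return the workload with every init/dispose workstage removed."""
    root = ET.fromstring(workload_xml)
    workflow = root.find("workflow")
    # Iterate over a copy because stages are removed while looping.
    for stage in list(workflow.findall("workstage")):
        work_types = {work.get("type") for work in stage.findall("work")}
        if work_types & {"init", "dispose"}:
            workflow.remove(stage)
    return root


trimmed = drop_bucket_stages(SAMPLE)
print([stage.get("name") for stage in trimmed.iter("workstage")])
print(ET.tostring(trimmed, encoding="unicode"))
```

The remaining stages still need a `cprefix` and container range that resolve to buckets already present in your mount configuration.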

Example: Basic Read/Write Workload

This example shows a simple workload that creates two buckets, writes 10 objects of 64KB to each, runs a 30-second test with an 80/20 read-write ratio, and then cleans up.

<?xml version="1.0" encoding="UTF-8" ?>
<workload name="s3-sample" description="sample benchmark for s3">
  <storage type="s3" config="accesskey=root;secretkey=dump;endpoint=http://localhost:29998;path_style_access=true" />

  <workflow>
    <workstage name="init">
      <work type="init" workers="1" config="cprefix=s3testqwer;containers=r(1,2)" />
    </workstage>

    <workstage name="prepare">
      <work type="prepare" workers="1" config="cprefix=s3testqwer;containers=r(1,2);objects=r(1,10);sizes=c(64)KB" />
    </workstage>

    <workstage name="main">
      <work name="main" workers="8" runtime="30">
        <operation type="read" ratio="80" config="cprefix=s3testqwer;containers=u(1,2);objects=u(1,10)" />
        <operation type="write" ratio="20" config="cprefix=s3testqwer;containers=u(1,2);objects=u(11,20);sizes=c(64)KB" />
      </work>
    </workstage>

    <workstage name="cleanup">
      <work type="cleanup" workers="1" config="cprefix=s3testqwer;containers=r(1,2);objects=r(1,20)" />
    </workstage>

    <workstage name="dispose">
      <work type="dispose" workers="1" config="cprefix=s3testqwer;containers=r(1,2)" />
    </workstage>
  </workflow>
</workload>
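
Before running a prepare stage, it can help to estimate how much data it writes, for example to confirm the dataset fits in the Alluxio cache. The sketch below multiplies containers, objects per container, and object size. The function name `prepared_bytes` is hypothetical, and treating 1 KB as 1000 bytes is an assumption about the unit convention.

```python
def prepared_bytes(containers: int, objects_per_container: int, object_size_kb: int) -> int:
    """Total bytes written by a prepare stage, assuming 1 KB = 1000 bytes."""
    return containers * objects_per_container * object_size_kb * 1000


# Basic workload above: containers=r(1,2), objects=r(1,10), sizes=c(64)KB.
basic = prepared_bytes(containers=2, objects_per_container=10, object_size_kb=64)

# High-concurrency read test: containers=r(2,2), objects=r(1,10000), sizes=c(100)KB.
high_concurrency = prepared_bytes(containers=1, objects_per_container=10_000, object_size_kb=100)

print(f"basic prepare stage: {basic / 1e6:.2f} MB")
print(f"high-concurrency prepare stage: {high_concurrency / 1e9:.2f} GB")
```

This counts only the prepare stage. Objects written during a main stage, such as objects 11 to 20 in the basic workload, add to the total.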

Example: High-Concurrency Read Test

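This example prepares 10,000 small objects (100KB) and then uses four drivers, each with 128 worker threads, to read the objects concurrently for 300 seconds. Replace <ip> in the storage endpoint with the address of your Alluxio S3 endpoint before submitting; the literal angle brackets are not valid inside an XML attribute. Because the workload has no init stage, the target bucket (ufs2, formed from the cprefix and the container index) must already exist.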

<?xml version="1.0" encoding="UTF-8" ?>
<workload name="s3-sample" description="sample benchmark for s3">
  <storage type="s3" config="accesskey=root;secretkey=dump;endpoint=http://<ip>:29998;path_style_access=true" />
  <workflow>
    <workstage name="prepare">
      <work type="prepare" workers="100" config="cprefix=ufs;containers=r(2,2);oprefix=myobjects;osuffix=.jpg;objects=r(1,10000);sizes=c(100)KB" />
    </workstage>

    <workstage name="128">
      <work name="read" workers="128" driver="driver1" runtime="300">
        <operation type="read" ratio="100" config="cprefix=ufs;containers=r(2,2);oprefix=myobjects;osuffix=.jpg;objects=u(1,10000)" />
      </work>
      <work name="read" workers="128" driver="driver2" runtime="300">
        <operation type="read" ratio="100" config="cprefix=ufs;containers=r(2,2);oprefix=myobjects;osuffix=.jpg;objects=u(1,10000)" />
      </work>
      <work name="read" workers="128" driver="driver3" runtime="300">
        <operation type="read" ratio="100" config="cprefix=ufs;containers=r(2,2);oprefix=myobjects;osuffix=.jpg;objects=u(1,10000)" />
      </work>
      <work name="read" workers="128" driver="driver4" runtime="300">
        <operation type="read" ratio="100" config="cprefix=ufs;containers=r(2,2);oprefix=myobjects;osuffix=.jpg;objects=u(1,10000)" />
      </work>
    </workstage>
  </workflow>
</workload>
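
The four `<work>` elements in the read stage differ only in their `driver` attribute, so the stage can be generated instead of edited by hand when you change the driver count or thread count. This is a minimal sketch using Python's standard `xml.etree.ElementTree`. The function `build_read_stage` and its parameters are hypothetical, and the driver names are assumed to follow the `driver1` to `driverN` pattern used above.

```python
import xml.etree.ElementTree as ET

# Operation config copied from the read stage in the example above.
READ_CONFIG = (
    "cprefix=ufs;containers=r(2,2);oprefix=myobjects;"
    "osuffix=.jpg;objects=u(1,10000)"
)


def build_read_stage(drivers, workers=128, runtime=300, config=READ_CONFIG):
    """Return a <workstage> element with one read-only <work> per driver."""
    stage = ET.Element("workstage", name=str(workers))
    for driver in drivers:
        work = ET.SubElement(
            stage, "work",
            name="read", workers=str(workers), driver=driver, runtime=str(runtime),
        )
        ET.SubElement(work, "operation", type="read", ratio="100", config=config)
    return stage


stage = build_read_stage([f"driver{i}" for i in range(1, 5)])
if hasattr(ET, "indent"):  # ET.indent is available from Python 3.9
    ET.indent(stage)
print(ET.tostring(stage, encoding="unicode"))
```

Paste the printed element into the `<workflow>` section, or call the function several times with different `workers` values to build a sweep such as 1, 16, and 128 threads per driver.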

2. Submit the Workload

Use the cli.sh script to submit your workload XML file to the COSBench controller.

bash cli.sh submit conf/s3-benchmark.xml

3. Monitor the Results

You can monitor the status and view the results of your benchmark jobs from the COSBench web interface, available at http://<CONTROLLER_IP>:19088/controller/index.html.

4. Stop COSBench Services

Once you have completed your tests, stop the services.

sudo bash stop-all.sh
