# Amazon S3

This guide describes how to configure [Amazon S3](https://aws.amazon.com/s3/) as Alluxio's under storage system. Amazon Simple Storage Service (Amazon S3) is an object storage service offering industry-leading scalability, data availability, security, and performance. For more information about Amazon S3, please read its [documentation](https://docs.aws.amazon.com/s3/index.html).

S3 compatible storages are also supported. See [S3 compatible storages](https://documentation.alluxio.io/ee-ai-en/ufs/s3-compatible) for specific examples.

## Prerequisites

Before you get started, in preparation for using Amazon S3 with Alluxio, please ensure you have the required information listed below:

| `<S3_BUCKET>`        | [Create a new S3 bucket](https://docs.aws.amazon.com/AmazonS3/latest/userguide/create-bucket-overview.html) or use an existing bucket                                                          |
| -------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `<S3_DIRECTORY>`     | The directory you want to use in that bucket, either by creating a new directory or using an existing one.                                                                                     |
| `<S3_ACCESS_KEY_ID>` | Used to sign programmatic requests made to AWS. See [How to Obtain Access Key ID and Secret Access Key](https://docs.aws.amazon.com/powershell/latest/userguide/pstools-appendix-sign-up.html) |
| `<S3_SECRET_KEY>`    | Used to sign programmatic requests made to AWS. See [How to Obtain Access Key ID and Secret Access Key](https://docs.aws.amazon.com/powershell/latest/userguide/pstools-appendix-sign-up.html) |

## Basic Setup

Use the [mount table operations](https://documentation.alluxio.io/ee-ai-en/ufs) to add a new mount point, specifying the Alluxio path to create the mount on and the S3 path as the UFS URI. Credentials and configuration options can also be specified as part of the mount operation as described by [configuring mount points](https://documentation.alluxio.io/ee-ai-en/ufs).

{% tabs %}
{% tab title="Kubernetes (Operator)" %}
An example `ufs.yaml` to create a mount point with the operator:

```yaml
apiVersion: k8s-operator.alluxio.com/v1
kind: UnderFileSystem
metadata:
  name: alluxio-s3
  namespace: alx-ns
spec:
  alluxioCluster: alluxio-cluster
  path: s3://<S3_BUCKET>/<S3_DIRECTORY>
  mountPath: /s3
  mountOptions:
    s3a.accessKeyId: <S3_ACCESS_KEY_ID>
    s3a.secretKey: <S3_SECRET_KEY>
    alluxio.underfs.s3.region: <S3_REGION>
```

{% endtab %}

{% tab title="Docker / Bare-Metal" %}
An example command to mount `s3://<S3_BUCKET>/<S3_DIRECTORY>` to `/s3` if not using the operator:

```shell
bin/alluxio mount add --path /s3/ --ufs-uri s3://<S3_BUCKET>/<S3_DIRECTORY> \
  --option s3a.accessKeyId=<S3_ACCESS_KEY_ID> --option s3a.secretKey=<S3_SECRET_KEY>
```

Note that if you want to mount the root of the S3 bucket, add a trailing slash after the bucket name (e.g. `s3://S3_BUCKET/`).
{% endtab %}
{% endtabs %}

For other methods of setting AWS credentials, see the credentials section in [Advanced Setup](#advanced-credentials-setup).

### Mounting Multiple S3 Buckets

You can mount as many S3 buckets as needed. Each mount creates a separate virtual path in the Alluxio namespace and can use different credentials and regions.

{% tabs %}
{% tab title="Kubernetes (Operator)" %}
Create one `UnderFileSystem` CR per bucket:

```yaml
# bucket-a.yaml
apiVersion: k8s-operator.alluxio.com/v1
kind: UnderFileSystem
metadata:
  name: s3-bucket-a
  namespace: alx-ns
spec:
  alluxioCluster: alluxio-cluster
  path: s3://bucket-a-prod/
  mountPath: /bucket-a
  mountOptions:
    s3a.accessKeyId: "<KEY_A>"
    s3a.secretKey: "<SECRET_A>"
    alluxio.underfs.s3.region: "us-east-1"
```

```yaml
# bucket-b.yaml
apiVersion: k8s-operator.alluxio.com/v1
kind: UnderFileSystem
metadata:
  name: s3-bucket-b
  namespace: alx-ns
spec:
  alluxioCluster: alluxio-cluster
  path: s3://bucket-b-analytics/
  mountPath: /bucket-b
  mountOptions:
    s3a.accessKeyId: "<KEY_B>"
    s3a.secretKey: "<SECRET_B>"
    alluxio.underfs.s3.region: "us-west-2"
```

```shell
kubectl apply -f bucket-a.yaml -f bucket-b.yaml
```

{% endtab %}

{% tab title="Docker / Bare-Metal" %}
Run `alluxio mount add` once per bucket:

```shell
# Mount first bucket
bin/alluxio mount add \
  --path /bucket-a \
  --ufs-uri s3://bucket-a-prod/ \
  --option s3a.accessKeyId=<KEY_A> \
  --option s3a.secretKey=<SECRET_A> \
  --option alluxio.underfs.s3.region=us-east-1

# Mount second bucket (different region)
bin/alluxio mount add \
  --path /bucket-b \
  --ufs-uri s3://bucket-b-analytics/ \
  --option s3a.accessKeyId=<KEY_B> \
  --option s3a.secretKey=<SECRET_B> \
  --option alluxio.underfs.s3.region=us-west-2
```

{% endtab %}
{% endtabs %}

Each mount path must be unique. There is no hard limit on the number of mounts. Per-mount credentials take precedence over any global credentials configured in `AlluxioCluster` `spec.properties`.
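
The precedence between global and per-mount configuration can be sketched as a simple dictionary merge (a minimal illustration of the behavior described above, not Alluxio's actual implementation; the property keys are real, the function is hypothetical):

```python
# Sketch: per-mount options shadow cluster-wide properties.
# Illustrative only; Alluxio resolves this internally.

def effective_options(global_properties: dict, mount_options: dict) -> dict:
    """Per-mount options take precedence over global cluster properties."""
    merged = dict(global_properties)
    merged.update(mount_options)
    return merged

global_props = {
    "s3a.accessKeyId": "GLOBAL_KEY",
    "alluxio.underfs.s3.region": "us-east-1",
}
bucket_b_options = {
    "s3a.accessKeyId": "KEY_B",
    "alluxio.underfs.s3.region": "us-west-2",
}

# A mount with no options of its own inherits the globals
print(effective_options(global_props, {}))
# A mount with its own options overrides both credentials and region
print(effective_options(global_props, bucket_b_options))
```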

## Advanced Setup

Note that configuration options can be specified as mount options or as configuration properties in `conf/alluxio-site.properties`. The following sections will describe how to set configurations as properties, but they can also be set as mount options via `--option <key>=<value>`.

### Configure AWS SDK v1

Configure the AWS SDK version used when accessing S3 buckets. The default version is v2. To set the version to v1, add the following configuration in `conf/alluxio-site.properties`:

```properties
alluxio.underfs.s3.sdk.version=1
```

Note that AWS SDK v2 offers better memory management and higher throughput.

### Configure S3 Region

Configuring the S3 region when accessing S3 buckets improves performance; otherwise, global S3 bucket access is enabled, which introduces extra requests. The S3 region can be set with the property `alluxio.underfs.s3.region`.

```properties
alluxio.underfs.s3.region=us-west-1
```

Note that if the [S3 endpoint](#specify-an-endpoint) is set, this property is ignored in favor of the endpoint specific region property.

### Advanced Credentials Setup

If you are using AWS S3 as your UFS, you need to configure credentials for it. There are several ways to provide credentials:

* **Direct Keys**: Specify `s3a.accessKeyId` and `s3a.secretKey`. Alluxio will use this key pair to access the S3 UFS directly.
* **AssumeRole with Keys**: If you specify `aws.accessKeyId` and `aws.secretKey` and also enable the AssumeRole feature, Alluxio will use this key pair to obtain temporary credentials through AssumeRole to access the S3 UFS. See [Using AWS AssumeRole](#using-aws-assumerole) for more details.
* **Credential Provider Class**: If you specify `alluxio.underfs.s3.credential.provider.class`, Alluxio will obtain credentials based on the provider you specify. This is useful for more complex authentication scenarios. The supported providers are:
  * `WEBIDENTITY_TOKEN`: Uses a Web Identity Token File to get credentials. This is commonly used in containerized environments like Kubernetes where a service account can assume an IAM role.
  * `PROFILE`: Uses a named profile from your `~/.aws/credentials` file. This method is ideal for development environments or situations where you need to manage multiple IAM user roles and switch between them easily.
  * `INSTANCE_PROFILE`: Uses an IAM Instance Profile to obtain credentials. This is typically used on AWS compute services like EC2 instances or in EKS environments, allowing applications to securely access AWS services without managing explicit credentials within the application.

You can specify credentials in different ways, from highest to lowest priority:

1. `s3a.accessKeyId` and `s3a.secretKey` specified as mount options
2. `s3a.accessKeyId` and `s3a.secretKey` specified as Java system properties
3. `s3a.accessKeyId` and `s3a.secretKey` specified in `alluxio-site.properties`
4. `alluxio.underfs.s3.credential.provider.class` specified in the configuration
5. Environment variables `AWS_ACCESS_KEY_ID` (or `AWS_ACCESS_KEY`) and `AWS_SECRET_ACCESS_KEY` (or `AWS_SECRET_KEY`) on Alluxio servers
6. Profile file containing credentials at `~/.aws/credentials`
7. AWS Instance profile credentials, if you are using an EC2 instance
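
The lookup order above can be sketched as a priority chain (a toy model for illustration; the function and source names are hypothetical, not Alluxio internals):

```python
# Sketch of the credential resolution order described above.
# Models only the precedence; this is not Alluxio's actual resolver.

def resolve_credentials(sources: dict):
    """Return the first credential source present, by descending priority."""
    priority = [
        "mount_options",           # 1. s3a.* keys given as mount options
        "java_system_properties",  # 2. s3a.* keys as Java system properties
        "site_properties",         # 3. s3a.* keys in alluxio-site.properties
        "credential_provider",     # 4. alluxio.underfs.s3.credential.provider.class
        "environment_variables",   # 5. AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY
        "profile_file",            # 6. ~/.aws/credentials
        "instance_profile",        # 7. EC2 instance profile
    ]
    for source in priority:
        if sources.get(source):
            return source, sources[source]
    raise RuntimeError("no S3 credentials found")

# Site properties win over environment variables because they rank higher
source, creds = resolve_credentials({
    "environment_variables": ("ENV_KEY", "ENV_SECRET"),
    "site_properties": ("SITE_KEY", "SITE_SECRET"),
})
print(source)
```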

#### Setting Credentials in Kubernetes

Options A and B below are general Kubernetes patterns that apply to all UFS types — see [Kubernetes: Credential Management](https://documentation.alluxio.io/ee-ai-en/ufs/..#kubernetes-credential-management) in the UFS overview for the generic form. Option C is specific to AWS/EKS.

When running Alluxio on Kubernetes via the Operator, there are three additional ways to set credentials that avoid specifying them on every mount command.

**Option A: `spec.properties` in the AlluxioCluster CR (recommended for simplicity)**

Set credentials as global properties in the `AlluxioCluster` CR. All S3 mounts that do not specify their own credentials will inherit these:

```yaml
apiVersion: k8s-operator.alluxio.com/v1
kind: AlluxioCluster
spec:
  properties:
    s3a.accessKeyId: "<YOUR_ACCESS_KEY>"
    s3a.secretKey: "<YOUR_SECRET_KEY>"
    alluxio.underfs.s3.region: "us-east-1"
```

With this in place, `UnderFileSystem` CRs no longer need `mountOptions` for credentials. Per-mount `mountOptions` still take precedence when specified, which is useful for cross-account buckets.

**Option B: Environment variables on coordinator and worker pods**

Store credentials in a Kubernetes Secret to avoid committing them to version control:

```shell
kubectl create secret generic s3-credentials \
  --from-literal=access-key-id=<YOUR_ACCESS_KEY> \
  --from-literal=secret-access-key=<YOUR_SECRET_KEY> \
  -n alx-ns
```

Then reference the Secret in the `AlluxioCluster` CR:

```yaml
apiVersion: k8s-operator.alluxio.com/v1
kind: AlluxioCluster
spec:
  coordinator:
    env:
      - name: AWS_ACCESS_KEY_ID
        valueFrom:
          secretKeyRef:
            name: s3-credentials
            key: access-key-id
      - name: AWS_SECRET_ACCESS_KEY
        valueFrom:
          secretKeyRef:
            name: s3-credentials
            key: secret-access-key
  worker:
    env:
      - name: AWS_ACCESS_KEY_ID
        valueFrom:
          secretKeyRef:
            name: s3-credentials
            key: access-key-id
      - name: AWS_SECRET_ACCESS_KEY
        valueFrom:
          secretKeyRef:
            name: s3-credentials
            key: secret-access-key
```

**Option C: IAM Node Role (recommended for EKS)**

Attach an IAM policy with S3 access to the EKS node group role. No credentials are needed in any configuration when this is set up correctly. See [AWS documentation](https://docs.aws.amazon.com/eks/latest/userguide/create-node-role.html) for details.

When using an AWS Instance Profile as the credentials provider:

* Create an [IAM Role](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/iam-roles-for-amazon-ec2.html) with access to the mounted bucket
* Create an [Instance profile](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/iam-roles-for-amazon-ec2.html#ec2-instance-profile) as a container for the defined IAM Role
* Launch an EC2 instance using the created profile

Note that the IAM role needs access to both the files in the bucket and the bucket itself in order to determine the bucket's owner. Automatically assigning an owner to the bucket can be avoided by setting the property `alluxio.underfs.s3.inherit.acl=false`.

See [Amazon's documentation](http://docs.aws.amazon.com/java-sdk/latest/developer-guide/credentials.html#id6) for more details.

### HTTPS/HTTP access

By default, Alluxio uses the HTTPS protocol to communicate securely with S3. If you need to disable SSL certificate validation and SSL hostname verification, configure the following property:

```properties
alluxio.underfs.s3.secure.http.trust.all.certs=true
```

If you want to use the HTTP protocol to communicate with S3, configure the following property:

```properties
alluxio.underfs.s3.secure.http.enabled=false
```

### Enabling Server Side Encryption

You may encrypt your data stored in S3. The encryption only applies to data at rest in S3; data is decrypted by S3 before being transferred when read by clients. Note that enabling this will also enable HTTPS to comply with requirements for reading/writing objects.

Enable this feature by configuring `conf/alluxio-site.properties`:

```properties
alluxio.underfs.s3.server.side.encryption.enabled=true
```

### DNS-Buckets

By default, a request directed at the bucket named "mybucket" is sent to the host name "mybucket.s3.amazonaws.com" (virtual-hosted style). You can switch to path-style data access, for example "<http://s3.amazonaws.com/mybucket>", by setting the following configuration:

```properties
alluxio.underfs.s3.disable.dns.buckets=true
```
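
The difference between the two addressing styles can be sketched as follows (a simplified illustration; the real SDK also handles regions, URL encoding, and the HTTP/HTTPS scheme):

```python
# Sketch: S3 virtual-hosted style vs. path-style addressing.
# The URL construction here is illustrative, not the SDK's exact behavior.

def object_url(bucket: str, key: str, path_style: bool) -> str:
    if path_style:
        # used when alluxio.underfs.s3.disable.dns.buckets=true
        return f"http://s3.amazonaws.com/{bucket}/{key}"
    # default virtual-hosted style: bucket name becomes part of the host name
    return f"http://{bucket}.s3.amazonaws.com/{key}"

print(object_url("mybucket", "data/file.txt", path_style=False))
print(object_url("mybucket", "data/file.txt", path_style=True))
```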

### Accessing S3 through a proxy

To communicate with S3 through a proxy, modify `conf/alluxio-site.properties` to include:

```properties
alluxio.underfs.s3.proxy.host=<PROXY_HOST>
alluxio.underfs.s3.proxy.port=<PROXY_PORT>
```

`<PROXY_HOST>` and `<PROXY_PORT>` should be replaced by the host and port of your proxy. The proxy is accessed using HTTPS protocol by default; if the proxy only supports HTTP, be sure to set `alluxio.underfs.s3.secure.http.enabled=false`.

### Specify an endpoint

If you want to access a specific endpoint such as AWS VPC endpoint, modify `conf/alluxio-site.properties` to include:

```properties
alluxio.underfs.s3.endpoint=<S3_ENDPOINT>
alluxio.underfs.s3.endpoint.region=<S3_ENDPOINT_REGION>
```

Both the endpoint and region value need to be set. Note that when an endpoint is set, `alluxio.underfs.s3.region=<S3_REGION>` will no longer take effect.

In the case of a non-Amazon service provider, set the hostname and port of the S3 service as the `<S3_ENDPOINT>`.

### Using v2 S3 Signatures

Some S3 service providers only support v2 signatures. For these S3 providers, you can enforce using the v2 signatures by setting the `alluxio.underfs.s3.signer.algorithm` to `S3SignerType`.

### \[Experimental] S3 streaming upload

Because S3 is an object store, by default the whole file is sent from the client to the worker, stored in the local disk temporary directory, and uploaded to S3 in the `close()` method.

To enable S3 streaming upload, you need to modify `conf/alluxio-site.properties` to include:

```properties
alluxio.underfs.s3.streaming.upload.enabled=true
```

The default upload process is safer but has the following issues:

* Slow upload time. The file must be sent to the Alluxio worker first, and only then does the worker upload it to S3; the two steps are sequential.
* The temporary directory must have the capacity to store the whole file.
* Slow `close()`. The execution time of the `close()` method is proportional to the file size and inversely proportional to the bandwidth, i.e. O(FILE\_SIZE/BANDWIDTH).

The S3 streaming upload feature addresses the above issues and is based on the [S3 low-level multipart upload](https://docs.aws.amazon.com/AmazonS3/latest/dev/mpListPartsJavaAPI.html).

The S3 streaming upload has the following advantages:

* Shorter upload time. Alluxio worker uploads buffered data while receiving new data. The total upload time will be at least as fast as the default method.
* Smaller capacity requirement. Our data is buffered and uploaded according to partitions (`alluxio.underfs.s3.streaming.upload.partition.size` which is 64MB by default). When a partition is successfully uploaded, this partition will be deleted.
* Faster `close()`. We begin uploading data when data buffered reaches the partition size instead of uploading the whole file in `close()`.
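
A back-of-the-envelope model of why `close()` gets faster with streaming upload (the numbers are illustrative assumptions, not measurements):

```python
# Sketch: default upload finishes the whole transfer inside close(), while
# streaming upload overlaps buffering and uploading, so close() waits only
# for roughly the final partition. Illustrative arithmetic only.

FILE_SIZE_MB = 1024
BANDWIDTH_MB_S = 100   # assumed transfer rate for both hops
PARTITION_MB = 64      # default alluxio.underfs.s3.streaming.upload.partition.size

# Default: the worker uploads the whole file to S3 inside close()
default_close_time = FILE_SIZE_MB / BANDWIDTH_MB_S

# Streaming: earlier partitions were uploaded while data arrived,
# so close() only flushes about one partition
streaming_close_time = PARTITION_MB / BANDWIDTH_MB_S

print(f"default close():   ~{default_close_time:.2f}s")
print(f"streaming close(): ~{streaming_close_time:.2f}s")
```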

If an S3 streaming upload is interrupted, intermediate partitions may remain in S3, and S3 will charge for the stored data. To reduce the charges, users can modify `conf/alluxio-site.properties` to include:

```properties
alluxio.underfs.cleanup.enabled=true
```

Intermediate multipart uploads in all non-read-only S3 mount points that are older than the clean age (configured by `alluxio.underfs.s3.intermediate.upload.clean.age`) are cleaned at every cleanup interval (configured by `alluxio.underfs.cleanup.interval`).
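
The cleanup policy can be sketched as a simple age filter (an illustration of the rule, not Alluxio's implementation):

```python
# Sketch: at each cleanup interval, intermediate multipart uploads older
# than the clean age are selected for removal. Illustrative only.

def uploads_to_clean(upload_ages_sec, clean_age_sec):
    """Return upload ages exceeding the configured clean age."""
    return [age for age in upload_ages_sec if age > clean_age_sec]

# With a one-hour clean age, only the two older uploads are removed
print(uploads_to_clean([120, 4000, 7200], clean_age_sec=3600))
```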

### S3 multipart upload

Multipart upload splits one file into multiple parts, each uploaded in its own thread. It does not generate any temporary files while uploading. *It consumes more memory but is faster than the streaming upload mode*.

You can specify additional parameters in `conf/alluxio-site.properties` to tune the upload process.

```properties
# Timeout for uploading part when using multipart upload.
alluxio.underfs.object.store.multipart.upload.timeout
```

```properties
# Multipart upload partition size for S3. The default partition size is `16MB`
alluxio.underfs.s3.multipart.upload.partition.size
```
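
The partition size determines how many parts (and therefore upload threads) a file is split into, which can be sketched as follows (illustrative arithmetic only; the function name is hypothetical):

```python
# Sketch: number of multipart upload parts for a given file and
# alluxio.underfs.s3.multipart.upload.partition.size.
import math

def part_count(file_size_bytes: int, partition_size_bytes: int) -> int:
    """A file always produces at least one part."""
    return max(1, math.ceil(file_size_bytes / partition_size_bytes))

MB = 1024 * 1024
# With the default 16MB partition size, a 100MB file becomes 7 parts
print(part_count(100 * MB, 16 * MB))
```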

To disable S3 multipart upload so that each file is uploaded completely from start to end in one go, modify `conf/alluxio-site.properties` to include:

```properties
alluxio.underfs.s3.multipart.upload.enabled=false
```

### Setting Request Retry Policy

For retry configuration that applies to all object store UFS types, see [Request Retry Policy](https://documentation.alluxio.io/ee-ai-en/ufs/..#request-retry-policy) in the UFS overview.

### Setting Larger Timeout

If the S3 connection is slow, a larger timeout is useful:

```properties
alluxio.underfs.s3.socket.timeout=500sec
alluxio.underfs.s3.request.timeout=5min
```

### Tuning for High Concurrency

When a large number of clients access S3 through each Alluxio server, it is important to increase the S3 connection pool size to avoid performance issues. If the pool is too small, requests compete for the available connections and may fail with errors such as `"Unable to execute HTTP request: Timeout waiting for connection from pool"`. Increase the pool size by setting:

```properties
alluxio.underfs.s3.connections.max=2048
```

### Optimizing Metadata Listing with Breadcrumbs

Unlike traditional file systems, many object stores do not have a native concept of directories. Common filesystem operations (such as checking if a directory exists, retrieving directory metadata, or traversing directories) must be translated into prefix-based object listing operations.

When a directory contains a large number of objects and lacks directory placeholder objects, these operations can become significantly more expensive. Alluxio mitigates this overhead through "breadcrumb" objects: zero-sized placeholder objects with the directory's name as a prefix, which represent directories explicitly and improve the performance of directory-related metadata operations. By default, Alluxio proactively creates breadcrumb objects in the UFS. To disable this feature, set the following configuration:

```properties
alluxio.underfs.object.store.breadcrumbs.enabled=false
```

Breadcrumbs have the following advantages:

* Improved Performance: Once breadcrumbs are created, directory metadata can be accessed efficiently, avoiding repeated full-listing scans.
* Caching Synergy: Works well with metadata caching and preloading strategies to optimize cold-start behavior.

Breadcrumbs also come with the following considerations:

* Write Permissions Required: Enabling this feature requires write access to the object store.
* Potential Data Impact: Users concerned with data purity or external modification of the object store may prefer to disable it.
* Cold Load Overhead: Initial directory loads may incur a one-time write cost when breadcrumb objects are created.

Recommendation:

* Enable if the UFS allows writes and you are optimizing for performance (especially with object stores that exhibit poor listing performance).
* Disable if the UFS is read-only or modifying the UFS is not acceptable for compliance or auditing reasons.

## Identity and Access Control of S3 Objects

[S3 identity and access management](https://docs.aws.amazon.com/AmazonS3/latest/dev/s3-access-control.html) is very different from the traditional POSIX permission model. For instance, S3 ACL does not support groups or directory-level settings. Alluxio makes the best effort to inherit permission information including file owner, group and permission mode from S3 ACL information.

### Why is 403 Access Denied Error Returned

The S3 credentials set in the Alluxio configuration correspond to an AWS user. If this user does not have the required permissions to access an S3 bucket or object, a 403 Access Denied error is returned.

If you see a 403 error in the Alluxio server log when accessing an S3 service, double-check that:

1. You are using the correct AWS credentials. See [credentials setup](#advanced-credentials-setup).
2. Your AWS user has permissions to access the buckets and objects mounted to Alluxio.

See the [AWS troubleshooting guidance](https://aws.amazon.com/premiumsupport/knowledge-center/s3-troubleshoot-403/) for more on 403 errors.

### File Owner and Group

Alluxio file system sets the file owner based on the AWS account configured in Alluxio to connect to S3. Since there is no group in S3 ACL, the owner is reused as the group.

By default, Alluxio extracts the display name of this AWS account as the file owner. In case this display name is not available, this AWS user's [canonical user ID](https://docs.aws.amazon.com/general/latest/gr/acct-identifiers.html) will be used. This canonical user ID is typically a long string (like `79a59df900b949e55d96a1e698fbacedfd6e09d98eacf8f8d5218e7cd47ef2be`), thus often inconvenient to read and use in practice. Optionally, the property `alluxio.underfs.s3.owner.id.to.username.mapping` can be used to specify a preset mapping from canonical user IDs to Alluxio usernames, in the format "id1=user1;id2=user2". For example, edit `alluxio-site.properties` to include

```properties
alluxio.underfs.s3.owner.id.to.username.mapping=\
79a59df900b949e55d96a1e698fbacedfd6e09d98eacf8f8d5218e7cd47ef2be=john
```

This configuration helps Alluxio recognize all objects owned by this AWS account as owned by the user `john` in Alluxio namespace. To find out the AWS S3 canonical ID of your account, check the console `https://console.aws.amazon.com/iam/home?#/security_credentials`, expand the "Account Identifiers" tab and refer to "Canonical User ID".
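
The `"id1=user1;id2=user2"` mapping format can be illustrated with a small parser (for illustration only; Alluxio parses this property internally):

```python
# Sketch: parsing the alluxio.underfs.s3.owner.id.to.username.mapping
# value, a semicolon-separated list of canonicalId=username pairs.

def parse_owner_mapping(mapping: str) -> dict:
    result = {}
    for pair in mapping.split(";"):
        pair = pair.strip()
        if pair:
            canonical_id, username = pair.split("=", 1)
            result[canonical_id.strip()] = username.strip()
    return result

mapping = "79a59df900b949e55d96a1e698fbacedfd6e09d98eacf8f8d5218e7cd47ef2be=john"
print(parse_owner_mapping(mapping))
```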

### Changing Permissions

`chown`, `chgrp`, and `chmod` of Alluxio directories and files do **NOT** propagate to the underlying S3 buckets nor objects.

### Using AWS AssumeRole

Alluxio supports authentication via the [AWS AssumeRole API](https://docs.aws.amazon.com/STS/latest/APIReference/API_AssumeRole.html) to connect to AWS S3. When AssumeRole is enabled, the AWS access key and secret key will only be used to obtain temporary security credentials. All subsequent accesses will utilize these temporary credentials, which are generated through AssumeRole.

Additionally, AssumeRole access can also be achieved by configuring `alluxio.underfs.s3.credential.provider.class` with values like `WEBIDENTITY_TOKEN` or `INSTANCE_PROFILE`, as detailed in the [Advanced Credentials Setup](#advanced-credentials-setup) section.

To enable AssumeRole in Alluxio, the following properties are required on workers and coordinators:

```properties
alluxio.underfs.s3.assumerole.enabled=true
alluxio.underfs.s3.assumerole.rolearn=arn:aws:iam::123456:role/example-role
```

Note: Ensure the specified role exists, and the user associated with the provided access key and secret key has permission to assume the role defined by the target role ARN.

In addition to the mandatory properties, you can also configure the following optional settings for greater control over session behavior and network configurations:

```properties
# Specifies a name for the session.
# Temporary credentials will be associated with this session.
# A random string is suffixed to ensure uniqueness.
alluxio.underfs.s3.assumerole.session.prefix="alluxio-assume-role"

# Specifies the session duration in seconds. Typically, this is between 900 and 3600 seconds.
# The session will be automatically refreshed by the AWS client,
# so no manual intervention is needed to refresh the temporary credentials.
alluxio.underfs.s3.assumerole.session.duration.second=900

# Enables the HTTPS protocol for AssumeRole requests. The default value is true.
alluxio.underfs.s3.assumerole.https.enabled=true

# Enables the HTTPS protocol for the AssumeRole proxy. The default value is false.
alluxio.underfs.s3.assumerole.proxy.https.enabled=false

# Specifies the proxy host for AssumeRole requests. Both proxy host and proxy port must be set
# in the Alluxio configuration; otherwise, the proxy settings will default to your system's
# environment configuration.
alluxio.underfs.s3.assumerole.proxy.host=<HOSTNAME>

# Specifies the proxy port for AssumeRole requests. Both proxy host and proxy port must be set
# in the Alluxio configuration; otherwise, the proxy settings will default to your system's
# environment configuration.
alluxio.underfs.s3.assumerole.proxy.port=<PORT_NUMBER>
```

Note: If the proxy host and port are not set in the Alluxio configuration, the JVM/System environment variables `HTTP(S)_PROXY`, `http(s)_proxy`, `http(s).proxyHost`, and `http(s).proxyPort` will automatically be picked up by the AWS SDK.

Below is a sample configuration for setting up AssumeRole in Alluxio:

```properties
aws.accessKeyId=FOO
aws.secretKey=BAR
alluxio.underfs.s3.assumerole.enabled=true
alluxio.underfs.s3.assumerole.session.duration.second=1000
alluxio.underfs.s3.assumerole.session.prefix="alluxio"
alluxio.underfs.s3.assumerole.rolearn=arn:aws:iam::123456:role/example-role
```

Summary:

* **Temporary Credentials**: AWS access keys are only used to request temporary credentials; all future operations rely on those credentials.
* **Automatic Session Refresh**: Sessions are automatically refreshed by the AWS SDK, requiring no manual intervention.
* **Customizable Configuration**: You can modify session duration, proxy settings, and session prefixes to suit your security and environment needs.

By setting up these properties, Alluxio can effectively authenticate and manage access to AWS S3 using the temporary credentials obtained via AssumeRole.

## Troubleshooting

### Enabling AWS-SDK Debug Level

If issues are encountered when running against your S3 backend using AWS SDK V2, enable additional logging to track HTTP traffic. Set `ALLUXIO_WORKER_JAVA_OPTS` in `conf/alluxio-env.sh`:

```shell
ALLUXIO_WORKER_JAVA_OPTS+=" -Daws.crt.log.level=Info -Daws.crt.log.destination=File -Daws.crt.log.filename=/opt/alluxio/logs/worker.log"
```

See [Amazon's documentation](https://docs.aws.amazon.com/sdk-for-java/latest/developer-guide/logging-slf4j.html) for more details.

If you are using AWS SDK V1, please modify `conf/log4j.properties` to add the following properties:

```properties
log4j.logger.com.amazonaws=WARN
log4j.logger.com.amazonaws.request=DEBUG
log4j.logger.org.apache.http.wire=DEBUG
```

See [Amazon's documentation](https://docs.aws.amazon.com/sdk-for-java/v1/developer-guide/java-dg-logging.html) for more details.

### Prevent Creating Zero-byte Files

Alluxio may create zero-byte files in S3 as a performance optimization when listing the contents of the underlying storage. If a bucket is mounted with read-only access, zero-byte file creation via S3 PUT operations is disallowed. To disable this optimization, set the following configuration.

```properties
alluxio.underfs.object.store.breadcrumbs.enabled=false
```
