Release Notes

AI-3.5-10.0.0

We are excited to announce the release of Alluxio AI 3.5! This version introduces new features in several important areas: write performance (experimental), the Python SDK (experimental), the S3 API, and, last but not least, security. Below are the key highlights.

New Features

[Experimental] New Python SDK and FSSpec integration

The Alluxio Python Filesystem API, based on FSSpec, provides easy integration with Ray, PyTorch, and PyArrow.

[Experimental] CACHE_ONLY write mode to improve writing performance

Previously, Alluxio’s data writing performance was limited by the underlying file system (UFS).

To address this limitation, Alluxio offers a new CACHE_ONLY write mode that temporarily stores data only in the Alluxio cache, removing the dependency on the UFS for writes. As a result, CACHE_ONLY write mode can provide higher and more scalable write performance. This feature benefits workloads that require high write performance, such as checkpoint saving during AI model training.

Note that because data is not written to the UFS, its durability during a system outage is not guaranteed, so CACHE_ONLY mode should not be used as persistent storage. We recommend using CACHE_ONLY only for workloads that require higher write throughput and can tolerate potential data loss, such as creating checkpoint files during model training or writing temporary data during Spark shuffle.

Please refer to Writing Temporary Files for more details.

[Experimental] Listing cache to improve directory listing performance

The Index Service is introduced as a caching service for directory listing, designed to provide high performance and scalability for large directories containing hundreds of millions of files and subdirectories. The Index Service can list large directories 3x to 5x faster than listing directly on S3.

UFS traffic limiter

The UFS Read Rate Limiter feature allows users to configure a maximum bandwidth for UFS reads performed by a single worker. While brief spikes in bandwidth may occur, the average usage is kept within the specified limit, which controls data flow, optimizes resource utilization, and maintains system stability. This feature is particularly useful for managing workloads by capping data processing rates.

Refer to UFS Bandwidth Limiting for more details. Note that this feature is only supported for the following UFS types: HDFS, S3, OSS, COS, and GCS.
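The behavior described above (short bursts allowed, long-run average capped) is characteristic of a token-bucket limiter. The following is a minimal, generic Python sketch of that technique, included only to illustrate the idea. It is not Alluxio's implementation; the class name, method name, and parameters are hypothetical.

```python
import time


class TokenBucket:
    """Generic token-bucket limiter (illustrative only; not Alluxio's code)."""

    def __init__(self, rate_bytes_per_sec, burst_bytes, clock=time.monotonic):
        self.rate = float(rate_bytes_per_sec)   # long-run average limit
        self.capacity = float(burst_bytes)      # largest burst allowed at once
        self.tokens = float(burst_bytes)        # start full, so an initial spike passes
        self.clock = clock
        self.last = clock()

    def _refill(self):
        # Add tokens for the time elapsed since the last call, up to capacity.
        now = self.clock()
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now

    def delay_for(self, nbytes):
        """Reserve nbytes and return the seconds the caller should wait first."""
        self._refill()
        self.tokens -= nbytes                   # may go negative: a debt repaid by waiting
        if self.tokens >= 0:
            return 0.0
        return -self.tokens / self.rate
```

A reader loop would call `time.sleep(bucket.delay_for(len(chunk)))` before each read, so reads within the burst size proceed immediately while sustained throughput converges to the configured rate.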

Heterogeneous worker specification and configuration

Alluxio Operator now supports configuring heterogeneous worker drives with different sizes. This provides more flexibility when deploying Alluxio in heterogeneous environments. See the deployment specification on the installation page, as well as the configuration option described on the worker management page.

S3 API enhancements

  • Support HTTP persistent connection (Keep-Alive)

    • An HTTP persistent connection (also called HTTP keep-alive) uses a single TCP connection to send and receive multiple HTTP requests and responses, as opposed to opening a new connection for every request/response pair. With HTTP persistent connections supported in the Alluxio S3 API, the IO latency of a 4 KB S3 ReadObject decreases by about 40%.

  • Support TLS

  • Support multipart upload (MPU)
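To make the keep-alive behavior mentioned above concrete, here is a small, generic Python sketch that sends two requests over one TCP connection and records the client's local port each time. It runs against a throwaway local HTTP server rather than Alluxio; the handler, the `/bucket/object` path, and the function name are hypothetical stand-ins, and a real client would target the Alluxio S3 endpoint instead.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class Handler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"  # lets the server keep the connection open

    def do_GET(self):
        body = b"x" * 4096  # stand-in for a 4 KB object
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # silence per-request logging


def fetch_twice_on_one_connection():
    # Start a local server on an ephemeral port (assumption: loopback is available).
    server = ThreadingHTTPServer(("127.0.0.1", 0), Handler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        conn = http.client.HTTPConnection("127.0.0.1", server.server_address[1])
        ports, sizes = [], []
        for _ in range(2):
            conn.request("GET", "/bucket/object")
            resp = conn.getresponse()
            sizes.append(len(resp.read()))  # drain the body before reusing the socket
            ports.append(conn.sock.getsockname()[1])  # local port of the TCP socket
        conn.close()
        return ports, sizes
    finally:
        server.shutdown()
        server.server_close()
```

If both recorded ports are equal, the second request reused the existing TCP connection and skipped a new handshake, which is where the latency saving for small objects comes from.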

Security enhancements

  • Support TLS to secure cluster traffic

  • Address latest Common Vulnerabilities and Exposures (CVEs)

    • Several critical CVEs are resolved by removing or upgrading their corresponding packages, such as:

      • Log4j: Explicitly excluded any log4j 1.x versions that were transitively picked up through various dependencies

      • Zookeeper: Removed and explicitly excluded from all dependencies, Hadoop-related ones in particular

      • Jackson-databind: Upgraded to 2.24.1

Conclusion

Alluxio AI 3.5 brings powerful new features to enhance APIs, write performance, and security. We encourage users to explore these new enhancements to improve their data workflows. For a complete list of updates and improvements, please refer to our official documentation.

Thank you for your continued support!
