FAQ

Last updated 6 months ago

What is Alluxio?

Alluxio, formerly Tachyon, is an open source virtual distributed storage system that operates at memory speed. It enables any application to interact with any data from any storage system at memory speed. Read more about Alluxio here.

What platforms and Java versions can Alluxio run on?

Alluxio runs on various distributions of Linux and macOS, and requires JDK 1.8 or JDK 11.

What license is Alluxio under?

Alluxio is open sourced under the Apache 2.0 license.

Why is my analytics job not running faster after deploying Alluxio?

Some possible reasons to consider:

  1. The job is compute-bound and does not spend significant time reading or writing data. Because the bottleneck is not I/O performance, the benefit from faster Alluxio I/O is small.

  2. The persistent storage is co-located with compute (e.g. Alluxio is connected to a local HDFS) and the input data of the job is in the OS buffer cache.

  3. Due to misconfiguration, clients cannot identify their corresponding local Alluxio worker. They then read from remote Alluxio workers over the network, which results in low data locality.

  4. Input data has not been loaded into Alluxio yet or has already been evicted, so the job reads from the under storage instead of the Alluxio cache.
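
As a rough illustration of reasons 1 and 4 above, the following Python sketch estimates how much an end-to-end job can speed up when only its I/O portion gets faster, and what the average read time looks like when only part of the input is served from cache. It is a back-of-the-envelope model, not part of Alluxio: the function names and all input numbers are hypothetical, and you would substitute measurements from your own job.

```python
def estimated_speedup(io_fraction: float, io_speedup: float) -> float:
    """Amdahl-style estimate of end-to-end job speedup.

    io_fraction: share of the job's runtime spent on I/O before caching (0..1).
    io_speedup:  how many times faster the I/O portion becomes (> 0).
    Both are hypothetical inputs that you would measure yourself.
    """
    if not 0.0 <= io_fraction <= 1.0:
        raise ValueError("io_fraction must be between 0 and 1")
    if io_speedup <= 0:
        raise ValueError("io_speedup must be positive")
    return 1.0 / ((1.0 - io_fraction) + io_fraction / io_speedup)


def effective_read_seconds(hit_ratio: float,
                           cache_seconds: float,
                           under_storage_seconds: float) -> float:
    """Average read time when only hit_ratio of reads come from the cache."""
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must be between 0 and 1")
    return hit_ratio * cache_seconds + (1.0 - hit_ratio) * under_storage_seconds


if __name__ == "__main__":
    # Compute-bound job: only 10% of the runtime is I/O, and I/O gets 10x faster.
    print(f"{estimated_speedup(0.1, 10.0):.2f}")  # prints 1.10
    # I/O-bound job: 80% of the runtime is I/O, and I/O gets 10x faster.
    print(f"{estimated_speedup(0.8, 10.0):.2f}")  # prints 3.57
```

With these made-up numbers, the compute-bound job gains about 10% overall even though its I/O became ten times faster, while the I/O-bound job runs about 3.6 times faster. The second function shows why cold or evicted data matters: with a hit ratio near zero, the average read time is close to the under storage read time.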

Should I deploy Alluxio as a stand-alone system or through an orchestration framework?

It is recommended to deploy Alluxio as a stand-alone system. Orchestration frameworks supported include:

  • Kubernetes

Which programming language does Alluxio support?

Alluxio is primarily developed in Java and exposes Java-like File APIs for other applications to interact with. Alluxio also supports bindings for other languages (currently experimental), including Python and Golang.

Alluxio can be run as a FUSE mount exposing a POSIX API. This enables any program that normally accesses a local file system to access data in Alluxio without modification. It is a common way for applications written in non-Java languages, or against non-Hadoop APIs, to access Alluxio data without being rewritten.

What happens if my data set does not fit in memory?

The input data set does not need to fit in Alluxio storage space for applications to work. Alluxio transparently loads data on demand from the under storage. To fit more data in Alluxio's storage space, configure Alluxio to use other storage resources such as SSD and HDD in addition to memory. Read more about Alluxio storage setup here.

Does Alluxio support a high availability mode?

Yes. See the instructions in Deploy Alluxio on a Cluster with HA.

Will Alluxio rebalance cached blocks to the newly added nodes in order to balance memory space utilization?

No, rebalancing of data blocks in Alluxio is not currently supported.

Does Alluxio require HDFS?

No, Alluxio can run on many under storage systems such as Amazon S3 or Swift in addition to HDFS.

How can I learn more about Alluxio?

Read the Alluxio book to learn Alluxio comprehensively.

Join the Alluxio community Slack Channel to chat with users and developers.

Read the recent blogs and presentations.

Join the meetup group for Alluxio at http://www.meetup.com/Alluxio/. Other Alluxio events can be found here.

Where can I report issues or propose new features?

GitHub Issues is used to track feature development and issues. To report an issue or propose a feature, open an issue there.

Where can I get more help?

For any questions related to installation, contribution or feedback, please join our Alluxio community Slack Channel or send an email to the Alluxio User Mailing List. We look forward to seeing you there.

How can I contribute to Alluxio?

Thank you for your interest in contributing. Please read our contributor guide.