# What is Alluxio?

Alluxio is a distributed data orchestration system that brings your data closer to your compute frameworks. It acts as a caching layer between your persistent storage (such as Amazon S3, HDFS, or Azure Blob Storage) and your compute frameworks (such as Spark, Presto, and PyTorch).

By caching frequently accessed data on the compute cluster, Alluxio speeds up data access, reduces network congestion, and relieves I/O bottlenecks. This matters most for data-intensive applications such as AI/ML training and large-scale data analytics.
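The caching idea can be illustrated with a toy read-through cache. This is a minimal sketch of the general pattern, not Alluxio's implementation: the `ReadThroughCache` class, the in-memory `remote_store` dictionary, and the key name are all hypothetical stand-ins chosen for illustration.

```python
from typing import Callable, Dict


class ReadThroughCache:
    """Toy read-through cache: serve local copies, fall back to the remote store."""

    def __init__(self, fetch_remote: Callable[[str], bytes]) -> None:
        self._fetch_remote = fetch_remote
        self._local: Dict[str, bytes] = {}
        self.remote_reads = 0  # how many times the remote store was contacted

    def read(self, key: str) -> bytes:
        if key not in self._local:
            # Cache miss: pay the remote round trip once and keep a local copy.
            self.remote_reads += 1
            self._local[key] = self._fetch_remote(key)
        return self._local[key]


# Hypothetical stand-in for a remote object store such as S3.
remote_store = {"train/part-0": b"example bytes"}
cache = ReadThroughCache(remote_store.__getitem__)

first = cache.read("train/part-0")   # miss: fetched from the remote store
second = cache.read("train/part-0")  # hit: served from the local copy
print(cache.remote_reads)
```

Only the first read reaches the remote store; repeated reads of the same key are served locally, which is the effect the paragraph above describes.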

<figure><img src="https://903014663-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F7DicXrir64osDa951OPc%2Fuploads%2Fgit-blob-2c43435a3d0305387141f325d45c42076041508c%2Falluxio-overview.png?alt=media" alt="Alluxio overview diagram"><figcaption></figcaption></figure>

### Why Use Alluxio?

You should consider using Alluxio if you are experiencing any of the following challenges:

* **Slow AI/ML Training:** Your expensive GPUs are often idle, waiting for data to be fetched from slow object stores, leading to long training times and high costs.
* **Slow Model Cold Starts:** When you deploy a new model for inference, the first requests are slow because the model must be downloaded from a remote object store. This "cold start" problem degrades the user experience and can become a bottleneck for autoscaling.
* **Data Silos:** Your data is spread across multiple data centers or cloud providers, and you need a unified way to access it without complex data migration.
* **High Egress Costs:** You are paying high fees to your cloud provider for repeatedly reading the same data from object storage.

Alluxio solves these problems by:

* **Accelerating Performance:** By caching data, Alluxio can improve I/O performance by over 10x for both model training and deployment.
* **Providing Seamless Data Access:** Alluxio provides standard APIs like POSIX (FUSE), S3, and FSSpec, allowing your applications to connect to your data without any code changes.
* **Enabling High Scalability:** The distributed architecture can scale to handle billions of objects and thousands of clients.
* **Reducing Costs:** By reducing data egress and eliminating the need for specialized, high-performance storage hardware, Alluxio helps lower your total cost of ownership.
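To illustrate the "no code changes" point for the POSIX (FUSE) interface, the sketch below reads a file with ordinary file I/O. It assumes only that a FUSE mount appears to the application as a regular directory. The `load_sample` helper, the file layout, and the example mount path `/mnt/alluxio` are hypothetical; a temporary directory stands in for the mount so the example runs anywhere.

```python
import tempfile
from pathlib import Path


def load_sample(data_root: str, relative_path: str) -> bytes:
    """Read a file with plain file I/O; nothing here is Alluxio-specific."""
    return (Path(data_root) / relative_path).read_bytes()


# A temporary directory stands in for the mount point. In a real deployment,
# data_root would be wherever the FUSE mount lives (hypothetical: /mnt/alluxio).
with tempfile.TemporaryDirectory() as data_root:
    sample = Path(data_root) / "dataset" / "sample.bin"
    sample.parent.mkdir(parents=True)
    sample.write_bytes(b"\x00\x01\x02")

    payload = load_sample(data_root, "dataset/sample.bin")
    print(len(payload))
```

Because the application only sees a directory path, switching between a local disk and a mounted file system is a matter of changing `data_root`, not the reading code.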

### Next Steps

* **Learn how it works:** Dive deeper into the architecture in [How Alluxio Works](https://documentation.alluxio.io/ee-ai-en/how-alluxio-works).
* **Install Alluxio:** Ready to deploy? See the [Get Started Guide](https://documentation.alluxio.io/ee-ai-en/start).

