# Cache Preloading

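Distributed load lets users efficiently load data from the UFS (under file system) into an Alluxio cluster.\
It can be used to initialize an Alluxio cluster so that cached data is available immediately when workloads start running on Alluxio.\
For example, distributed load can prefetch data for machine learning jobs, which speeds up training.\
Distributed load can take advantage of [file segmentation](https://documentation.alluxio.io/ee-ai-cn/ai-3.6/cache/pages/8hnuhjhRC07I02AVyrEw#大文件分段)\
and [multiple replicas](/ee-ai-cn/ai-3.6/data-access/high-availability/multiple-replicas.md) to improve file distribution in scenarios with highly concurrent data access.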

## Usage

There are two recommended ways to trigger a distributed load:

### Job Load CLI

The `job load` command loads data from the UFS (under file system) into an Alluxio cluster.\
The CLI sends a load request to the Alluxio coordinator, which then distributes the load operation to all worker nodes.

```shell
bin/alluxio job load [flags] <path>

# Example output
Progress for loading path '/path':
        Settings:       bandwidth: unlimited    verify: false
        Job State: SUCCEEDED
        Files Processed: 1000
        Bytes Loaded: 125.00MB
        Throughput: 2509.80KB/s
        Block load failure rate: 0.00%
        Files Failed: 0
```

For detailed CLI usage, see the [job load](/ee-ai-cn/ai-3.6/reference/user-cli.md) documentation.
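
When the load is part of an automated pipeline, it can help to check the reported progress before starting the workload. The following is a minimal sketch, not an official tool: it assumes the progress text has the same layout as the example output above, embeds that sample text directly instead of calling the CLI, and uses a hypothetical helper named `get_field` to pull out individual values.

```shell
#!/usr/bin/env bash
# Sample progress text copied from the example output above. In a real script
# this would be captured from the CLI; the sample is embedded here so the
# sketch runs on its own.
progress_output="Progress for loading path '/path':
        Settings:       bandwidth: unlimited    verify: false
        Job State: SUCCEEDED
        Files Processed: 1000
        Bytes Loaded: 125.00MB
        Throughput: 2509.80KB/s
        Block load failure rate: 0.00%
        Files Failed: 0"

# Hypothetical helper: print the value that follows "<label>:" on the first
# line that starts with that label (leading whitespace is ignored).
get_field() {
  printf '%s\n' "$progress_output" \
    | sed -n "s/^[[:space:]]*$1:[[:space:]]*//p" \
    | head -n 1
}

job_state=$(get_field "Job State")
files_failed=$(get_field "Files Failed")

# Treat the load as clean only if the job succeeded and no files failed.
if [ "$job_state" = "SUCCEEDED" ] && [ "$files_failed" -eq 0 ]; then
  load_ok=1
  echo "load finished cleanly"
else
  load_ok=0
  echo "load needs attention: state=$job_state failed=$files_failed"
fi
```

Matching on field labels rather than line positions keeps the check working if the order of the lines changes, although it still depends on the labels shown in the example output.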

### REST API

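Like the command-line tool, the REST API can also be used to load data.\
See the API reference page for more details.\
Note that the job list results include only load jobs from the past seven days. The retention time for historical jobs can be set with the configuration property `alluxio.job.retention.time`.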

