rust - Why doesn't Rayon require Arc<_>?

Question

On page 465 of Programming Rust, you can find the following code and explanation (emphasis added by me):

use std::sync::Arc;

fn process_files_in_parallel(filenames: Vec<String>,
                             glossary: Arc<GigabyteMap>)
    -> io::Result<()>
{
    ...
    for worklist in worklists {
        // This call to .clone() only clones the Arc and bumps the
        // reference count. It does not clone the GigabyteMap.
        let glossary_for_child = glossary.clone();
        thread_handles.push(
            spawn(move || process_files(worklist, &glossary_for_child))
        );
    }
    ...
}
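We've changed the type of glossary: to run the analysis in parallel, the caller must pass in an Arc<GigabyteMap>, a smart pointer to a GigabyteMap that's been moved into the heap, by doing Arc::new(giga_map). When we call glossary.clone(), we are making a copy of the Arc smart pointer, not the whole GigabyteMap. This amounts to incrementing a reference count. With this change, the program compiles and runs, because it no longer depends on reference lifetimes. As long as any thread owns an Arc<GigabyteMap>, it will keep the map alive, even if the parent thread bails out early. There won't be any data races, because data in an Arc is immutable.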
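To make the quoted passage concrete, here is a minimal sketch of the Arc approach. It is not the book's code: GigabyteMap is replaced by a hypothetical HashMap alias, and count_hits_with_arc is an illustrative name, so treat it as an assumption-laden example rather than the original program.

```rust
use std::collections::HashMap;
use std::sync::Arc;
use std::thread;

// Hypothetical stand-in for the book's GigabyteMap (an assumption for illustration).
type GigabyteMap = HashMap<String, String>;

/// Spawns one thread per word; each thread owns its own clone of the Arc.
/// Returns how many of the words are present in the glossary.
fn count_hits_with_arc(words: Vec<String>, glossary: Arc<GigabyteMap>) -> usize {
    let mut handles = Vec::new();
    for word in words {
        // Clones only the pointer and bumps the reference count;
        // the map itself is not copied.
        let glossary_for_child = Arc::clone(&glossary);
        // thread::spawn requires a 'static closure, so the thread must own
        // everything it touches. The Arc clone satisfies that requirement.
        handles.push(thread::spawn(move || {
            glossary_for_child.contains_key(&word) as usize
        }));
    }
    // Join every thread and add up the per-thread results.
    handles.into_iter().map(|h| h.join().unwrap()).sum()
}
```

Passing &GigabyteMap into thread::spawn here would not compile, because the compiler cannot prove the spawned thread finishes before the borrowed map goes away.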

In the next section, they show this rewritten with Rayon:

extern crate rayon;

use rayon::prelude::*;

fn process_files_in_parallel(filenames: Vec<String>, glossary: &GigabyteMap)
    -> io::Result<()>
{
    filenames.par_iter()
        .map(|filename| process_file(filename, glossary))
        .reduce_with(|r1, r2| {
            if r1.is_err() { r1 } else { r2 }
        })
        .unwrap_or(Ok(()))
}

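You can see that the section rewritten to use Rayon accepts &GigabyteMap rather than Arc<GigabyteMap>. They don't explain how this works, though. Why doesn't Rayon require Arc<GigabyteMap>? How does Rayon get away with accepting a plain reference?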

score 2 · Accepted Answer

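Rayon can guarantee that the iterator does not outlive the current stack frame, unlike what I assume is thread::spawn in the first code example. Specifically, par_iter under the hood uses something like Rayon's scope function, which lets you spawn units of work that are "attached" to the stack and are joined before that stack frame ends.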

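Because Rayon can guarantee (via lifetime bounds, from the user's perspective) that the tasks/threads are joined before the function calling par_iter exits, it can provide an API that is more ergonomic than the standard library's thread::spawn.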

Rayon expands on this in the documentation for its scope function.
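The same idea can be sketched without Rayon, using the standard library's std::thread::scope (stable since Rust 1.63). This is a swapped-in technique to keep the example dependency-free, not Rayon's actual implementation; GigabyteMap is again a hypothetical HashMap alias and count_hits_scoped is an illustrative name.

```rust
use std::collections::HashMap;
use std::thread;

// Hypothetical stand-in for the book's GigabyteMap (an assumption for illustration).
type GigabyteMap = HashMap<String, String>;

/// Looks up each word on its own scoped thread, borrowing the glossary
/// by plain reference. Returns how many words are present.
fn count_hits_scoped(words: &[String], glossary: &GigabyteMap) -> usize {
    // thread::scope does not return until every thread spawned inside it
    // has been joined, so the borrows cannot outlive this call.
    thread::scope(|s| {
        let handles: Vec<_> = words
            .iter()
            // The closure only copies the references; no Arc is needed.
            .map(|word| s.spawn(move || glossary.contains_key(word) as usize))
            .collect();
        // Collect the per-thread results before the scope closes.
        handles.into_iter().map(|h| h.join().unwrap()).sum()
    })
}
```

Because the scope joins its threads before returning, the borrow checker accepts &GigabyteMap here, which is the kind of guarantee the answer attributes to Rayon.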
