python - 组合和的记忆与非记忆时间复杂度分析

Question

我试图理解为什么使用lru_cache来解决这个问题会产生更慢的代码性能。

问题本质上是返回所有加起来达到某个目标的组合。

我正在使用lru_cache装饰器进行记忆（docs），这是我的解决方案：

from functools import lru_cache

def combinationSum(candidates, target):
    return dfs(tuple(candidates), 0, target)

@lru_cache(maxsize=None)
def dfs(candidates, i, target):
    if target < 0:
        return []

    if target == 0:
        return [[]]

    if i == len(candidates):
        return []

    final_results = []
    for j in range(i, len(candidates)):

        results = dfs(candidates, j, target - candidates[j])

        for x in results:
            final_results.append([candidates[j]] + x)

    return final_results

似乎当lru_cache装饰器被注释掉时，这个算法的运行速度几乎提高了 50%。这似乎有点违反直觉，因为我认为应该降低解决方案的时间复杂度，即使增加了从记忆中检索结果的函数调用开销。

对于记忆的解决方案，我认为时间复杂度应该是数组的长度在O(n^2*k*2^n)哪里，并且是从到范围内的所有数字。nk0target

这是我的分析（需要一点帮助验证）：

time complexity 
= possible states for memoization x work done at each step
= (n * k) * (n * maximum size of results)
= n * k * n * 2^n

在如何分析递归解决方案的时间复杂度方面，我也缺少一些知识空白，我可以为此提供一些帮助！

编辑：

我range(1, 10000)用作测试输入，以下是基准：

# with lru_cache
$ time python3 combination_sum.py
CacheInfo(hits=59984, misses=49996, maxsize=None, currsize=49996)

real    0m4.031s
user    0m3.996s
sys     0m0.024s

# without lru_cache
$ time python3 combination_sum.py

real    0m0.073s
user    0m0.060s
sys     0m0.010s

score 1 · Accepted Answer

你没有给出两个论点，它们都很重要。通过选择特定的配对，我可以使任何一个版本都比另一个版本快得多。如果您传递range(1, 10000)as candidates，那么每个缓存查找都必须（除其他外）进行 9999 次比较，以确定候选者始终相同 - 这是巨大的开销。尝试，例如，

combinationSum(range(1, 1000), 45) # not ten thousand, just one thousand

对于缓存版本更快的情况。之后：

>>> dfs.cache_info()
CacheInfo(hits=930864, misses=44956, maxsize=None, currsize=44956)

如果您不考虑进行缓存查找的费用，那么“分析”是徒劳的，并且您显然正在尝试缓存查找非常昂贵的情况。Dict 查找是 expected-case O(1)，但隐藏的常数因子可以任意大，具体取决于相等测试的成本（对于涉及N-element 元组的键，建立相等至少需要N比较）。

这应该表明一个重大改进：远离candidates参数列表。它是不变的，所以真的没有必要通过它。然后缓存只需要存储快速比较(i, target)对。

编辑：实际变化

这是另一个没有传入的代码版本candidates。为了

combinationSum(range(1, 10000), 45)

在我的盒子上，它至少快了 50 倍。target还有另一个重大变化：当减少到零以下时不要进行递归调用。大量缓存条目正在记录(j, negative_integer)参数的空列表结果。在上述情况下，此更改将最终缓存大小从 449956 减少到 1036 - 并将命中数从 9444864 减少到 6853。

def combinationSum(candidates, target):

    @lru_cache(maxsize=None)
    def dfs(i, target):
        if target == 0:
            return [[]]
        assert target > 0
        if i == n:
            return []
        final_results = []
        for j in range(i, n):
            cand = candidates[j]
            if cand <= target:
                results = dfs(j, target - cand)
                for x in results:
                    final_results.append([cand] + x)
        return final_results

    n = len(candidates)
    result = dfs(0, target)
    print(dfs.cache_info())
    return result

score 0 · Accepted Answer

尝试在您的结果上运行以下命令

>>>dfs.cache_info()

你应该得到这样的结果

CacheInfo(hits=2, misses=216, maxsize=None, currsize=216)

因为你的函数参数很长，所以它们不经常与缓存值匹配，我在这里责怪目标参数，重组程序可能会大大提高命中率。

python - 组合和的记忆与非记忆时间复杂度分析

2 回答 2

编辑：实际变化

Related

Reference