clojure - 编写一个尽可能延迟的类似展开器的函数来生成任意分解

Question

问题表述

通俗地说，我想编写一个函数，它以生成二进制分解和一个元素（通常是中性）的函数作为输入，创建一个任意长度的分解生成器。更具体地说，让我们首先nfoldr在 Clojure 中定义函数。

(defn nfoldr [f e]
  (fn rec [n]
    (fn [s]
      (if (zero? n)
        (if (empty? s) e)
        (if (seq s)
          (if-some [x ((rec (dec n)) (rest s))]
            (f (list (first s) x))))))))

这里nil使用的意思是“未定义的输出，输入不在函数的域中”。此外，让我们将函数的逆关系f视为定义的集值函数inv(f)(y) = {x | f(x) = y}。

我想定义一个函数nunfoldr，这样inv(nfoldr(f , e)(n)) = nunfoldr(inv(f) , e)(n)当每个元素y inv(f)(y)都是有限的时，对于每个二元函数f、元素e和自然数n。

此外，我希望在二维的懒惰意义上尽可能懒惰地生成分解。我的目标是，当第一次获得分解的某些部分时，下一部分或下一个分解所需的（很多）计算不会发生。类似地，当第一次获得一个因式分解时，下一个分解不需要进行计算，而之前的所有分解都完全实现了。

在另一种形式中，我们可以使用以下较长版本的nfoldr，这相当于较短e的是中性元素。

(defn nfoldr [f e]
  (fn [n]
    (fn [s]
      (if (zero? n)
        (if (empty? s) e)
        ((fn rec [n]
           (fn [s]
             (if (= 1 n)
               (if (and (seq s) (empty? (rest s))) (first s))
               (if (seq s)
                 (if-some [x ((rec (dec n)) (rest s))]
                   (f (list (first s) x)))))))
         n)))))

一个特例

这个问题是该问题中描述的生成分区问题的概括。让我们看看如何将旧问题简化为当前问题。我们有每个自然数n：

npt(n) = inv(nconcat(n)) = inv(nfoldr(concat2 , ())(n)) = nunfoldr(inv(concat2) , ())(n) = nunfoldr(pt2 , ())(n)

在哪里：

npt(n)生成n-ary 分区
nconcat(n)计算n-ary 连接
concat2计算二进制连接
pt2生成二进制分区

因此，以下定义给出了该问题的解决方案。

(defn generate [step start]
  (fn [x] (take-while some? (iterate step (start x)))))

(defn pt2-step [[x y]]
  (if (seq y) (list (concat x (list (first y))) (rest y))))

(def pt2-start (partial list ()))

(def pt2 (generate pt2-step pt2-start))

(def npt (nunfoldr pt2 ()))

score 0 · Accepted Answer

我将总结我解决这个问题的故事，使用旧的创建示例运行，并以一些观察和扩展建议作为结尾。

解决方案 0

起初，我改进/概括了我为解决旧问题所采取的方法。在这里，我编写了我自己的版本concatandmap主要是为了更好地展示，在的情况下concat，为了增加一些懒惰。当然，我们也可以使用 Clojure 的版本或mapcat代替。

(defn fproduct [f]
  (fn [s]
    (lazy-seq
      (if (and (seq f) (seq s))
        (cons
          ((first f) (first s))
          ((fproduct (rest f)) (rest s)))))))

(defn concat' [s]
  (lazy-seq
    (if (seq s)
      (if-let [x (seq (first s))]
        (cons (first x) (concat' (cons (rest x) (rest s))))
        (concat' (rest s))))))

(defn map' [f]
  (fn rec [s]
    (lazy-seq
      (if (seq s)
        (cons (f (first s)) (rec (rest s)))))))

(defn nunfoldr [f e]
  (fn rec [n]
    (fn [x]
      (if (zero? n)
        (if (= e x) (list ()) ())
        ((comp
           concat'
           (map' (comp
                   (partial apply map)
                   (fproduct (list
                               (partial partial cons)
                               (rec (dec n))))))
           f)
         x)))))

为了获得内在的懒惰，我们可以(partial partial cons)用类似的东西来代替(comp (partial partial concat) list)。尽管通过这种方式我们得到了 inner LazySeqs，但我们并没有获得任何有效的惰性，因为在consing 之前，已经发生了完全实现该rest部分所需的大部分计算，这在这种通用方法中似乎是不可避免的。基于更长的版本nfoldr，我们还可以定义如下更快的版本。

(defn nunfoldr [f e]
  (fn [n]
    (fn [x]
      (if (zero? n)
        (if (= e x) (list ()) ())
        (((fn rec [n]
            (fn [x] (println \< x \>)
              (if (= 1 n)
                (list (list x))
                ((comp
                   concat'
                   (map' (comp
                           (partial apply map)
                           (fproduct (list
                                       (partial partial cons)
                                       (rec (dec n))))))
                   f)
                 x))))
          n)
         x)))))

在这里，我println在主递归函数中添加了一个调用，以获得一些渴望的可视化。因此，让我们展示外在的懒惰和内在的渴望。

user=> (first ((npt 5) (range 3)))
< (0 1 2) >
< (0 1 2) >
< (0 1 2) >
< (0 1 2) >
< (0 1 2) >
(() () () () (0 1 2))
user=> (ffirst ((npt 5) (range 3)))
< (0 1 2) >
< (0 1 2) >
< (0 1 2) >
< (0 1 2) >
< (0 1 2) >
()

解决方案 1

然后我想到了一个更有前途的方法，使用该函数：

(defn transpose [s]
  (lazy-seq
    (if (every? seq s)
      (cons
        (map first s)
        (transpose (map rest s))))))

为了获得新的解决方案，我们将map'调用中的前一个参数替换为：

(comp
  (partial map (partial apply cons))
  transpose
  (fproduct (list
              repeat
              (rec (dec n)))))

试图获得我们可以替换(partial apply cons)的内在懒惰，#(cons (first %) (lazy-seq (second %)))但这还不够。问题在于(every? seq s)内部的测试transpose，其中检查延迟分解序列是否为空（作为停止条件）导致实现它。

解决方案 2

解决我想到的前一个问题的第一种方法是使用一些关于n元素的 -ary 因式分解数量的额外知识。这样我们可以repeat多次使用这个序列作为的停止条件transpose。所以我们将里面的测试替换为transpose，添加(seq (first s))一个输入并将调用中的参数替换为：countnunfoldrmap'

(comp
  (partial map #(cons (first %) (lazy-seq (second %))))
  transpose
  (fproduct (list
              (partial apply repeat)
              (rec (dec n))))
  (fn [[x y]] (list (list ((count (dec n)) y) x) y)))

让我们转向分区问题并定义：

(defn npt-count [n]
   (comp
     (partial apply *)
     #(map % (range 1 n))
     (partial comp inc)
     (partial partial /)
     count))

(def npt (nunfoldr pt2 () npt-count))

现在我们可以证明外在和内在的懒惰。

user=> (first ((npt 5) (range 3)))
< (0 1 2) >
(< (0 1 2) >
() < (0 1 2) >
() < (0 1 2) >
() < (0 1 2) >
() (0 1 2))
user=> (ffirst ((npt 5) (range 3)))
< (0 1 2) >
()

然而，对额外知识的依赖和额外的计算成本使得这种解决方案无法接受。

解决方案 3

最后，我认为在一些关键的地方我应该使用一种“带有非惰性结束”的惰性序列，以便能够在不知不觉中检查空虚。一个这样的空序列只是一个非惰性空列表，总体而言，它们的行为有点像lazy-consClojure 早期的 s。使用下面给出的定义，我们可以得到一个可接受的解决方案，该解决方案是在以下假设下工作concat'的因式分解，我们使用的是较长版本的nunfoldr.

(def lazy? (partial instance? clojure.lang.IPending))
(defn empty-eager? [x] (and (not (lazy? x)) (empty? x)))

(defn transpose [s]
  (lazy-seq
    (if-not (some empty-eager? s)
      (cons
        (map first s)
        (transpose (map rest s))))))

(defn concat' [s]
  (if-not (empty-eager? s)
    (lazy-seq
      (if-let [x (seq (first s))]
        (cons (first x) (concat' (cons (rest x) (rest s))))
        (concat' (rest s))))
    ()))

(defn map' [f]
  (fn rec [s]
    (if-not (empty-eager? s)
      (lazy-seq (cons (f (first s)) (rec (rest s))))
      ())))

请注意，在这种方法中，输入函数f应该生成新类型的惰性序列，并且生成的n-ary 分解器也将生成这样的序列。为了处理新的输入需求，对于分区问题，我们定义：

(defn pt2 [s]
  (lazy-seq
    (let [start (list () s)]
      (cons
        start
        ((fn rec [[x y]]
           (if (seq y)
             (lazy-seq
               (let [step (list (concat x (list (first y))) (rest y))]
                 (cons step (rec step))))
             ()))
         start)))))

再一次，让我们展示外在和内在的懒惰。

user=> (first ((npt 5) (range 3)))
< (0 1 2) >
< (0 1 2) >
(< (0 1 2) >
() < (0 1 2) >
() < (0 1 2) >
() () (0 1 2))
user=> (ffirst ((npt 5) (range 3)))
< (0 1 2) >
< (0 1 2) >
()

为了使输入和输出使用标准的惰性序列（牺牲一点惰性），我们可以添加：

(defn lazy-end->eager-end [s]
  (if (seq s)
    (lazy-seq (cons (first s) (lazy-end->eager-end (rest s))))
    ()))

(defn eager-end->lazy-end [s]
  (lazy-seq
    (if-not (empty-eager? s)
      (cons (first s) (eager-end->lazy-end (rest s))))))

(def nunfoldr
  (comp
    (partial comp (partial comp eager-end->lazy-end))
    (partial apply nunfoldr)
    (fproduct (list
                (partial comp lazy-end->eager-end)
                identity))
    list))

观察和扩展

在创建解决方案 3 时，我观察到 Clojure 中用于惰性序列的旧机制可能不一定不如当前机制。随着过渡，我们获得了创建惰性序列而无需进行任何大量计算的能力，但失去了检查空虚的能力，而无需进行获取更多元素所需的计算。因为这两种能力在某些情况下都很重要，如果引入一种新机制会很好，它可以结合以前的优点。这样的机制可以再次使用外部LazySeqthunk，当被强制时将返回一个空列表或一个Cons或另一个LazySeq或一个新的LazyConsthunk。这个新的 thunk 在强制时会返回 aCons或者可能是 another LazyCons。现在empty?只会强制LazySeqthunks whilefirst并且rest还会强制LazyCons. 在此设置中map可能如下所示：

(defn map [f s]
   (lazy-seq
     (if (empty? s) ()
       (lazy-cons
         (cons (f (first s)) (map f (rest s)))))))

我还注意到，从解决方案 1 开始采用的方法有助于进一步推广。如果在调用的参数内部，我们用、笛卡尔积的一些实现和另一个递归调用替换map'更长的时间，那么我们可以创建“在不同位置拆分”的版本。例如，使用以下作为参数，我们可以定义一个“在中间分裂”并对应于一个易于想象的函数。请注意，所有“拆分策略”在关联时都是等效的。nunfoldrconsconcattransposerepeatnunfoldmnfoldmf

(comp
  (partial map (partial apply concat))
  cproduct
  (fproduct (let [n-half (quot n 2)]
              (list (rec n-half) (rec (- n n-half))))))

另一种自然修改将允许无限分解。为了实现这一点，如果f生成了无限分解，nunfoldr(f , e)(n)应该以 ω 类型的顺序生成分解，以便它们中的每一个都可以在有限时间内生成。

其他可能的扩展包括删除n参数、创建关系折叠（与我们在这里考虑的关系展开相对应）和一般处理除序列以外的代数数据结构作为输入/输出。这本书，我刚刚发现，似乎包含有价值的相关信息，以分类/关系语言给出。

最后，为了能够更方便地进行这种编程，我们可以将其转换为无点代数设置。这将需要构建相当多的“额外机器”，实际上几乎是制造一种新语言。本文演示了这种语言。