haskell - 在 Haskell 广度优先中构建二叉树（不是 BST）

Question

我最近开始使用 Haskell，可能会持续一段时间。只是被要求使用它来更好地理解我在 Uni 上的一门课程的函数式编程。

现在我有一个小问题，我目前正在尝试做的事情。我想以广度优先构建它，但我认为我的条件搞砸了，或者我的条件也错了。

所以基本上如果我给它 [“A1-Gate”, “North-Region”, “South-Region”, “Convention Center”, “Rectorate”, “Academic Building1”, “Academic Building2”]and [0.0, 0.5, 0.7, 0.3, 0.6, 1.2, 1.4, 1.2]，我的树应该像

我试图实现的树布局

但是我的试运行结果不是我预期的哈哈。因此，Haskell 的一位特别敏锐的专家可能会帮助我发现我做错了什么。输出：

*Main> l1 = ["A1-Gate", "North-Region", "South-Region", "Convention Center", 
             "Rectorate", "Academic Building1", "Academic Building2"]
*Main> l3 = [0.0, 0.5, 0.7, 0.3, 0.6, 1.2, 1.4, 1.2]
*Main> parkingtree = createBinaryParkingTree l1 l3
*Main> parkingtree
Node "North-Region" 0.5 
   (Node "A1-Gate" 0.0 EmptyTree EmptyTree) 
   (Node "Convention Center" 0.3 
     (Node "South-Region" 0.7 EmptyTree EmptyTree) 
     (Node "Academic Building2" 1.4 
       (Node "Academic Building1" 1.2 EmptyTree EmptyTree) 
       (Node "Rectorate" 0.6 EmptyTree EmptyTree)))

A-1 Gate 应该是根，但它最终成为一个没有孩子的孩子，所以条件非常混乱。

如果我能得到一些指导，那会有所帮助。以下是我到目前为止所写的内容：

data Tree = EmptyTree | Node [Char] Float Tree Tree deriving (Show,Eq,Ord)

insertElement location cost EmptyTree = 
   Node location cost EmptyTree EmptyTree
insertElement newlocation newcost (Node location cost left right) = 
   if (left == EmptyTree && right == EmptyTree)
   then Node location cost (insertElement newlocation newcost EmptyTree) 
                           right
   else if (left == EmptyTree && right /= EmptyTree)
        then Node location cost (insertElement newlocation newcost EmptyTree) 
                                right
        else if (left /= EmptyTree && right == EmptyTree)
             then Node location cost left 
                                (insertElement newlocation newcost EmptyTree)
             else Node newlocation newcost EmptyTree
                                (Node location cost left right)

buildBPT [] = EmptyTree
--buildBPT (xs:[]) = insertElement (fst xs) (snd xs) (buildBPT [])
buildBPT (x:xs) = insertElement (fst x) (snd x) (buildBPT xs)

createBinaryParkingTree a b = buildBPT (zip a b)

感谢您提供的任何指导。是的，我已经查看了一些类似的问题，但我确实认为我的问题有所不同，但如果您认为某个帖子有一个明确的答案，这将有助于我愿意去看看它。

score 8 · Accepted Answer

这是一个核心递归解决方案。

{-#  bft(Xs,T) :- bft( Xs, [T|Q], Q).    % if you don't read Prolog, see (*) 

     bft(     [],    Nodes ,      [])  :-  maplist( =(empty), Nodes).
     bft( [X|Xs], [N|Nodes], [L,R|Q])  :-  N = node(X,L,R), 
        bft( Xs,     Nodes,       Q).
#-}

data Tree a = Empty | Node a (Tree a) (Tree a) deriving Show

bft :: [a] -> Tree a
bft xs = head nodes    -- Breadth First Tree
  where
  nodes = zipWith g (map Just xs ++ repeat Nothing)  -- values and
                                                     --   Empty leaves...
                    (pairs $ tail nodes)             -- branches...
  g (Just x) (lt,rt)  =  Node x lt rt
  g Nothing  _        =  Empty
  pairs ~(a: ~(b:c))  =  (a,b) : pairs c
{-
  nodes!!0 = g (Just (xs!!0)) (nodes!!1, nodes!!2)            .
  nodes!!1 = g (Just (xs!!1)) (nodes!!3, nodes!!4)        .       .
  nodes!!2 = g (Just (xs!!2)) (nodes!!5, nodes!!6)      .   .   .   .
  ................                                    .................
-}

nodes是结果树的所有子树的广度优先枚举。树本身是顶部子树，即此列表中的第一个子树。我们从input 中的Node每个创建 s ，当输入用尽时，我们通过使用不定数量的s 来创建 s （叶子的真实长度是，但我们不需要关心）。xxsEmptyNothingEmptylength xs + 1

而且我们根本不需要计算。

测试：

> bft [1..4]
Node 1 (Node 2 (Node 4 Empty Empty) Empty) (Node 3 Empty Empty)

> bft [1..10]
Node 1 
   (Node 2 
      (Node 4 
         (Node 8  Empty Empty) 
         (Node 9  Empty Empty)) 
      (Node 5 
         (Node 10 Empty Empty) 
         Empty)) 
   (Node 3 
      (Node 6 Empty Empty) 
      (Node 7 Empty Empty))

它是如何工作的：关键是g' 的懒惰，它不会强制lt' 和rt' 的值，而元组结构很容易由 - 本身非常懒 - 提供服务pairs。因此，当作为g. 但是，对于下一个xin ，由thisxs引用的节点成为下一个调用's result。 ltg

然后rt轮到，等等。当xs结束时，我们点击Nothings，完全停止从 ' 的输出g中提取值。pairs所以也pairs停止前进nodes，因此永远不会完成，尽管它被定义为超过那个点的无休止的Emptys 流，只是为了安全起见。

(*) Prolog 的变量被明确设置一次：它们被允许处于尚未分配的状态。Haskell(x:xs)是 Prolog 的[X | Xs].

伪代码：维护一个队列；入队“未分配的指针”；对于每个xin xs: { 将当前队列头中的指针设置到Node(x, lt, rt)where lt，rt是未分配的指针；入队lt；入队rt；弹出队列}; 将队列中剩余的所有指针设置为Empty；在队列的原始头部找到结果树，即原始第一个“未分配指针”（或“空框”而不是“未分配指针”是另一种选择）。

这个 Prolog 的“队列”当然是完全持久的：“弹出”不会改变任何数据结构，也不会改变对队列前头的任何未完成的引用——它只是将当前指针推进到队列中。所以在所有这些排队之后剩下的是构建树节点的bfs 枚举，树本身是它的头元素 - 树是它的顶部节点，两个子节点完全实例化到底部叶子枚举完成的时间。

更新： @dfeuer提出了它的简化版本，它更接近 Prolog 原版（帖子顶部评论中的那个），可以更清晰。在他的帖子中寻找更有效的代码和讨论等内容。使用 simple[]而不是 dfeuerdata IS a = a :+ IS a对子树队列使用更有效的无限流类型，它变成

bftree :: [a] -> Tree a
bftree xs = t
    where
    t : q  =  go xs q
    go []       _              =  repeat Empty
    go (x:ys) ~(l : ~(r : q))  =  Node x l r : go ys q

      ---READ-- ----READ----      ---WRITE---

{-
   xs    =  [ x  x2  x3  x4  x5  x6  x7  x8   … ]
   (t:q) =  [ t  l   r   ll  lr  rl  rr  llr  …  Empty Empty  … … ]
-}

为了比较，树的广度优先枚举的相反操作是

bflist :: Tree a -> [a]
bflist t = [x | Node x _ _ <- q]
    where
    q  =  t : go 1 q
    go 0  _                =          []
    go i (Empty      : q)  =          go (i-1) q
    go i (Node _ l r : q)  =  l : r : go (i+1) q

         -----READ------     --WRITE--

工作原理bftree：t : q是树的子树的广度优先顺序列表。go (x:ys)使用的特定调用l和r 在它们之前由随后的调用定义go，或者与另一个x更进一步的调用，ys或者go []总是返回Empty。结果t是这个列表中的第一个，树的最顶层节点，即树本身。

这个树节点列表是通过递归调用创建go的，其速度与消耗输入值列表的速度相同，但作为输入xs消耗的速度是该速度的两倍，因为每个节点都有两个子节点。go

因此，这些额外的节点也必须定义为Empty叶子。我们不在乎需要多少，只是创建一个无限的列表来满足任何需求，尽管实际的空叶子数量会比原来的多一个xs。

这实际上与几十年来计算机科学中用于数组支持树的方案相同，其中树节点以广度优先顺序放置在线性数组中。奇怪的是，在这样的设置中，两种转换都是无操作的——只有我们对相同数据的解释是什么发生了变化，我们对它的处理，我们如何与之交互/使用它。

score 7 · Accepted Answer

更新：下面的解决方案是 big-O 最优的并且（我认为）很容易理解，所以我把它留在这里以防万一有人感兴趣。然而，Will Ness 的解决方案要漂亮得多，尤其是在稍微优化后，可以预期在实践中表现得更好。更值得学习！

我现在将忽略虚假的边缘标签，只关注正在发生的事情的核心。

算法设计中的一个常见模式是有时更容易解决更普遍的问题。因此，与其尝试构建一棵树，我将研究如何用给定数量的树构建一个森林（树木列表）。我将使节点标签具有多态性，以避免考虑它们的外观；您当然可以对原始树类型使用相同的构建技术。

data Tree a = Empty | Node a (Tree a) (Tree a)

-- Built a tree from a breadth-first list
bft :: [a] -> Tree a
bft xs = case dff 1 xs of
  [] -> Empty
  [t] -> t
  _ -> error "something went wrong"

-- Build a forest of nonempty trees.
-- The given number indicates the (maximum)
-- number of trees to build.
bff :: Int -> [a] -> [Tree a]
bff _ [] = []
bff n xs = case splitAt n xs of
  (front, rear) -> combine front (bff (2 * n) rear)
  where
    combine :: [a] -> [Tree a] -> [Tree a]
    -- you write this

这是一个完整的、工业级的、最大程度的惰性实现。这是我能想出的最有效的版本，它尽可能地懒惰。一个轻微的变体不那么懒惰，但仍然适用于完全定义的无限输入；我没有尝试测试在实践中哪个会更快。

bft' :: [a] -> Tree a
bft' xs = case bff 1 xs of
  [] -> Empty
  [t] -> t
  _ -> error "whoops"

bff' :: Int -> [a] -> [Tree a]
bff' !_ [] = []
bff' n xs = combine n xs (bff (2 * n) (drop n xs))
  where
    -- The "take" portion of the splitAt in the original
    -- bff is integrated into this version of combine. That
    -- lets us avoid allocating an intermediate list we don't
    -- really need.
    combine :: Int -> [a] -> [Tree a] -> [Tree a]
    combine 0 !_ ~[] = [] -- These two lazy patterns are just documentation
    combine _k [] ~[] = []
    combine k (y : ys) ts = Node y l r : combine (k - 1) ys dropped
      where
        (l, ~(r, dropped)) = case ts of  -- This lazy pattern matters.
          [] -> (Empty, (Empty, []))
          t1 : ts' -> (t1, case ts' of
            [] -> (Empty, [])
            t2 : ts'' -> (t2, ts''))

对于不太懒惰的变体，请相应地替换(!l, ~(!r, dropped))并(!l, !r, dropped)调整 RHS。

为了真正的工业实力，森林应该使用元素严格的列表来表示：

data SL a = Cons !a (SL a) | Nil

并且上面的对(l, ~(r, dropped))都应该使用类似的类型来表示

data LSP a b = LSP !a b

这应该避免一些（相当便宜的）运行时检查。更重要的是，它可以更容易地看到事情在哪里以及没有被强迫。

score 4 · Accepted Answer

您似乎选择的方法是向后构建树：从下到上，从右到左；从列表的最后一个元素开始。这使您的buildBPT功能看起来不错，但要求您insertElement过于复杂。要以这种方式以广度优先方式构建二叉树，在前三个之后的每一步都需要一些困难的枢轴。

向树中添加 8 个节点需要以下步骤（查看节点如何从最后一个插入到第一个）：

   .              4                                                                                                                                                                                          
                6   6                                                                                                                                                                                        
   8           7 8 . .                                                                                                                                                                                       
  . .                                                                                                                                                                                                           
                  3                                                                                                                                                                                          
   7            4   5                                                                                                                                                                                        
  8 .          6 7 8 .                                                                                                                                                                                       

   6              2                                                                                                                                                                                          
  7 8           3   4                                                                                                                                                                                        
               5 6 7 8                                                                                                                                                                                       
   5                                                                                                                                                                                                         
 6   7            1                                                                                                                                                                                      
8 . . .       2       3                                                                                                                                                                                  
            4   5   6   7                                                                                                                                                                                
           8 . . . . . . .

相反，如果您从左到右、从上到下插入节点，您最终会得到一个更简单的解决方案，不需要旋转，而是需要一些树结构内省。查看插入顺序；在任何时候，现有值都保持在原来的位置：

   .              1                                                                                                                                                                                               
                2   3                                                                                                                                                                                             
   1           4 5 . .                                                                                                                                                                                            
  . .                                                                                                                                                                                                             
                  1                                                                                                                                                                                               
   1            2   3                                                                                                                                                                                             
  2 .          4 5 6 .                                                                                                                                                                                            

   1              1                                                                                                                                                                                               
  2 3           2   3                                                                                                                                                                                             
               4 5 6 7                                                                                                                                                                                            
   1                                                                                                                                                                                                              
 2   3            1                                                                                                                                                                                           
4 . . .       2       3                                                                                                                                                                                       
            4   5   6   7                                                                                                                                                                                     
           8 . . . . . . .

插入步骤具有渐近时间复杂度，其顺序O(n^2)是n要插入的节点数，因为您要一个接一个地插入节点，然后迭代树中已经存在的节点。

当我们从左到右插入时，诀窍是检查左子树是否完整：

如果是，并且右子树不完整，则向右递归。
如果是，并且右子树也完整，则向左递归（开始新行）。
如果不是，则向左递归。

这是我的（更通用的）解决方案：

data Tree a = Leaf | Node a (Tree a) (Tree a)
            deriving (Eq, Show)

main = do
    let l1 = ["A1-Gate", "North-Region", "South-Region", "Convention Center", 
              "Rectorate", "Academic Building1", "Academic Building2"]
    let l2 = [0.0, 0.5, 0.7, 0.3, 0.6, 1.2, 1.4, 1.2]
    print $ treeFromList $ zip l1 l2

mkNode :: a -> Tree a
mkNode x = Node x Leaf Leaf

insertValue :: Tree a -> a -> Tree a
insertValue Leaf y = mkNode y
insertValue (Node x left right) y
    | isComplete left && nodeCount left /= nodeCount right = Node x left (insertValue right y)
    | otherwise = Node x (insertValue left y) right
    where nodeCount Leaf = 0
          nodeCount (Node _ left right) = 1 + nodeCount left + nodeCount right
          depth Leaf = 0
          depth (Node _ left right) = 1 + max (depth left) (depth right)
          isComplete n = nodeCount n == 2 ^ (depth n) - 1

treeFromList :: (Show a) => [a] -> Tree a
treeFromList = foldl insertValue Leaf

编辑：更详细的解释：

这个想法是记住插入节点的顺序：先从左到右，然后从上到下。我在实际函数中压缩了不同的情况，但您可以将它们扩展为三种：

左侧是否完整？如果没有，则插入左侧。
右侧和左侧一样完整吗？如果没有，则插入到右侧。
两边都满了，所以我们通过插入到左边开始一个新的关卡。

因为函数从左到右和从上到下填充节点，所以我们总是知道（它是一个不变量）左侧必须在右侧之前填充，并且左侧永远不能更多比右侧深一层（也不能比右侧浅）。

通过跟踪第二组示例树的增长，您可以看到值是如何在此不变量之后插入的。这足以递归地描述该过程，因此它可以外推到任意大小的列表（递归就是魔法）。

现在，我们如何确定一棵树是否“完整”？好吧，如果它是完美平衡的，或者如果 - 在视觉上 - 它的值形成一个三角形，它就是完整的。当我们使用二叉树时，三角形的底边（填充时）必须有多个等于 2 的幂的值。更具体地说，它必须具有2^(depth-1)值。在示例中自己计算：

depth = 1 -> base = 1: 2^(1-1) = 1
depth = 2 -> base = 2: 2^(2-1) = 2
depth = 3 -> base = 4: 2^(3-1) = 4
depth = 4 -> base = 8: 2^(4-1) = 8

底面以上的节点总数比底面的宽度少一：2^(n-1) - 1. 因此，完整树中的节点总数是基数以上的节点数加上基数，所以：

num nodes in complete tree = 2^(depth-1) - 1 + 2^(depth-1)
                           = 2 × 2^(depth-1) - 1
                           = 2^depth - 1

所以现在我们可以说一棵树是完整的，如果它有完全2^depth - 1非空的节点。

因为我们从左到右，从上到下，当左边完成时，我们向右移动，当右边和左边一样完整时（意味着它有相同数量的节点，也就是因为不变量它也是完整的），那么我们知道整棵树是完整的，因此必须添加一个新行。

我最初在那里有三种特殊情况：当两个节点都为空时，当左节点为空时（因此右节点也是空的）以及当右节点为空时（因此左节点不能为空）。这三种特殊情况被最后一种带有警卫的情况所取代：

如果两边都是空的，那么countNodes left == countNodes right，因此我们添加另一行（左侧）。
如果左边是空的，那么两边都是空的（见前一点）。
如果右边是空的，那么左边一定有深度 1 和节点数 1，这意味着它是完整的，并且1 /= 0，所以我们添加到右边。

haskell - 在 Haskell 广度优先中构建二叉树（不是 BST）

3 回答 3

Related

Reference