
I would like some help creating a database view. My database schema looks like this:

products            (id, ignored_comments_ids (array))
activities          (id)
comments            (id)
activities_comments (activity_id, comment_id)
products_comments   (product_id, comment_id)
offers              (product_id, activity_id)

Now I need to create a view of all comments for products, with a custom column named source:

  • source = 'OFFER': comments that come from the products.offers.activities.comments association
  • source = 'DIRECT': comments that come from the products.comments association

    In addition, the view should exclude the comments listed in products.ignored_comments_ids

How do I do that? The view must have product_id, source, and all the columns from the comments table.

I came up with the following view. How can I improve it?

CREATE OR REPLACE VIEW all_comments AS
  WITH the_comments AS (
    SELECT
      comments.*,
      'OFFER'     AS source,
      products.id AS product_id
    FROM comments
    JOIN activities_comments ON activities_comments.comment_id = comments.id
    JOIN activities          ON activities.id = activities_comments.activity_id
    JOIN offers              ON offers.activity_id = activities.id
    JOIN products            ON products.id = offers.product_id
  UNION
    SELECT
      comments.*,
      'DIRECT'    AS source,
      products.id AS product_id
    FROM comments
    JOIN products_comments ON products_comments.comment_id = comments.id
    JOIN products          ON products.id = products_comments.product_id
  )
  SELECT DISTINCT ON (the_comments.id)
    the_comments.id,
    the_comments.name,
    the_comments.source,
    the_comments.product_id
  FROM the_comments
  JOIN products ON products.id = the_comments.product_id
  WHERE NOT to_json(products.ignored_comment_ids)::jsonb @> the_comments.id::jsonb
  ORDER BY the_comments.id;

1 Answer


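UNION merges 2 sets of data AND, at the same time, removes duplicate rows. UNION ALL merges 2 sets of data (and then stops). UNION ALL therefore avoids the overhead of searching for and removing duplicate rows, so it is faster.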

In your initial common table expression (CTE) the_comments, you force each side of the union to carry a different constant, e.g.

select *
from (
    select 1 as id, 'OFFER' AS source
    union
    select 1 as id, 'DIRECT' AS source
    ) d
;
result:
  id   source  
 ---- -------- 
   1   DIRECT  
   1   OFFER 

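Even though id 1 is on both sides of that union, the example query returns 2 rows because the differing constants make the rows distinct. UNION can never merge an 'OFFER' row with a 'DIRECT' row, so please use UNION ALL instead.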
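
To make the difference concrete, here is a small sketch using the same constant-only style as the example above. It needs no tables, and the aliases u and ua are just illustrative names. The rows are identical on both sides this time, so UNION has something to remove:

```sql
-- Identical rows on both sides: UNION collapses them into one row.
SELECT count(*) AS union_rows
FROM (
    SELECT 1 AS id, 'OFFER' AS source
    UNION
    SELECT 1 AS id, 'OFFER' AS source
) u;       -- union_rows = 1

-- UNION ALL simply appends the second set and keeps both rows.
SELECT count(*) AS union_all_rows
FROM (
    SELECT 1 AS id, 'OFFER' AS source
    UNION ALL
    SELECT 1 AS id, 'OFFER' AS source
) ua;      -- union_all_rows = 2
```

One consequence worth checking against your data: with UNION ALL, a comment that reaches the same product through two different offers would appear twice with source 'OFFER'. Whether that matters depends on your requirements.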

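Convenient as it is, select * should not be used in views (although there are arguments both ways). Perhaps it was used here to simplify the question, but I hope it is not used literally. If the purpose of the view is to return only 4 columns, then specify only those columns.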
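
One reason for caution, sketched for PostgreSQL with hypothetical demo_ names that are not part of your schema: the * is expanded into a fixed column list when the view is created, so columns added to the table later do not show up in the view.

```sql
-- Hypothetical stand-in for the comments table.
CREATE TEMP TABLE demo_comments (id int, name text);

-- The * is expanded to (id, name) at creation time.
CREATE TEMP VIEW demo_star AS
SELECT * FROM demo_comments;

-- Add a column to the table after the view exists.
ALTER TABLE demo_comments ADD COLUMN body text;

-- The view still exposes only id and name; body is not included.
SELECT * FROM demo_star;
```

Listing the columns explicitly makes that behaviour visible in the view definition instead of leaving it implicit.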

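Although you need product_id in the output, it can come from offers.product_id or products_comments.product_id, so you don't actually need to join to the products table inside the CTE. After the CTE, products is needed only because the final where clause reads its ignored-comments array.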

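Because we are now using UNION ALL, I can't see any benefit in using SELECT DISTINCT ON (...), and I suspect it is just overhead that can be removed. Obviously I can't verify this, and it may depend entirely on your functional requirements. Also note that SELECT DISTINCT ON (...) will throw away one of the source values you so carefully introduced, e.g.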

select distinct on (id) id, source
from (
    select 1 as id, 'OFFER' AS source
    union
    select 1 as id, 'DIRECT' AS source
    ) d
;
result:
  id   source  
 ---- -------- 
   1   DIRECT   
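
If you do decide you still need one row per comment id, a sketch of a deterministic variant follows. Without an ORDER BY, PostgreSQL does not guarantee which row of each group DISTINCT ON keeps. Preferring the alphabetically first source is only an assumption for illustration, so use whatever priority fits your requirements.

```sql
-- DISTINCT ON keeps the first row of each id group in ORDER BY order,
-- so the ORDER BY decides which source survives.
SELECT DISTINCT ON (id) id, source
FROM (
    SELECT 1 AS id, 'OFFER' AS source
    UNION ALL
    SELECT 1 AS id, 'DIRECT' AS source
) d
ORDER BY id, source;   -- keeps 'DIRECT', which sorts before 'OFFER'
```

If both sources should be visible for the same comment, dropping DISTINCT ON entirely is the simpler option.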

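Don't include an order by clause in any view; order only the "final query". In other words, if you create a view, you will probably use it in several other queries. Each of those queries may have its own where clause and need a different result order. If you order the view, you are just burning CPU cycles on work that is thrown away later. So, please remove the order by clause.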
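
As a sketch of what I mean, with a hypothetical demo_all_comments view built from constant rows so it runs without your tables: the view itself has no order by, and each consuming query sorts the way it needs.

```sql
-- Hypothetical stand-in for the all_comments view, with no ORDER BY inside.
CREATE TEMP VIEW demo_all_comments AS
SELECT v.id, v.name, v.source, v.product_id
FROM (VALUES (2, 'second', 'DIRECT', 10),
             (1, 'first',  'OFFER',  10),
             (3, 'third',  'DIRECT', 11)
     ) AS v (id, name, source, product_id);

-- One consumer filters by product and sorts by id.
SELECT id, name, source
FROM demo_all_comments
WHERE product_id = 10
ORDER BY id;

-- Another consumer filters by source and sorts by name, descending.
SELECT id, name, product_id
FROM demo_all_comments
WHERE source = 'DIRECT'
ORDER BY name DESC;
```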

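I would very much like to propose a different approach for the final where clause, but as I don't work with JSON much, I don't have enough experience to suggest an alternative. However, applying functions to data in a where clause is almost always a cause of poor performance, most obviously because it usually removes access to indexes on the columns those functions touch. Finding a more efficient way to exclude the ignored comments will probably give the biggest improvement in the query's performance.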
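
One possible direction, offered as a sketch I have not benchmarked: if the ignore list is a native integer array (an assumption based on the "(array)" note in the schema, which spells the column ignored_comments_ids while the query uses ignored_comment_ids), the array can be tested directly with <> ALL, with no JSON conversion at all. The demo_ tables below are hypothetical stand-ins for your tables.

```sql
-- Hypothetical stand-ins; the column types are assumptions.
CREATE TEMP TABLE demo_products (
    id                   int PRIMARY KEY,
    ignored_comments_ids int[]
);
CREATE TEMP TABLE demo_product_comments (product_id int, comment_id int);

INSERT INTO demo_products VALUES (1, ARRAY[11, 12]), (2, NULL);
INSERT INTO demo_product_comments VALUES (1, 10), (1, 11), (2, 11);

-- Keep a comment only if it differs from every ignored id.
-- COALESCE turns a NULL ignore list into an empty array, so products
-- without any ignored comments keep all of their rows.
SELECT pc.product_id, pc.comment_id
FROM demo_product_comments AS pc
JOIN demo_products AS p ON p.id = pc.product_id
WHERE pc.comment_id <> ALL (COALESCE(p.ignored_comments_ids, '{}'))
ORDER BY pc.product_id, pc.comment_id;
-- returns (1, 10) and (2, 11); comment 11 is dropped only for product 1
```

Whether this is actually faster on your data is something to confirm with EXPLAIN ANALYZE rather than take from me.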

So, my suggestions would result in this:

WITH the_comments
AS (
    SELECT
        comments.id
      , comments.name
      , 'OFFER' AS source
      , offers.product_id AS product_id
    FROM comments
    JOIN activities_comments ON activities_comments.comment_id = comments.id
    JOIN activities ON activities.id = activities_comments.activity_id
    JOIN offers ON offers.activity_id = activities.id
    UNION ALL
    SELECT
        comments.id
      , comments.name
      , 'DIRECT' AS source
      , products_comments.product_id AS product_id
    FROM comments
    JOIN products_comments ON products_comments.comment_id = comments.id
    )
SELECT
    the_comments.id
  , the_comments.name
  , the_comments.source
  , the_comments.product_id
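FROM the_comments
/* products is joined here only because the filter below reads its ignore list */
JOIN products ON products.id = the_comments.product_id
/* perhaps raise a separate question on this bit */
WHERE NOT to_json(products.ignored_comment_ids)::jsonb @> the_comments.id::jsonb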
answered 2018-10-10T23:39:09.560