我们可以对此进行测试。首先,让我们构建一个包含一周日期和几百万行的表:
create or replace table one_week2
as
select '2020-04-01'::date + (7*seq8()/100000000)::int day, random() data, random() data2, random() data3
from table(generator(rowcount => 100000000))
现在我们可以编写两个查询来遍历这个表:
选项1:
With A1 AS (select * from one_week2 where day='2020-04-05'),
A2 AS (select * from one_week2 where day='2020-04-05'),
A3 AS (select * from one_week2 where day='2020-04-05'),
A4 AS (select * from one_week2 where day='2020-04-05'),
A5 AS (select * from one_week2 where day='2020-04-05'),
final_merged_table as (
select * from a1
union all select * from a2
union all select * from a3
union all select * from a4
union all select * from a5)
SELECT count(*)
FROM final_merged_table
选项 2:
With A1 AS (select * from one_week2),
A2 AS (select * from one_week2),
A3 AS (select * from one_week2),
A4 AS (select * from one_week2),
A5 AS (select * from one_week2),
final_merged_table as (
select * from a1
union all select * from a2
union all select * from a3
union all select * from a4
union all select * from a5)
SELECT count(*)
FROM final_merged_table
where day='2020-04-05'
;
当我们运行这些查询时,两者的配置文件看起来相同 - 因为过滤器已被按下:
选项 1 配置文件

选项 2 配置文件

总之
您可以信任 Snowflake 优化器。
信任很重要,但也要验证:有时优化器可能会被复杂的 CTE 弄糊涂。有时,Snowflake engs 优化了优化器,今天不起作用的东西,明天可以更好地工作。
