
I'm new to regex matching in Hive and am struggling to find the right pattern for matching word boundaries:

haystack RLIKE concat('(?i)\b', 'needle', '\b')

returns nothing.

Sample values in my database:

haystack
---------
needless to say
this is a needle
so many (needle)
these are needles

When I use haystack RLIKE concat('(?i)', 'needle'), it returns all rows, but I am actually looking only for "this is a needle".


1 Answer


Use two backslashes in Hive: \\b

Demo:

with mytable as (
select stack(4,
'needless to say',
'this is a needle',
'so many (needle)',
'these are needles'
) as haystack
)

select haystack, haystack rlike concat('(?i)\\b', 'needle', '\\b') from mytable;

Result:

haystack             _c1
needless to say      false
this is a needle     true
so many (needle)     true
these are needles    false

Note that "so many (needle)" also matches, because ( and ) are not word characters.
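
Hive's RLIKE relies on Java regular expressions, so the same behavior can be reproduced in plain Java. The following is only a minimal sketch under that assumption, not Hive code: the class name WordBoundaryDemo and the helper containsWord are hypothetical names chosen for illustration. It shows that the doubled backslash yields the regex token \b (word boundary), while a single backslash in the string literal yields a backspace character, which is most likely why the original pattern matched nothing.

```java
import java.util.regex.Pattern;

public class WordBoundaryDemo {
    // Hypothetical helper: true if `word` occurs in `text` as a whole word,
    // ignoring case. Uses find(), i.e. a match anywhere in the text.
    static boolean containsWord(String text, String word) {
        // In a Java string literal, "\\b" is the two-character regex token \b.
        // Pattern.quote keeps any special characters in `word` literal.
        Pattern p = Pattern.compile("(?i)\\b" + Pattern.quote(word) + "\\b");
        return p.matcher(text).find();
    }

    public static void main(String[] args) {
        String[] rows = {
            "needless to say",
            "this is a needle",
            "so many (needle)",
            "these are needles"
        };
        for (String row : rows) {
            System.out.println(row + " -> " + containsWord(row, "needle"));
        }

        // With a single backslash, "\b" is the backspace character (U+0008),
        // so this pattern looks for literal backspaces around "needle".
        boolean singleBackslash =
            Pattern.compile("(?i)\bneedle\b").matcher("this is a needle").find();
        System.out.println("single backslash -> " + singleBackslash);
    }
}
```

The four rows give the same true/false values as the Hive result above, and the single-backslash pattern finds no match.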

Answered 2021-03-02T10:16:47.040