python - python.NLTK (WindowDiff and PK) vs python.Segeval (WindowDiff and PK)

Question

The Python NLTK implementations of Beeferman's Pk and WindowDiff give completely different results from the Python segeval implementations of both, even though I am using the same parameters.

hyp: 0100100000
ref: 0101000000
k=2
PK's SegEval:0.2222222
PK's NLTK:0.111111111

hyp: 111111
ref: 100100
k=2
PK's SegEval:0.4
PK's NLTK:0.64

This could lead to different research results for anyone who uses these libraries.
Why am I getting different Pk results from these two implementations? For a given input, Pk should have a single value.

score 2 · Accepted Answer

There may be a problem with how you are calling the NLTK function, or you may be using an old version of NLTK.

With NLTK, I get the same results that you show for segeval:

>>> from nltk.metrics.segmentation import pk
>>> hyp = '0100100000'
>>> ref = '0101000000'
>>> # NLTK's signature is pk(ref, hyp, k), so the reference goes first
>>> pk(ref, hyp, 2)
0.2222222222222222
>>> hyp = '111111'
>>> ref = '100100'
>>> pk(ref, hyp, 2)
0.4
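
As a cross-check, both values above can be reproduced by hand with a sliding-window count. This is a minimal sketch, assuming Pk is computed as the fraction of length-k windows in which exactly one of the two segmentations contains a boundary. `pk_sketch` is a hypothetical helper written for illustration, not the actual source of NLTK or segeval.

```python
def pk_sketch(ref, hyp, k, boundary="1"):
    """Fraction of length-k windows where ref and hyp disagree on
    whether the window contains at least one boundary.

    Assumption: this mirrors a window-based Pk definition; it is an
    illustrative helper, not library code.
    """
    if len(ref) != len(hyp):
        raise ValueError("segmentations must have the same length")
    n_windows = len(ref) - k + 1
    errors = 0
    for i in range(n_windows):
        ref_has_boundary = boundary in ref[i:i + k]
        hyp_has_boundary = boundary in hyp[i:i + k]
        if ref_has_boundary != hyp_has_boundary:
            errors += 1
    return errors / n_windows


# First pair from the question: 2 of 9 windows disagree.
print(pk_sketch("0101000000", "0100100000", 2))
# Second pair from the question: 2 of 5 windows disagree.
print(pk_sketch("100100", "111111", 2))
```

If a library returns something other than 2/9 and 2/5 for these inputs, the difference most likely comes from the installed version or from how the arguments are passed, not from the metric itself.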

My NLTK version:

>>> import nltk
>>> nltk.__version__
'3.0.5'

To upgrade NLTK, run:

$ pip install -U nltk
