I have the following text:
text = '''If you're in construction or need to pass fire inspection, or just want fire resistant materials for peace of mind, this is the one to use. Check out 3rd party sellers as well Skylite'''
I applied NLTK chunking to it and got a tree as output.
import nltk
sentences = nltk.sent_tokenize(text)
sentences = [nltk.word_tokenize(sent) for sent in sentences]
sentences = [nltk.pos_tag(sent) for sent in sentences]
grammar = """NP: {<DT>?<JJ>*<NN.*>+}
RELATION: {<V.*>}
{<DT>?<JJ>*<NN.*>+}
ENTITY: {<NN.*>}"""
cp = nltk.RegexpParser(grammar)
for i in sentences:
    result = cp.parse(i)
    print(result)
    print(type(result))
    result.draw()
The output for the first sentence is:
(S If/IN you/PRP (RELATION 're/VBP) in/IN (NP construction/NN) or/CC (NP need/NN) to/TO (RELATION pass/VB) (NP fire/NN inspection/NN) ,/, or/CC just/RB (RELATION want/VB) (NP fire/NN) (NP resistant/JJ materials/NNS) for/IN (NP peace/NN) of/IN (NP mind/NN) ,/, this/DT (RELATION is/VBZ) (NP the/DT one/NN) to/TO (RELATION use/VB) ./.)
How can I get the noun phrases (the NP chunks) as a list of strings, like this:
[construction, need, fire inspection, fire, resistant materials, peace, mind, the one]
Any suggestions would be appreciated.
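
One possible approach, as a minimal sketch rather than a definitive answer: walk the chunk tree with `Tree.subtrees()`, keep only the subtrees labeled `NP`, and join the words of each subtree's leaves. The helper name `noun_phrases` is hypothetical. To keep the sketch runnable without tokenizer or tagger models, the tagged tokens are copied by hand from the output shown above, and only the `NP` rule of the grammar is used (an assumption made for brevity). The same function should work on the `result` tree from the loop.

```python
import nltk

# Tagged tokens copied from the parser output above (assumption: this
# matches what nltk.pos_tag produced for the first sentence).
tagged = [("If", "IN"), ("you", "PRP"), ("'re", "VBP"), ("in", "IN"),
          ("construction", "NN"), ("or", "CC"), ("need", "NN"), ("to", "TO"),
          ("pass", "VB"), ("fire", "NN"), ("inspection", "NN"), (",", ","),
          ("or", "CC"), ("just", "RB"), ("want", "VB"), ("fire", "NN"),
          ("resistant", "JJ"), ("materials", "NNS"), ("for", "IN"),
          ("peace", "NN"), ("of", "IN"), ("mind", "NN"), (",", ","),
          ("this", "DT"), ("is", "VBZ"), ("the", "DT"), ("one", "NN"),
          ("to", "TO"), ("use", "VB"), (".", ".")]

# Only the NP rule from the original grammar is needed for this step.
cp = nltk.RegexpParser("NP: {<DT>?<JJ>*<NN.*>+}")
tree = cp.parse(tagged)


def noun_phrases(tree):
    """Return the text of every NP chunk in the tree, in document order."""
    return [" ".join(word for word, tag in subtree.leaves())
            for subtree in tree.subtrees(lambda t: t.label() == "NP")]


print(noun_phrases(tree))
```

Inside the original loop this would be called as `noun_phrases(result)` for each sentence, and the per-sentence lists could be concatenated if a single list for the whole text is wanted.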