neural-network - NLP ELMo model pruning input

Question

I am trying to retrieve word embeddings from the pretrained ELMo model available on TensorFlow Hub. The code I am using is modified from here: https://www.geeksforgeeks.org/overview-of-word-embedding-using-embeddings-from-language-models-elmo/

The sentence I am inputting is:
bod = " is coming up in and every project is expected to do a video due on we look forward to discussing this with you at our meeting this this time they have laid out the selection criteria for the video award s go for the top spot this time "

and these are the keywords I want embeddings for:
words = ["do", "a", "video"]

embeddings = elmo([bod],
                  signature="default",
                  as_dict=True)["elmo"]
init = tf.initialize_all_variables()
sess = tf.Session()
sess.run(init)

This sentence is 236 characters long.

But when I pass this sentence to the ELMo model, the tensor that is returned only has a length of 48.

This becomes a problem when I try to extract embeddings for keywords that fall outside the 48-length limit, because the indices I compute for the keywords are larger than this length.

This is the code I used to get the indices of the words in 'bod' (as shown above):

num_list=[]
for item in words:
  print(item)
  index = bod.index(item)
  num_list.append(index)
num_list

But I keep running into an error.

I tried looking for ELMo documentation to explain why this is happening but I have not found anything related to this problem of pruned input.

Any advice is much appreciated!

Thank You

score 0 · Accepted Answer

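This isn't really an AllenNLP question, since you are using a TensorFlow-based implementation of ELMo.

That said, I think the problem is that ELMo embeds tokens, not characters. You are getting 48 embeddings because the string has 48 tokens.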
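
To make the token-versus-character distinction concrete, here is a minimal sketch in plain Python. It assumes the tokens are obtained by splitting the sentence on whitespace (which, as far as I know, is how the "default" signature treats an untokenized string); the names tokens and token_indices are only illustrative and are not part of the original code.

```python
# The sentence and keywords from the question.
bod = " is coming up in and every project is expected to do a video due on we look forward to discussing this with you at our meeting this this time they have laid out the selection criteria for the video award s go for the top spot this time "
words = ["do", "a", "video"]

# Assumption: the model sees one token per whitespace-separated word.
tokens = bod.split()

# Look up each keyword in the token list rather than in the raw string.
# bod.index(item) returns a character offset, which can exceed the
# number of tokens; tokens.index(item) returns a token position.
token_indices = [tokens.index(word) for word in words]

print(len(tokens))     # 48
print(token_indices)   # [10, 11, 12]
```

These token positions are the ones that line up with the length-48 axis of the returned tensor. Note that list.index returns only the first occurrence, so a word that appears more than once (for example "video") would need a different lookup if you want a later occurrence.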
