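I'm following this tutorial, which builds a sentiment analysis classifier using BERT with the Hugging Face library, and I'm seeing very odd behavior. When I try the BERT model on a sample text, I get a string instead of the hidden state. This is the code I'm using: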
import transformers
from transformers import BertModel, BertTokenizer
print(transformers.__version__)
PRE_TRAINED_MODEL_NAME = 'bert-base-cased'
PATH_OF_CACHE = "/home/mwon/data-mwon/paperChega/src_classificador/data/hugingface"
tokenizer = BertTokenizer.from_pretrained(PRE_TRAINED_MODEL_NAME, cache_dir=PATH_OF_CACHE)
sample_txt = 'When was I last outside? I am stuck at home for 2 weeks.'
encoding_sample = tokenizer.encode_plus(
    sample_txt,
    max_length=32,
    add_special_tokens=True,  # Add '[CLS]' and '[SEP]'
    return_token_type_ids=False,
    padding=True,
    truncation=True,
    return_attention_mask=True,
    return_tensors='pt',  # Return PyTorch tensors
)
bert_model = BertModel.from_pretrained(PRE_TRAINED_MODEL_NAME, cache_dir=PATH_OF_CACHE)
last_hidden_state, pooled_output = bert_model(
    encoding_sample['input_ids'],
    encoding_sample['attention_mask']
)
print([last_hidden_state,pooled_output])
Output:
4.0.0
['last_hidden_state', 'pooler_output']
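One possible explanation, offered as a sketch rather than a confirmed diagnosis: the printed strings look like dictionary keys, which is what you would get if the model call returns a dict-like object and the code tuple-unpacks it. The snippet below reproduces that effect without loading BERT. `FakeModelOutput` is a hypothetical stand-in (an `OrderedDict` subclass with placeholder values), not the real transformers class.

```python
from collections import OrderedDict


# Hypothetical stand-in for a dict-like model output.
# This is an assumption for illustration, not the real transformers class.
class FakeModelOutput(OrderedDict):
    pass


outputs = FakeModelOutput(
    last_hidden_state=[[0.1, 0.2]],  # placeholder values, not real activations
    pooler_output=[0.3],
)

# Tuple unpacking iterates over the mapping, and iterating a mapping yields its keys.
first, second = outputs
print([first, second])

# Looking the fields up by key returns the stored values instead.
hidden = outputs["last_hidden_state"]
pooled = outputs["pooler_output"]
print(hidden, pooled)
```

If this is indeed what is happening with transformers 4.0.0, the corresponding change would be to keep the result as a single object and read `outputs.last_hidden_state` and `outputs.pooler_output`, or to pass `return_dict=False` to the model call so that it returns a plain tuple.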