
I have used BERT with HuggingFace and PyTorch for training and evaluation, along with DataLoader and Serializer. Here is the code:

! pip install transformers==3.5.1
import numpy as np
import torch
from torch.utils.data import TensorDataset, DataLoader
from transformers import AutoModel, BertTokenizerFast

bert = AutoModel.from_pretrained('bert-base-uncased')
tokenizer = BertTokenizerFast.from_pretrained('bert-base-uncased')


def textToTensor(text,labels=None,paddingLength=30):
  
  tokens = tokenizer.batch_encode_plus(text.tolist(), max_length=paddingLength, padding='max_length', truncation=True)
  
  text_seq = torch.tensor(tokens['input_ids'])
  text_mask = torch.tensor(tokens['attention_mask'])

  text_y = None
  if isinstance(labels, np.ndarray): # only convert labels when they were provided
    text_y = torch.tensor(labels.tolist())

  return text_seq, text_mask, text_y


text = test_df['text'].values

seq,mask,_ = textToTensor(text,paddingLength=35)
data = TensorDataset(seq,mask)
dataloader = DataLoader(data,batch_size=1)

for step,batch in enumerate(dataloader):
  batch = [t.to(device) for t in batch]
  sent_id, mask = batch

  with torch.no_grad():
    print(np.argmax(model(sent_id, mask).detach().cpu().numpy(),1))

This gives me a numpy array as the result, and since batch_size=1 was used with no Serializer, each result comes back as a single-element array holding the predicted class number.

I have two questions:

Do the results strictly follow the index order of df['text']?
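My understanding is that DataLoader keeps the dataset order when shuffle is left at its default of False, which would mean yes, but I would like confirmation. A small check I tried (toy tensors, not my actual data):

```python
import torch
from torch.utils.data import TensorDataset, DataLoader

# Three "rows"; each row's first value marks its original position
data = TensorDataset(torch.tensor([[0, 1], [2, 3], [4, 5]]))
loader = DataLoader(data, batch_size=1)  # shuffle defaults to False

# Collect the position marker of each batch as it arrives
order = [batch[0][0, 0].item() for batch in loader]
print(order)  # batches arrive in dataset order: [0, 2, 4]
```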

How can I get the prediction for a single sentence, like "hello this is me"? I mean just one sentence.

Can someone help me with making such a prediction?
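For the second question, this is my naive attempt (predict_single is my own helper name, and I am not sure this is the right way; it assumes my model returns raw class scores when called with ids and mask):

```python
import numpy as np
import torch

def predict_single(model, tokenizer, sentence, device='cpu', padding_length=35):
    # batch_encode_plus expects a list of texts, so wrap the single sentence
    tokens = tokenizer.batch_encode_plus(
        [sentence], max_length=padding_length,
        padding='max_length', truncation=True)
    seq = torch.tensor(tokens['input_ids']).to(device)
    mask = torch.tensor(tokens['attention_mask']).to(device)
    with torch.no_grad():
        logits = model(seq, mask)  # assumed to return class scores, shape (1, num_classes)
    # Pick the highest-scoring class for the single (batch of one) input
    return int(np.argmax(logits.detach().cpu().numpy(), axis=1)[0])
```

which I would then call as `predict_single(model, tokenizer, "hello this is me")` — is this correct?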

