Sentiment analysis with transformers
from transformers import BertForSequenceClassification, BertTokenizer
## Step 1 - Pre-processing
MODEL = 'YOUR MODEL'
TEXT = "YOUR INPUTS"
tokenizer = BertTokenizer.from_pretrained(MODEL)
tokens = tokenizer.encode_plus(TEXT,
                               max_length=512,          # max number of tokens in each sample
                               truncation=True,         # truncate tokens beyond max_length
                               padding='max_length',    # pad shorter sequences with 0's up to max_length
                               add_special_tokens=True, # add special tokens ([CLS], [SEP], [PAD])
                               return_tensors='pt')     # return TensorFlow ('tf') / PyTorch ('pt') / NumPy ('np') tensors
# from the tokens we need:
# input_ids      - token ID representations
# attention_mask - tells which tokens to calculate attention for
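# Illustrative sanity check (not in the original gist): with a single input
# string and padding='max_length', both tensors come back shaped (1, 512).
print(tokens['input_ids'].shape)       # torch.Size([1, 512])
print(tokens['attention_mask'].shape)  # torch.Size([1, 512])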
## Step 2 - Feed into the model
model = BertForSequenceClassification.from_pretrained(MODEL)
activations = model(**tokens)  # unpack input_ids / attention_mask as keyword arguments
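# Illustrative (not in the original gist): the first element of the model
# output holds the raw logits, one unnormalized score per sentiment class.
logits = activations[0]
print(logits.shape)  # torch.Size([1, num_labels]), e.g. (1, 5) for a 5-class model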
## Step 3 - Get sentiment
import tensorflow as tf
probabilities = tf.nn.softmax(activations[0].detach().numpy())  # detach from the autograd graph and convert to NumPy first
predictions = tf.math.argmax(probabilities, axis=1)             # pick the highest-probability class
predictions.numpy()
# alternatively, if using PyTorch
import torch
probs = torch.nn.functional.softmax(activations[0], dim=-1)
pred = torch.argmax(probs, dim=-1)  # pick the highest-probability class
pred.item()                         # as a plain Python int
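# End-to-end usage sketch (assumptions, not part of the original gist):
# 'nlptown/bert-base-multilingual-uncased-sentiment' is one public checkpoint
# compatible with this code; it predicts a 1-5 star rating. Any BERT
# sequence-classification checkpoint works, and model.config.id2label maps
# the predicted index back to a human-readable label.
# MODEL = 'nlptown/bert-base-multilingual-uncased-sentiment'
# TEXT = "I really enjoyed this movie!"
# ... run Steps 1-3 above ...
# print(model.config.id2label[pred.item()])  # e.g. '5 stars'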