
Embeddings

This tutorial shows you how to use Flair to produce embeddings for words and documents. Embeddings are vector representations of text that many downstream tasks build on. All Flair models are trained on top of embeddings, so if you want to train your own models, you should understand how embeddings work.

Example 1: Embedding Words with Transformers

Let's use a standard BERT model (bert-base-uncased) to embed the sentence "the grass is green".

Simply instantiate TransformerWordEmbeddings and call embed() on an example sentence:

from flair.embeddings import TransformerWordEmbeddings
from flair.data import Sentence

# init embedding
embedding = TransformerWordEmbeddings('bert-base-uncased')

# create a sentence
sentence = Sentence('The grass is green .')

# embed words in sentence
embedding.embed(sentence)

This will cause each word in the sentence to be embedded. You can iterate through the words and get each embedding like this:

# now check out the embedded tokens.
for token in sentence:
    print(token)
    print(token.embedding)

This will print each token as a long PyTorch vector:

Token[0]: "The"
tensor([-0.0323, -0.3904, -1.1946, 0.1296, 0.5806, ..], device='cuda:0')
Token[1]: "grass"
tensor([-0.3973, 0.2652, -0.1337, 0.4473, 1.1641, ..], device='cuda:0')
Token[2]: "is"
tensor([ 0.1374, -0.3688, -0.8292, -0.4068, 0.7717, ..], device='cuda:0')
Token[3]: "green"
tensor([-0.7722, -0.1152, 0.3661, 0.3570, 0.6573, ..], device='cuda:0')
Token[4]: "."
tensor([ 0.1441, -0.1772, -0.5911, 0.2236, -0.0497, ..], device='cuda:0')

(Output truncated for readability; the actual vectors are much longer.)
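If you want to check how long these vectors actually are, you can look at the embedding's embedding_length property or at the shape of a single token embedding. A small sketch (for bert-base-uncased the size should be 768):

# the vector size depends on the model; bert-base-uncased produces 768-dimensional vectors
print(embedding.embedding_length)

# the shape of an individual token embedding matches this size
print(sentence[0].embedding.shape)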

Transformer word embeddings are the most important concept in Flair. Check out more info in this dedicated chapter.
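TransformerWordEmbeddings also takes constructor arguments that control, for example, which transformer layers are used and how subword pieces are pooled into a single word vector. A minimal sketch with purely illustrative settings (the defaults work fine; see the dedicated chapter for details):

from flair.embeddings import TransformerWordEmbeddings

# illustrative options: use the last four layers and mean-pool subword pieces
embedding = TransformerWordEmbeddings(
    'bert-base-uncased',
    layers='-1,-2,-3,-4',     # which transformer layers to use
    subtoken_pooling='mean',  # how to combine subword pieces into one word vector
    fine_tune=False,          # keep the transformer weights frozen
)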

Example 2: Embedding Documents with Transformers

Sometimes you want to have an embedding for a whole document, not only individual words. In this case, use one of the DocumentEmbeddings classes in Flair.

Let's again use a standard BERT model to get an embedding for the entire sentence "the grass is green":

from flair.embeddings import TransformerDocumentEmbeddings
from flair.data import Sentence

# init embedding
embedding = TransformerDocumentEmbeddings('bert-base-uncased')

# create a sentence
sentence = Sentence('The grass is green .')

# embed the whole sentence
embedding.embed(sentence)

Now, the whole sentence is embedded. Print the embedding like this:

# now check out the embedded sentence
print(sentence.embedding)
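A common use of document embeddings is to compare sentences. As a small sketch, you could embed a second sentence with the same model and compute the cosine similarity between the two document vectors using plain PyTorch:

import torch

# embed a second sentence with the same model
sentence_2 = Sentence('The sky is blue .')
embedding.embed(sentence_2)

# cosine similarity between the two document vectors
similarity = torch.cosine_similarity(sentence.embedding, sentence_2.embedding, dim=0)
print(similarity)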

Transformer document embeddings are another central concept in Flair. Check out more info in this dedicated chapter.
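Transformers are not the only way to embed whole documents. As an alternative sketch, Flair's DocumentPoolEmbeddings class pools the word embeddings of a sentence (mean pooling by default) into one document vector, here shown with GloVe word embeddings:

from flair.embeddings import WordEmbeddings, DocumentPoolEmbeddings
from flair.data import Sentence

# pool GloVe word embeddings into a single document vector
glove_embedding = WordEmbeddings('glove')
document_embedding = DocumentPoolEmbeddings([glove_embedding])

sentence = Sentence('The grass is green .')
document_embedding.embed(sentence)
print(sentence.embedding)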

How to Stack Embeddings

Flair allows you to combine embeddings into "embedding stacks". When not fine-tuning, combinations of embeddings often give the best results!

Use the StackedEmbeddings class and instantiate it by passing a list of embeddings that you wish to combine. For instance, let's combine classic GloVe embeddings with forward and backward Flair embeddings.

First, instantiate the three embeddings you wish to combine:

from flair.embeddings import WordEmbeddings, FlairEmbeddings

# init standard GloVe embedding
glove_embedding = WordEmbeddings('glove')

# init Flair forward and backward embeddings
flair_embedding_forward = FlairEmbeddings('news-forward')
flair_embedding_backward = FlairEmbeddings('news-backward')

Now instantiate the StackedEmbeddings class and pass it a list containing these three embeddings.

from flair.embeddings import StackedEmbeddings

# create a StackedEmbedding object that combines glove and forward/backward flair embeddings
stacked_embeddings = StackedEmbeddings([
    glove_embedding,
    flair_embedding_forward,
    flair_embedding_backward,
])

That's it! Now just use this embedding like all the other embeddings, i.e. call the embed() method on your sentences.

sentence = Sentence('The grass is green .')

# just embed a sentence using the StackedEmbedding as you would with any single embedding.
stacked_embeddings.embed(sentence)

# now check out the embedded tokens.
for token in sentence:
    print(token)
    print(token.embedding)

Words are now embedded using a concatenation of three different embeddings; the result is still a single PyTorch vector per word, just with a larger dimensionality.
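Because the three vectors are concatenated, the length of the stacked embedding is the sum of the individual embedding lengths. A quick sanity check (the exact sizes depend on the chosen models):

# the stacked embedding length is the sum of its parts
print(glove_embedding.embedding_length)           # e.g. 100 for GloVe
print(flair_embedding_forward.embedding_length)   # e.g. 2048 for news-forward
print(flair_embedding_backward.embedding_length)  # e.g. 2048 for news-backward
print(stacked_embeddings.embedding_length)        # sum of the three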