Embeddings#
This tutorial shows you how to use Flair to produce embeddings for words and documents. Embeddings are vector representations of text that downstream models operate on. All Flair models are trained on top of embeddings, so if you want to train your own models, you should first understand how embeddings work.
Example 1: Embedding Words with Transformers#
Let’s use a standard BERT model (bert-base-uncased) to embed the sentence “the grass is green”.
Simply instantiate TransformerWordEmbeddings and call embed() over an example sentence:
from flair.embeddings import TransformerWordEmbeddings
from flair.data import Sentence
# init embedding
embedding = TransformerWordEmbeddings('bert-base-uncased')
# create a sentence
sentence = Sentence('The grass is green .')
# embed words in sentence
embedding.embed(sentence)
This will cause each word in the sentence to be embedded. You can iterate through the words and get each embedding like this:
# now check out the embedded tokens.
for token in sentence:
    print(token)
    print(token.embedding)
This will print each token followed by its embedding, a long PyTorch vector:
Token[0]: "The"
tensor([-0.0323, -0.3904, -1.1946, 0.1296, 0.5806, ..], device='cuda:0')
Token[1]: "grass"
tensor([-0.3973, 0.2652, -0.1337, 0.4473, 1.1641, ..], device='cuda:0')
Token[2]: "is"
tensor([ 0.1374, -0.3688, -0.8292, -0.4068, 0.7717, ..], device='cuda:0')
Token[3]: "green"
tensor([-0.7722, -0.1152, 0.3661, 0.3570, 0.6573, ..], device='cuda:0')
Token[4]: "."
tensor([ 0.1441, -0.1772, -0.5911, 0.2236, -0.0497, ..], device='cuda:0')
(The output is truncated for readability; the actual vectors are much longer.)
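If you want to check the dimensionality yourself, every Flair embedding class exposes an embedding_length property, and each token embedding is a regular PyTorch tensor. A quick sketch (the size shown is for bert-base-uncased; other models differ):
# sketch: inspect the size of the word vectors (768 for bert-base-uncased)
for token in sentence:
    print(token.text, token.embedding.shape)
# the embedding class itself also reports its output dimensionality
print(embedding.embedding_length)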
Transformer word embeddings are the most important concept in Flair. Check out more info in this dedicated chapter.
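TransformerWordEmbeddings also takes keyword arguments that control how the vectors are produced. As a rough sketch (argument names such as layers and fine_tune are taken from recent Flair versions and may differ in yours), you can choose which transformer layers to use and whether to fine-tune the weights:
# sketch: select layers and freeze the transformer weights (argument names assumed from recent Flair versions)
embedding = TransformerWordEmbeddings(
    'bert-base-uncased',
    layers='-1,-2,-3,-4',  # concatenate the last four layers
    fine_tune=False,       # keep the transformer frozen
)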
Example 2: Embedding Documents with Transformers#
Sometimes you want to have an embedding for a whole document, not only individual words. In this case, use one of the DocumentEmbeddings classes in Flair.
Let’s again use a standard BERT model to get an embedding for the entire sentence “the grass is green”:
from flair.embeddings import TransformerDocumentEmbeddings
from flair.data import Sentence
# init embedding
embedding = TransformerDocumentEmbeddings('bert-base-uncased')
# create a sentence
sentence = Sentence('The grass is green .')
# embed the sentence with the document embedding
embedding.embed(sentence)
Now, the whole sentence is embedded. Print the embedding like this:
# now check out the embedded sentence
print(sentence.embedding)
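Unlike the word-level example above, there is now a single vector for the whole sentence. For bert-base-uncased its size should match the model's hidden size of 768; a quick check (assuming the default setup):
# sketch: one vector per sentence (768 dimensions for bert-base-uncased)
print(sentence.embedding.shape)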
TransformerDocumentEmbeddings are another central concept in Flair. Check out more info in this dedicated chapter.
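Document embeddings are handy whenever you need one vector per text, for example to compare two sentences. A minimal sketch using plain PyTorch cosine similarity (this is not a dedicated Flair API, just ordinary tensor math on the embeddings):
import torch
from flair.data import Sentence
from flair.embeddings import TransformerDocumentEmbeddings

embedding = TransformerDocumentEmbeddings('bert-base-uncased')

# embed two sentences and compare their document vectors
sentence_1 = Sentence('The grass is green .')
sentence_2 = Sentence('The lawn is green .')
embedding.embed(sentence_1)
embedding.embed(sentence_2)

similarity = torch.cosine_similarity(
    sentence_1.embedding.unsqueeze(0),
    sentence_2.embedding.unsqueeze(0),
)
print(similarity.item())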
How to Stack Embeddings#
Flair allows you to combine embeddings into “embedding stacks”. When not fine-tuning, combining embeddings often gives the best results!
Use the StackedEmbeddings class and instantiate it by passing a list of the embeddings you wish to combine. For instance, let's combine classic GloVe WordEmbeddings with forward and backward FlairEmbeddings.
First, instantiate the two embeddings you wish to combine:
from flair.embeddings import WordEmbeddings, FlairEmbeddings
# init standard GloVe embedding
glove_embedding = WordEmbeddings('glove')
# init Flair forward and backwards embeddings
flair_embedding_forward = FlairEmbeddings('news-forward')
flair_embedding_backward = FlairEmbeddings('news-backward')
Now instantiate the StackedEmbeddings class and pass it a list containing these three embeddings.
from flair.embeddings import StackedEmbeddings
# create a StackedEmbedding object that combines glove and forward/backward flair embeddings
stacked_embeddings = StackedEmbeddings([
    glove_embedding,
    flair_embedding_forward,
    flair_embedding_backward,
])
That’s it! Now just use this embedding like all the other embeddings, i.e. call the embed() method over your sentences.
sentence = Sentence('The grass is green .')
# just embed a sentence using the StackedEmbedding as you would with any single embedding.
stacked_embeddings.embed(sentence)
# now check out the embedded tokens.
for token in sentence:
    print(token)
    print(token.embedding)
Each word is now embedded with a concatenation of the three embeddings, so the resulting representation is still a single PyTorch vector per token.
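You can verify the concatenation by comparing dimensionalities: the stacked embedding's length is the sum of its parts (the exact numbers depend on the pretrained models you loaded, e.g. 100 for GloVe and 2048 for each Flair embedding):
# sketch: the stacked length is the sum of the individual embedding lengths
print(glove_embedding.embedding_length)
print(flair_embedding_forward.embedding_length)
print(flair_embedding_backward.embedding_length)
print(stacked_embeddings.embedding_length)  # sum of the three values above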