flair.embeddings.legacy#
Warning
All embeddings in flair.embeddings.legacy are considered deprecated. There is no guarantee that they still work, and we recommend using different embeddings instead.
- class flair.embeddings.legacy.ELMoEmbeddings(model='original', options_file=None, weight_file=None, embedding_mode='all')View on GitHub#
Bases:
TokenEmbeddings
Contextual word embeddings using a word-level LM, as proposed in Peters et al., 2018. ELMo word vectors can be constructed by combining layers in different ways. The default is to concatenate the top 3 layers of the LM (see the usage sketch at the end of this entry).
- name: str#
- property embedding_length: int#
Returns the length of the embedding vector.
- use_layers_all(x)View on GitHub#
- use_layers_top(x)View on GitHub#
- use_layers_average(x)View on GitHub#
- extra_repr()View on GitHub#
Set the extra representation of the module.
To print customized extra information, you should re-implement this method in your own modules. Both single-line and multi-line strings are acceptable.
- embeddings_name: str#
- training: bool#
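Since this class is deprecated, the following is only a minimal sketch of how it was typically used, assuming the optional allennlp dependency required by ELMo is still installed and that your flair version still ships flair.embeddings.legacy:

```python
from flair.data import Sentence
from flair.embeddings.legacy import ELMoEmbeddings

# instantiate the deprecated ELMo embeddings; with the default embedding_mode='all'
# the top three LM layers are concatenated per token
embedding = ELMoEmbeddings(model='original', embedding_mode='all')

# embed a sentence and inspect the per-token vectors
sentence = Sentence('The grass is green .')
embedding.embed(sentence)
for token in sentence:
    print(token.text, token.embedding.shape)
```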
- class flair.embeddings.legacy.CharLMEmbeddings(model, detach=True, use_cache=False, cache_directory=None)View on GitHub#
Bases:
TokenEmbeddings
Contextual string embeddings of words, as proposed in Akbik et al., 2018.
- __init__(model, detach=True, use_cache=False, cache_directory=None)View on GitHub#
Initializes contextual string embeddings using a character-level language model.
- Parameters:
  - model (str) – model string, one of ‘news-forward’, ‘news-backward’, ‘news-forward-fast’, ‘news-backward-fast’, ‘mix-forward’, ‘mix-backward’, ‘german-forward’, ‘german-backward’, ‘polish-backward’, ‘polish-forward’, depending on which character language model is desired.
  - detach (bool) – if set to False, the gradient will propagate into the language model. This dramatically slows down training and often leads to worse results, so it is not recommended.
  - use_cache (bool) – if set to False, embeddings will not be written to file for later retrieval. This saves disk space but does not allow re-use of already computed embeddings that do not fit into memory.
  - cache_directory (Optional[Path]) – if cache_directory is not set, the cache will be written to ~/.flair/embeddings. Otherwise the cache is written to the provided directory.
Deprecated since version 0.4: Use ‘FlairEmbeddings’ instead.
- name: str#
- train(mode=True)View on GitHub#
Set the module in training mode.
This has an effect only on certain modules. See the documentation of particular modules for details of their behavior in training/evaluation mode, if they are affected, e.g. Dropout, BatchNorm, etc.
- Parameters:
  mode (bool) – whether to set training mode (True) or evaluation mode (False). Default: True.
- Returns:
  self
- Return type:
  Module
- property embedding_length: int#
Returns the length of the embedding vector.
- embeddings_name: str#
- training: bool#
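As the deprecation notice above indicates, FlairEmbeddings is the replacement for this class. A minimal sketch, assuming a current flair installation (the model strings such as ‘news-forward’ are the same ones listed above):

```python
from flair.data import Sentence
from flair.embeddings import FlairEmbeddings

# FlairEmbeddings supersedes CharLMEmbeddings; the same character-LM model strings apply
embedding = FlairEmbeddings('news-forward')

# embed a sentence and inspect the per-token vectors
sentence = Sentence('The grass is green .')
embedding.embed(sentence)
for token in sentence:
    print(token.text, token.embedding.shape)
```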
- class flair.embeddings.legacy.DocumentMeanEmbeddings(token_embeddings)View on GitHub#
Bases:
DocumentEmbeddings
- __init__(token_embeddings)View on GitHub#
The constructor takes a list of embeddings to be combined.
Deprecated since version 0.3.1: The functionality of this class is moved to ‘DocumentPoolEmbeddings’
- name: str#
- property embedding_length: int#
Returns the length of the embedding vector.
- embed(sentences)View on GitHub#
Add embeddings to every sentence in the given list of sentences. If embeddings are already added, updates only if embeddings are non-static.
- embeddings_name: str#
- training: bool#
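Per the deprecation notice, this functionality now lives in DocumentPoolEmbeddings. A minimal sketch using mean pooling, which corresponds to what this class did (WordEmbeddings(‘glove’) is only an example token embedding):

```python
from flair.data import Sentence
from flair.embeddings import WordEmbeddings, DocumentPoolEmbeddings

# mean-pool token embeddings into a single document vector
glove = WordEmbeddings('glove')
document_embedding = DocumentPoolEmbeddings([glove], pooling='mean')

sentence = Sentence('The grass is green .')
document_embedding.embed(sentence)
print(sentence.embedding.shape)
```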
- class flair.embeddings.legacy.DocumentLSTMEmbeddings(embeddings, hidden_size=128, rnn_layers=1, reproject_words=True, reproject_words_dimension=None, bidirectional=False, dropout=0.5, word_dropout=0.0, locked_dropout=0.0)View on GitHub#
Bases:
DocumentEmbeddings
- __init__(embeddings, hidden_size=128, rnn_layers=1, reproject_words=True, reproject_words_dimension=None, bidirectional=False, dropout=0.5, word_dropout=0.0, locked_dropout=0.0)View on GitHub#
The constructor takes a list of embeddings to be combined.
- Parameters:
  - embeddings (List[TokenEmbeddings]) – a list of token embeddings
  - hidden_size – the number of hidden states in the LSTM
  - rnn_layers – the number of layers for the LSTM
  - reproject_words (bool) – boolean value, indicating whether to reproject the token embeddings in a separate linear layer before feeding them into the LSTM or not
  - reproject_words_dimension (Optional[int]) – output dimension of the reprojected token embeddings. If None, the same output dimension as before is used.
  - bidirectional (bool) – boolean value, indicating whether to use a bidirectional LSTM or not
  - dropout (float) – the dropout value to be used
  - word_dropout (float) – the word dropout value to be used; if 0.0, word dropout is not used
  - locked_dropout (float) – the locked dropout value to be used; if 0.0, locked dropout is not used
Deprecated since version 0.4: The functionality of this class is moved to ‘DocumentRNNEmbeddings’
- name: str#
- embeddings_name: str#
- training: bool#
- property embedding_length: int#
Returns the length of the embedding vector.
- embed(sentences)View on GitHub#
Add embeddings to all sentences in the given list of sentences. If embeddings are already added, update only if embeddings are non-static.
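Per the deprecation notice above, DocumentRNNEmbeddings replaces this class. A minimal sketch, assuming a current flair version where rnn_type='LSTM' selects the LSTM variant this class implemented (WordEmbeddings(‘glove’) is only an example token embedding):

```python
from flair.data import Sentence
from flair.embeddings import WordEmbeddings, DocumentRNNEmbeddings

# an LSTM over the token embeddings produces one vector per sentence
glove = WordEmbeddings('glove')
document_embedding = DocumentRNNEmbeddings([glove], hidden_size=128, rnn_type='LSTM')

sentence = Sentence('The grass is green .')
document_embedding.embed(sentence)
print(sentence.embedding.shape)
```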