

Classifier(*args, **kwargs)

Abstract base class for all Flair models that do classification.

DeepNCMDecoder(label_dictionary, embeddings_size)

Deep Nearest Class Mean (DeepNCM) Classifier for text classification tasks.

DefaultClassifier(embeddings, ...[, ...])

Default base class for all Flair models that do classification.

LabelVerbalizerDecoder(label_embedding, ...)

A class for decoding labels using the idea of siamese networks / bi-encoders.

LockedDropout([dropout_rate, batch_first, ...])

Implementation of locked (or variational) dropout.

Model(*args, **kwargs)

Abstract base class for all downstream task models in Flair, such as SequenceTagger and TextClassifier.

PrototypicalDecoder(num_prototypes, ...[, ...])

WordDropout([dropout_rate, inplace])

Implementation of word dropout.

class flair.nn.Classifier(*args, **kwargs)

Bases: Model[DT], Generic[DT], ReduceTransformerVocabMixin, ABC

Abstract base class for all Flair models that do classification.

The classifier inherits from flair.nn.Model and adds unified functionality for single- and multi-label classification and evaluation. Because all classifiers share the same evaluation logic, comparisons between them are fair.

evaluate(data_points, gold_label_type, out_path=None, embedding_storage_mode='none', mini_batch_size=32, main_evaluation_metric=('micro avg', 'f1-score'), exclude_labels=None, gold_label_dictionary=None, return_loss=True, **kwargs)

Evaluates the model. Returns a Result object containing evaluation results and a loss value.

Implement this to enable evaluation.

  • data_points (Union[list[TypeVar(DT, bound= DataPoint)], Dataset]) – The labeled data_points to evaluate.

  • gold_label_type (str) – The label type indicating the gold labels

  • out_path (Union[str, Path, None]) – Optional output path to store predictions.

  • embedding_storage_mode (Literal['none', 'cpu', 'gpu']) – One of ‘none’, ‘cpu’ or ‘gpu’. ‘none’ means all embeddings are deleted and freshly recomputed, ‘cpu’ means all embeddings are stored on CPU, or ‘gpu’ means all embeddings are stored on GPU

  • mini_batch_size (int) – The batch_size to use for predictions.

  • main_evaluation_metric (tuple[str, str]) – Specify which metric to highlight as main_score.

  • exclude_labels (Optional[list[str]]) – Specify classes that won’t be considered in evaluation.

  • gold_label_dictionary (Optional[Dictionary]) – Specify which classes should be considered. All other classes will be treated as <unk>.

  • return_loss (bool) – Whether to additionally compute the loss on the data points.

  • **kwargs – Arguments that will be ignored.

Return type:

Result

The evaluation results.
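The default main_evaluation_metric is ('micro avg', 'f1-score'). As a rough illustration of what a micro-averaged F1 score measures, and of how excluding labels changes it, the following sketch pools true positives, false positives and false negatives over all data points. It is not Flair's implementation; the function name micro_f1 and the toy data are assumptions made for this example.

```python
from typing import Optional


def micro_f1(
    gold: list[set[str]],
    predicted: list[set[str]],
    exclude_labels: Optional[list[str]] = None,
) -> float:
    """Micro-averaged F1 over per-data-point label sets (illustrative only)."""
    excluded = set(exclude_labels or [])
    true_pos = false_pos = false_neg = 0
    for gold_labels, predicted_labels in zip(gold, predicted):
        # Labels listed in exclude_labels are ignored on both sides.
        gold_labels = gold_labels - excluded
        predicted_labels = predicted_labels - excluded
        true_pos += len(gold_labels & predicted_labels)
        false_pos += len(predicted_labels - gold_labels)
        false_neg += len(gold_labels - predicted_labels)
    denominator = 2 * true_pos + false_pos + false_neg
    return 2 * true_pos / denominator if denominator else 0.0


# Hypothetical gold and predicted labels for four data points.
gold = [{"POSITIVE"}, {"NEGATIVE"}, {"POSITIVE"}, {"NEUTRAL"}]
predicted = [{"POSITIVE"}, {"POSITIVE"}, {"POSITIVE"}, {"NEUTRAL"}]

score = micro_f1(gold, predicted)
score_without_neutral = micro_f1(gold, predicted, exclude_labels=["NEUTRAL"])
```

Micro-averaging pools the counts across all classes before computing F1, so frequent classes influence the score more than rare ones.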

abstract predict(sentences, mini_batch_size=32, return_probabilities_for_all_classes=False, verbose=False, label_name=None, return_loss=False, embedding_storage_mode='none')

Uses the model to predict labels for a given set of data points.

The method does not directly return the predicted labels. Rather, labels are added as flair.data.Label objects to the respective data points. You can then access these predictions by calling flair.data.DataPoint.get_labels() on each data point that you passed through this method.

  • sentences (Union[list[TypeVar(DT, bound= DataPoint)], TypeVar(DT, bound= DataPoint)]) – The data points for which the model should predict labels, most commonly Sentence objects.

  • mini_batch_size (int) – The mini batch size to use. Setting this value higher typically makes predictions faster, but also costs more memory.

  • return_probabilities_for_all_classes (bool) – If set to True, the model will store probabilities for all classes instead of only the predicted class.

  • verbose (bool) – If set to True, will display a progress bar while predicting. By default, this parameter is set to False.

  • return_loss (bool) – Set this to True to return loss (only possible if gold labels are set for the sentences).

  • label_name (Optional[str]) – Optional parameter that, if set, changes the identifier of the label type that is predicted.

  • embedding_storage_mode (Literal['none', 'cpu', 'gpu']) – Default is ‘none’, which is best in most cases. Only set to ‘cpu’ or ‘gpu’ if you wish to not only predict, but also keep the generated embeddings in CPU or GPU memory respectively.
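The in-place behaviour described above (labels are attached to the data points rather than returned) can be sketched with plain Python objects. Everything here is hypothetical: ToyDataPoint and toy_predict are stand-ins invented for illustration, and the "model" is a placeholder rule, not a Flair classifier.

```python
from dataclasses import dataclass, field


@dataclass
class ToyDataPoint:
    """Hypothetical stand-in for a data point that can carry labels."""

    text: str
    labels: list[tuple[str, float]] = field(default_factory=list)

    def get_labels(self) -> list[tuple[str, float]]:
        return self.labels


def toy_predict(data_points: list[ToyDataPoint], mini_batch_size: int = 32) -> None:
    """Annotate data points in place, one mini-batch at a time; return nothing."""
    for start in range(0, len(data_points), mini_batch_size):
        batch = data_points[start : start + mini_batch_size]
        for data_point in batch:
            # Placeholder rule instead of a real model.
            value = "POSITIVE" if "!" in data_point.text else "NEUTRAL"
            data_point.labels.append((value, 1.0))


points = [ToyDataPoint("Great movie!"), ToyDataPoint("It was a film.")]
result = toy_predict(points, mini_batch_size=1)

# Predictions are read back from the data points, not from the return value.
predicted = [point.get_labels()[0][0] for point in points]
```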

get_used_tokens(corpus, context_length=0, respect_document_boundaries=True)
Return type:


classmethod load(model_path)

Loads a Flair model from the given file or state dictionary.


model_path (Union[str, Path, dict[str, Any]]) – Either the path to the model (as string or Path variable) or the already loaded state dict.

Return type:



The loaded Flair model.

class flair.nn.DeepNCMDecoder(label_dictionary, embeddings_size, use_encoder=True, encoding_dim=None, alpha=0.9, mean_update_method='online', multi_label=False)

Bases: Module

Deep Nearest Class Mean (DeepNCM) Classifier for text classification tasks.

This model combines deep learning with the Nearest Class Mean (NCM) approach. It uses document embeddings to represent text, optionally applies an encoder, and classifies based on the nearest class prototype in the embedded space.

The model supports various methods for updating class prototypes during training, making it adaptable to different learning scenarios.

This implementation is based on the research paper: Guerriero, S., Caputo, B., & Mensink, T. (2018). DeepNCM: Deep Nearest Class Mean Classifiers. In International Conference on Learning Representations (ICLR) 2018 Workshop. URL: https://openreview.net/forum?id=rkPLZ4JPM
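To make the nearest-class-mean idea concrete, here is a minimal sketch in plain Python. It assumes Euclidean distance and an exponential moving average controlled by alpha as one possible way of updating a prototype; how DeepNCMDecoder actually uses alpha and mean_update_method is not specified in this reference, so treat both choices, and the function names, as assumptions.

```python
import math


def nearest_class(embedding: list[float], prototypes: dict[str, list[float]]) -> str:
    """Return the class whose prototype is closest in Euclidean distance."""
    return min(prototypes, key=lambda name: math.dist(embedding, prototypes[name]))


def update_prototype(
    prototype: list[float], batch_mean: list[float], alpha: float = 0.9
) -> list[float]:
    """Moving-average update: keep alpha of the old prototype, blend in the new mean."""
    return [alpha * old + (1.0 - alpha) * new for old, new in zip(prototype, batch_mean)]


# Hypothetical two-dimensional class prototypes.
prototypes = {"sports": [0.0, 0.0], "politics": [4.0, 4.0]}

label = nearest_class([1.0, 0.5], prototypes)

# Move the "sports" prototype slightly towards the mean of newly seen embeddings.
prototypes["sports"] = update_prototype(prototypes["sports"], [1.0, 1.0], alpha=0.9)
```

With alpha close to 1 the prototypes change slowly, which smooths out noise from individual mini-batches.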

property num_prototypes: int

The number of class prototypes.

update_prototypes()

Apply accumulated updates to class prototypes.

Return type:


forward(embedded, label_tensor=None)

Forward pass of the decoder, which calculates the scores as prototype distances.

  • embedded (Tensor) – Embedded representations of the input sentences.

  • label_tensor (Optional[Tensor]) – True labels for the input sentences as a tensor.

Return type:

Tensor

Scores as a tensor of distances to class prototypes.

get_prototype(class_name)

Get the prototype vector for a given class name.


class_name (str) – The name of the class whose prototype vector is requested.


The prototype vector for the given class.

Return type:



Raises:

ValueError – If the class name is not found in the label dictionary.

get_closest_prototypes(input_vector, top_k=5)

Get the k closest prototype vectors to the given input vector using the configured distance metric.

  • input_vector (torch.Tensor) – The input vector to compare against prototypes.

  • top_k (int) – The number of closest prototypes to return (default is 5).


Each tuple contains (class_name, distance).

Return type:

list[tuple[str, float]]
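The shape of the return value, a list of (class_name, distance) tuples ordered from closest to farthest, can be sketched as follows. The sketch assumes Euclidean distance and stores prototypes in a plain dictionary; the configured distance metric of the real decoder may differ, and closest_prototypes is a hypothetical helper, not the library method.

```python
import heapq
import math


def closest_prototypes(
    input_vector: list[float],
    prototypes: dict[str, list[float]],
    top_k: int = 5,
) -> list[tuple[str, float]]:
    """Return up to top_k (class_name, distance) pairs, closest first."""
    distances = [
        (name, math.dist(input_vector, vector)) for name, vector in prototypes.items()
    ]
    return heapq.nsmallest(top_k, distances, key=lambda pair: pair[1])


# Hypothetical prototypes for three classes.
prototypes = {"a": [0.0, 0.0], "b": [3.0, 4.0], "c": [1.0, 0.0]}
closest = closest_prototypes([0.0, 0.0], prototypes, top_k=2)
```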

class flair.nn.DefaultClassifier(embeddings, label_dictionary, final_embedding_size, dropout=0.0, locked_dropout=0.0, word_dropout=0.0, multi_label=False, multi_label_threshold=0.5, loss_weights=None, decoder=None, inverse_model=False, train_on_gold_pairs_only=False, should_embed_sentence=True)

Bases: Classifier[DT], Generic[DT, DT2], ABC

Default base class for all Flair models that do classification.

It inherits from flair.nn.Classifier and thus from flair.nn.Model. All features shared by all classifiers are implemented here, including the loss calculation, the prediction heads for both single- and multi-label classification, and the predict() method. Example implementations of this class are TextClassifier, RelationExtractor, TextPairClassifier and TokenClassifier.
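The difference between the single-label and multi-label prediction heads can be illustrated with a small sketch. It assumes the common convention of softmax plus argmax for single-label decoding and a per-class sigmoid compared against a threshold (here 0.5, matching the default of multi_label_threshold) for multi-label decoding. The function names are hypothetical and the exact comparison used by Flair may differ.

```python
import math


def softmax(scores: list[float]) -> list[float]:
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(score - peak) for score in scores]
    total = sum(exps)
    return [value / total for value in exps]


def sigmoid(score: float) -> float:
    return 1.0 / (1.0 + math.exp(-score))


def decode_single_label(scores: list[float], labels: list[str]) -> tuple[str, float]:
    """Pick exactly one label: the one with the highest softmax probability."""
    probabilities = softmax(scores)
    best = max(range(len(labels)), key=probabilities.__getitem__)
    return labels[best], probabilities[best]


def decode_multi_label(
    scores: list[float], labels: list[str], threshold: float = 0.5
) -> list[tuple[str, float]]:
    """Pick every label whose independent sigmoid probability exceeds the threshold."""
    probabilities = [sigmoid(score) for score in scores]
    return [(label, p) for label, p in zip(labels, probabilities) if p > threshold]


labels = ["sports", "politics", "tech"]
scores = [2.0, -1.0, 0.5]  # hypothetical raw decoder outputs

single = decode_single_label(scores, labels)
multi = decode_multi_label(scores, labels, threshold=0.5)
```

Raising the threshold makes the multi-label head more conservative: fewer labels are predicted, each with higher confidence.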

property multi_label_threshold
forward_loss(sentences)

Performs a forward pass and returns a loss tensor for backpropagation.

Implement this to enable training.

Return type:

tuple[Tensor, int]

predict(sentences, mini_batch_size=32, return_probabilities_for_all_classes=False, verbose=False, label_name=None, return_loss=False, embedding_storage_mode='none')

Predicts the class labels for the given sentences. The labels are directly added to the sentences.

  • sentences (Union[list[TypeVar(DT, bound= DataPoint)], TypeVar(DT, bound= DataPoint)]) – List of sentences to predict.

  • mini_batch_size (int) – The number of sentences that will be predicted within one batch.

  • return_probabilities_for_all_classes (bool) – Return probabilities for all classes instead of only the best prediction.

  • verbose (bool) – Set to True to display a progress bar.

  • return_loss (bool) – Set to True to return the loss.

  • label_name (Optional[str]) – Set this to change the name of the label type that is predicted.

  • embedding_storage_mode (Literal['none', 'cpu', 'gpu']) – Default is ‘none’, which is best in most cases. Only set to ‘cpu’ or ‘gpu’ if you wish to not only predict, but also keep the generated embeddings in CPU or GPU memory respectively.

classmethod load(model_path)

Loads a Flair model from the given file or state dictionary.


model_path (Union[str, Path, dict[str, Any]]) – Either the path to the model (as string or Path variable) or the already loaded state dict.

Return type:



The loaded Flair model.

class flair.nn.LabelVerbalizerDecoder(label_embedding, label_dictionary)

Bases: Module

A class for decoding labels using the idea of siamese networks / bi-encoders. This can be used for all classification tasks in flair.

  • label_encoder (flair.embeddings.TokenEmbeddings) – The label encoder used to encode the labels into an embedding.

  • label_dictionary (flair.data.Dictionary) – The label dictionary containing the mapping between labels and indices.


Attributes:

  • label_encoder (flair.embeddings.TokenEmbeddings) – The label encoder used to encode the labels into an embedding.

  • label_dictionary (flair.data.Dictionary) – The label dictionary containing the mapping between labels and indices.

Methods:

forward(self, label_embeddings: torch.Tensor, context_embeddings: torch.Tensor) -> torch.Tensor: Takes the label embeddings and context embeddings as input and returns a tensor of label scores.


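from flair.embeddings import TransformerWordEmbeddings
from flair.nn import LabelVerbalizerDecoder

# Assumes `corpus` is an already loaded Flair corpus with "ner" annotations.
label_dictionary = corpus.make_label_dictionary("ner")
label_encoder = TransformerWordEmbeddings("bert-base-uncased")
label_verbalizer_decoder = LabelVerbalizerDecoder(label_encoder, label_dictionary)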
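The bi-encoder idea behind this decoder is that labels and inputs are embedded separately and then compared. A minimal sketch of such a comparison, assuming a plain dot product as the similarity function (the scoring function used by the library is not stated here), could look like this. The function label_scores and the toy vectors are hypothetical.

```python
def label_scores(
    context_embeddings: list[list[float]],
    label_embeddings: list[list[float]],
) -> list[list[float]]:
    """Score every context vector against every label vector with a dot product."""
    return [
        [sum(c * l for c, l in zip(context, label)) for label in label_embeddings]
        for context in context_embeddings
    ]


# Two hypothetical context vectors and two hypothetical label vectors.
context_embeddings = [[1.0, 0.0], [0.5, 0.5]]
label_embeddings = [[1.0, 0.0], [0.0, 1.0]]

# scores[i][j] is the similarity of context i to label j.
scores = label_scores(context_embeddings, label_embeddings)
```

Because the labels are embedded from their text, a new label can in principle be scored without adding a dedicated output unit for it.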

static verbalize_labels(label_dictionary)

Takes a label dictionary and returns a list of sentences with verbalized labels.


label_dictionary (flair.data.Dictionary) – The label dictionary to verbalize.

Return type:



A list of sentences with verbalized labels.


label_dictionary = corpus.make_label_dictionary("ner")
verbalized_labels = LabelVerbalizerDecoder.verbalize_labels(label_dictionary)
print(verbalized_labels)
[Sentence: "begin person", Sentence: "inside person", Sentence: "end person", Sentence: "single org", …]

forward(inputs)

Forward pass of the label verbalizer decoder.


inputs (torch.Tensor) – The input tensor.

Return type:



The scores of the decoder.


Raises:

RuntimeError – If an unknown decoding type is specified.

class flair.nn.LockedDropout(dropout_rate=0.5, batch_first=True, inplace=False)

Bases: Module

Implementation of locked (or variational) dropout.

Randomly drops out entire parameters in embedding space.
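Locked (variational) dropout differs from ordinary dropout in that one mask is sampled per sequence and reused at every time step. The sketch below shows this for a single sequence of embedding vectors in plain Python. The rescaling of kept values by 1 / (1 - dropout_rate) is the usual convention and is assumed here; locked_dropout is a hypothetical helper, not the module's forward method.

```python
import random
from typing import Optional


def locked_dropout(
    sequence: list[list[float]],
    dropout_rate: float = 0.5,
    rng: Optional[random.Random] = None,
) -> list[list[float]]:
    """Drop the same embedding dimensions at every time step of one sequence.

    Assumes 0 <= dropout_rate < 1.
    """
    rng = rng or random.Random()
    keep = 1.0 - dropout_rate
    # One mask entry per embedding dimension, sampled once and reused for all steps.
    mask = [(1.0 / keep if rng.random() < keep else 0.0) for _ in sequence[0]]
    return [[value * m for value, m in zip(step, mask)] for step in sequence]


# Three time steps with four embedding dimensions each.
sequence = [[1.0] * 4, [2.0] * 4, [3.0] * 4]
dropped = locked_dropout(sequence, dropout_rate=0.5, rng=random.Random(0))

# The pattern of zeroed dimensions is identical across time steps.
zero_patterns = [[value == 0.0 for value in step] for step in dropped]
```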

forward(x)

Define the computation performed at every call.

Should be overridden by all subclasses.


Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

extra_repr()

Return the extra representation of the module.

To print customized extra information, you should re-implement this method in your own modules. Both single-line and multi-line strings are acceptable.

class flair.nn.Model(*args, **kwargs)

Bases: Module, Generic[DT], ABC

Abstract base class for all downstream task models in Flair, such as SequenceTagger and TextClassifier.

Every new type of model must implement these methods.

model_card: Optional[dict[str, Any]] = None
abstract property label_type: str

Each model predicts labels of a certain type.

abstract forward_loss(data_points)

Performs a forward pass and returns a loss tensor for backpropagation.

Implement this to enable training.

Return type:

tuple[Tensor, int]

abstract evaluate(data_points, gold_label_type, out_path=None, embedding_storage_mode='none', mini_batch_size=32, main_evaluation_metric=('micro avg', 'f1-score'), exclude_labels=None, gold_label_dictionary=None, return_loss=True, **kwargs)

Evaluates the model. Returns a Result object containing evaluation results and a loss value.

Implement this to enable evaluation.

  • data_points (Union[list[TypeVar(DT, bound= DataPoint)], Dataset]) – The labeled data_points to evaluate.

  • gold_label_type (str) – The label type indicating the gold labels

  • out_path (Union[str, Path, None]) – Optional output path to store predictions.

  • embedding_storage_mode (Literal['none', 'cpu', 'gpu']) – One of ‘none’, ‘cpu’ or ‘gpu’. ‘none’ means all embeddings are deleted and freshly recomputed, ‘cpu’ means all embeddings are stored on CPU, or ‘gpu’ means all embeddings are stored on GPU

  • mini_batch_size (int) – The batch_size to use for predictions.

  • main_evaluation_metric (tuple[str, str]) – Specify which metric to highlight as main_score.

  • exclude_labels (Optional[list[str]]) – Specify classes that won’t be considered in evaluation.

  • gold_label_dictionary (Optional[Dictionary]) – Specify which classes should be considered. All other classes will be treated as <unk>.

  • return_loss (bool) – Whether to additionally compute the loss on the data points.

  • **kwargs – Arguments that will be ignored.

Return type:

Result

The evaluation results.

save(model_file, checkpoint=False)

Saves the current model to the provided file.

  • model_file (Union[str, Path]) – The model file.

  • checkpoint (bool) – This parameter is currently unused.

Return type:


classmethod load(model_path)

Loads a Flair model from the given file or state dictionary.


model_path (Union[str, Path, dict[str, Any]]) – Either the path to the model (as string or Path variable) or the already loaded state dict.

Return type:



The loaded Flair model.

print_model_card()

This method produces a log message that includes all recorded parameters the model was trained with.

The model card includes information such as the Flair, PyTorch and Transformers versions used during training, and the training parameters.

Only available for models trained with Flair >= 0.9.1.

class flair.nn.PrototypicalDecoder(num_prototypes, embeddings_size, prototype_size=None, distance_function='euclidean', use_radius=False, min_radius=0, unlabeled_distance=None, unlabeled_idx=None, learning_mode='joint', normal_distributed_initial_prototypes=False)

Bases: Module

property num_prototypes
forward(embedded)

Define the computation performed at every call.

Should be overridden by all subclasses.


Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class flair.nn.WordDropout(dropout_rate=0.05, inplace=False)

Bases: Module

Implementation of word dropout.

Randomly drops out entire words (or characters) in embedding space.
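Word dropout zeroes whole word vectors rather than individual dimensions. The following plain-Python sketch shows the idea for one sentence. It assumes that kept vectors are left unscaled, which may or may not match the module's behaviour; word_dropout is a hypothetical helper written for this illustration.

```python
import random
from typing import Optional


def word_dropout(
    sequence: list[list[float]],
    dropout_rate: float = 0.05,
    rng: Optional[random.Random] = None,
) -> list[list[float]]:
    """Replace each word vector by zeros with probability dropout_rate."""
    rng = rng or random.Random()
    return [
        list(word) if rng.random() >= dropout_rate else [0.0] * len(word)
        for word in sequence
    ]


# A hypothetical sentence of three words with two-dimensional embeddings.
sentence = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]

unchanged = word_dropout(sentence, dropout_rate=0.0)
all_dropped = word_dropout(sentence, dropout_rate=1.0)
```

Each word is either kept intact or removed entirely, which encourages the model not to rely on any single token.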

forward(x)

Define the computation performed at every call.

Should be overridden by all subclasses.


Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

extra_repr()

Return the extra representation of the module.

To print customized extra information, you should re-implement this method in your own modules. Both single-line and multi-line strings are acceptable.