flair.nn
- class flair.nn.LockedDropout(dropout_rate=0.5, batch_first=True, inplace=False)
Bases: Module
Implementation of locked (or variational) dropout.
Randomly drops out entire parameters in embedding space.
- forward(x)
Define the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- extra_repr()
Set the extra representation of the module.
To print customized extra information, you should re-implement this method in your own modules. Both single-line and multi-line strings are acceptable.
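The idea behind LockedDropout can be sketched in plain Python. This is a minimal illustration of locked (variational) dropout in general, not flair's implementation: the function name `locked_dropout`, the nested-list layout `[batch][time][features]` and the `rng` argument are assumptions made for the example. One mask is sampled per sequence and reused at every time step, and kept features are rescaled by `1 / (1 - dropout_rate)`.

```python
import random


def locked_dropout(batch, dropout_rate=0.5, rng=None):
    """Apply one dropout mask per sequence, shared by all of its time steps.

    `batch` is a nested list shaped [batch][time][features] (hypothetical layout).
    """
    rng = rng or random.Random()
    if dropout_rate <= 0.0:
        return batch
    keep = 1.0 - dropout_rate
    out = []
    for sequence in batch:
        n_features = len(sequence[0]) if sequence else 0
        # Sample the mask once per sequence; kept features are scaled by 1 / keep.
        mask = [(1.0 / keep if rng.random() < keep else 0.0) for _ in range(n_features)]
        # Reuse the same mask at every time step of this sequence.
        out.append([[v * m for v, m in zip(step, mask)] for step in sequence])
    return out
```

Because the mask is fixed along the time axis, a feature dimension that is dropped for a sequence is dropped at every position of that sequence.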
- class flair.nn.WordDropout(dropout_rate=0.05, inplace=False)
Bases: Module
Implementation of word dropout.
Randomly drops out entire words (or characters) in embedding space.
- forward(x)
Define the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- extra_repr()
Set the extra representation of the module.
To print customized extra information, you should re-implement this method in your own modules. Both single-line and multi-line strings are acceptable.
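As a rough illustration of what WordDropout describes, the sketch below zeroes whole word vectors at random. It is a plain-Python approximation of the concept rather than flair's code; the name `word_dropout`, the nested-list layout `[batch][time][features]` and the absence of rescaling are assumptions of this example.

```python
import random


def word_dropout(batch, dropout_rate=0.05, rng=None):
    """Zero out entire word vectors with probability `dropout_rate`.

    `batch` is a nested list shaped [batch][time][features] (hypothetical layout).
    """
    rng = rng or random.Random()
    if dropout_rate <= 0.0:
        return batch
    out = []
    for sequence in batch:
        new_sequence = []
        for vector in sequence:
            # Each word (time step) is either kept unchanged or zeroed as a whole.
            keep = rng.random() >= dropout_rate
            new_sequence.append(list(vector) if keep else [0.0] * len(vector))
        out.append(new_sequence)
    return out
```

In contrast to the locked variant, the decision here is made per word position, so all features of a dropped word vanish together.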
- class flair.nn.Classifier(*args, **kwargs)
Bases: Model[DT], Generic[DT], ReduceTransformerVocabMixin, ABC
Abstract base class for all Flair models that do classification.
The classifier inherits from flair.nn.Model and adds unified functionality for both single- and multi-label classification and evaluation. This ensures a fair comparison between multiple classifiers.
- evaluate(data_points, gold_label_type, out_path=None, embedding_storage_mode='none', mini_batch_size=32, main_evaluation_metric=('micro avg', 'f1-score'), exclude_labels=[], gold_label_dictionary=None, return_loss=True, **kwargs)
Evaluates the model. Returns a Result object containing evaluation results and a loss value.
Implement this to enable evaluation.
- Parameters:
data_points (Union[List[TypeVar(DT, bound=DataPoint)], Dataset]) – The labeled data_points to evaluate.
gold_label_type (str) – The label type indicating the gold labels.
out_path (Union[str, Path, None]) – Optional output path to store predictions.
embedding_storage_mode (str) – One of 'none', 'cpu' or 'gpu'. 'none' means all embeddings are deleted and freshly recomputed, 'cpu' means all embeddings are stored on CPU, and 'gpu' means all embeddings are stored on GPU.
mini_batch_size (int) – The batch size to use for predictions.
main_evaluation_metric (Tuple[str, str]) – Specifies which metric to highlight as main_score.
exclude_labels (List[str]) – Classes that won't be considered in evaluation.
gold_label_dictionary (Optional[Dictionary]) – Specifies which classes should be considered; all other classes will be taken as <unk>.
return_loss (bool) – Whether to additionally compute the loss on the data points.
**kwargs – Arguments that will be ignored.
- Return type:
Result
- Returns:
The evaluation results.
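The default main_evaluation_metric above is ('micro avg', 'f1-score'). As a hedged illustration of what a micro-averaged F1 score measures, the sketch below pools true positives, false positives and false negatives over all data points before computing precision and recall. It is not flair's evaluation code; the function name `micro_f1`, the representation of labels as Python sets, and the way `exclude_labels` is applied here are assumptions for the example.

```python
def micro_f1(gold, predicted, exclude_labels=()):
    """Compute micro-averaged precision, recall and F1.

    `gold` and `predicted` are lists of label sets, one set per data point.
    """
    excluded = set(exclude_labels)
    tp = fp = fn = 0
    for gold_labels, predicted_labels in zip(gold, predicted):
        g = set(gold_labels) - excluded
        p = set(predicted_labels) - excluded
        tp += len(g & p)  # labels that are both gold and predicted
        fp += len(p - g)  # predicted but not gold
        fn += len(g - p)  # gold but not predicted
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1
```

Because counts are pooled before averaging, frequent classes influence the micro average more than rare ones.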
- abstract predict(sentences, mini_batch_size=32, return_probabilities_for_all_classes=False, verbose=False, label_name=None, return_loss=False, embedding_storage_mode='none')
Predicts the class labels for the given sentences.
The labels are directly added to the sentences.
- Parameters:
sentences (Union[List[TypeVar(DT, bound=DataPoint)], TypeVar(DT, bound=DataPoint)]) – list of sentences
mini_batch_size (int) – mini batch size to use
return_probabilities_for_all_classes (bool) – return probabilities for all classes instead of only the best predicted class
verbose (bool) – set to True to display a progress bar
return_loss – set to True to return loss
label_name (Optional[str]) – set this to change the name of the label type that is predicted
embedding_storage_mode – default is 'none', which is always best. Only set to 'cpu' or 'gpu' if you wish to not only predict, but also keep the generated embeddings in CPU or GPU memory respectively.
- get_used_tokens(corpus, context_length=0, respect_document_boundaries=True)
- Return type:
Iterable[List[str]]
- classmethod load(model_path)
Loads the model from the given file.
- Parameters:
model_path (Union[str, Path, Dict[str, Any]]) – the model file or the already loaded state dict
- Returns:
the loaded text classifier model
- class flair.nn.DefaultClassifier(embeddings, label_dictionary, final_embedding_size, dropout=0.0, locked_dropout=0.0, word_dropout=0.0, multi_label=False, multi_label_threshold=0.5, loss_weights=None, decoder=None, inverse_model=False, train_on_gold_pairs_only=False, should_embed_sentence=True)
Bases: Classifier[DT], Generic[DT, DT2], ABC
Default base class for all Flair models that do classification.
It inherits from flair.nn.Classifier and thus from flair.nn.Model. All features shared by all classifiers are implemented here, including the loss calculation, prediction heads for both single- and multi-label classification, and the predict() method. Example implementations of this class are the TextClassifier, RelationExtractor, TextPairClassifier and TokenClassifier.
- _filter_data_point(data_point)
Specify if a data point should be kept.
This lets you remove, for example, empty texts. By default, all data points of length zero are removed. Return True if the data point should be kept and False if it should be removed.
- Return type:
bool
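The keep-or-drop contract of _filter_data_point can be illustrated with a small stand-alone sketch. The helper names `keep_data_point` and `filter_batch` are hypothetical and operate on ordinary Python sequences instead of flair data points; only the default rule described above (drop anything of length zero) is modeled.

```python
def keep_data_point(data_point) -> bool:
    """Return True if the data point should be kept, False if it should be removed."""
    # Default rule from the description above: drop data points of length zero.
    return len(data_point) > 0


def filter_batch(data_points):
    """Apply the predicate to a batch and keep only the accepted data points."""
    return [data_point for data_point in data_points if keep_data_point(data_point)]
```

A subclass with stricter requirements would return False for additional cases while keeping the same True/False meaning.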
- abstract _get_data_points_from_sentence(sentence)
Returns the data_points to which labels are added.
The results should be of any type that inherits from DataPoint (Sentence, Span, Token, … objects).
- Return type:
List[TypeVar(DT2, bound=DataPoint)]
- _get_data_points_for_batch(sentences)
Returns the data_points to which labels are added.
The results should be of any type that inherits from DataPoint (Sentence, Span, Token, … objects).
- Return type:
List[TypeVar(DT2, bound=DataPoint)]
- _get_label_of_datapoint(data_point)
Extracts the labels from the data points.
Each data point might return a list of strings, representing multiple labels.
- Return type:
List[str]
- property multi_label_threshold
- forward_loss(sentences)
Performs a forward pass and returns a loss tensor for backpropagation.
Implement this to enable training.
- Return type:
Tuple[Tensor, int]
- predict(sentences, mini_batch_size=32, return_probabilities_for_all_classes=False, verbose=False, label_name=None, return_loss=False, embedding_storage_mode='none')
Predicts the class labels for the given sentences. The labels are directly added to the sentences.
- Parameters:
sentences (Union[List[TypeVar(DT, bound=DataPoint)], TypeVar(DT, bound=DataPoint)]) – list of sentences to predict
mini_batch_size (int) – the number of sentences that will be predicted within one batch
return_probabilities_for_all_classes (bool) – return probabilities for all classes instead of only the best predicted class
verbose (bool) – set to True to display a progress bar
return_loss – set to True to return loss
label_name (Optional[str]) – set this to change the name of the label type that is predicted
embedding_storage_mode – default is 'none', which is best in most cases. Only set to 'cpu' or 'gpu' if you wish to not only predict, but also keep the generated embeddings in CPU or GPU memory respectively.
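The difference between single-label and multi-label prediction, and the role of multi_label_threshold, can be sketched as follows. This is an illustrative stand-alone function, not the logic of DefaultClassifier.predict(): the name `select_labels`, the dict-based input and the strict `>` comparison against the threshold are assumptions.

```python
def select_labels(probabilities, multi_label=False, multi_label_threshold=0.5):
    """Pick labels for one data point from a {label: probability} mapping."""
    if multi_label:
        # Multi-label: keep every label whose probability exceeds the threshold.
        return [
            (label, probability)
            for label, probability in sorted(probabilities.items())
            if probability > multi_label_threshold
        ]
    # Single-label: keep only the most probable label.
    best = max(probabilities, key=probabilities.get)
    return [(best, probabilities[best])]
```

Raising the threshold trades recall for precision in the multi-label case; it has no effect in the single-label branch of this sketch.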
- classmethod load(model_path)
Loads the model from the given file.
- Parameters:
model_path (Union[str, Path, Dict[str, Any]]) – the model file or the already loaded state dict
- Returns:
the loaded text classifier model
- class flair.nn.Model(*args, **kwargs)
Bases: Module, Generic[DT], ABC
Abstract base class for all downstream task models in Flair, such as SequenceTagger and TextClassifier.
Every new type of model must implement these methods.
- model_card: Optional[Dict[str, Any]] = None
- abstract property label_type
Each model predicts labels of a certain type.
- abstract forward_loss(data_points)
Performs a forward pass and returns a loss tensor for backpropagation.
Implement this to enable training.
- Return type:
Tuple[Tensor, int]
- abstract evaluate(data_points, gold_label_type, out_path=None, embedding_storage_mode='none', mini_batch_size=32, main_evaluation_metric=('micro avg', 'f1-score'), exclude_labels=[], gold_label_dictionary=None, return_loss=True, **kwargs)
Evaluates the model. Returns a Result object containing evaluation results and a loss value.
Implement this to enable evaluation.
- Parameters:
data_points (Union[List[TypeVar(DT, bound=DataPoint)], Dataset]) – The labeled data_points to evaluate.
gold_label_type (str) – The label type indicating the gold labels.
out_path (Union[str, Path, None]) – Optional output path to store predictions.
embedding_storage_mode (str) – One of 'none', 'cpu' or 'gpu'. 'none' means all embeddings are deleted and freshly recomputed, 'cpu' means all embeddings are stored on CPU, and 'gpu' means all embeddings are stored on GPU.
mini_batch_size (int) – The batch size to use for predictions.
main_evaluation_metric (Tuple[str, str]) – Specifies which metric to highlight as main_score.
exclude_labels (List[str]) – Classes that won't be considered in evaluation.
gold_label_dictionary (Optional[Dictionary]) – Specifies which classes should be considered; all other classes will be taken as <unk>.
return_loss (bool) – Whether to additionally compute the loss on the data points.
**kwargs – Arguments that will be ignored.
- Return type:
Result
- Returns:
The evaluation results.
- _get_state_dict()
Returns the state dictionary for this model.
- classmethod _init_model_with_state_dict(state, **kwargs)
Initialize the model from a state dictionary.
- save(model_file, checkpoint=False)
Saves the current model to the provided file.
- Parameters:
model_file (Union[str, Path]) – the model file
checkpoint (bool) – currently unused.
- classmethod load(model_path)
Loads the model from the given file.
- Parameters:
model_path (Union[str, Path, Dict[str, Any]]) – the model file or the already loaded state dict
- Returns:
the loaded text classifier model
- print_model_card()
- class flair.nn.PrototypicalDecoder(num_prototypes, embeddings_size, prototype_size=None, distance_function='euclidean', use_radius=False, min_radius=0, unlabeled_distance=None, unlabeled_idx=None, learning_mode='joint', normal_distributed_initial_prototypes=False)
Bases: Module
- property num_prototypes
- forward(embedded)
Define the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
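The PrototypicalDecoder signature lists distance_function='euclidean' as its default. As a hedged sketch of the general prototype-based decoding idea, the function below scores an embedding against each class prototype by negative Euclidean distance, so the nearest prototype gets the highest score. This is an assumption about the common prototypical-network convention, not a description of flair's forward(); the name `prototype_scores` is hypothetical, and options such as use_radius or unlabeled_distance are not modeled.

```python
import math


def prototype_scores(embedding, prototypes):
    """Score one embedding against each prototype as negative Euclidean distance."""
    # A smaller distance means a better match, so the distance is negated.
    return [-math.dist(embedding, prototype) for prototype in prototypes]


def predict_prototype(embedding, prototypes):
    """Return the index of the closest prototype."""
    scores = prototype_scores(embedding, prototypes)
    return max(range(len(scores)), key=scores.__getitem__)
```

With scores defined this way, a softmax over them yields class probabilities that decay with distance from each prototype.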
- class flair.nn.LabelVerbalizerDecoder(label_embedding, label_dictionary)
Bases: Module
A class for decoding labels using the idea of siamese networks / bi-encoders. This can be used for all classification tasks in flair.
- Parameters:
label_encoder (flair.embeddings.TokenEmbeddings) – The label encoder used to encode the labels into an embedding.
label_dictionary (flair.data.Dictionary) – The label dictionary containing the mapping between labels and indices.
- label_encoder
The label encoder used to encode the labels into an embedding.
- Type:
flair.embeddings.TokenEmbeddings
- label_dictionary
The label dictionary containing the mapping between labels and indices.
- Type:
flair.data.Dictionary
- forward(self, label_embeddings: torch.Tensor, context_embeddings: torch.Tensor) -> torch.Tensor
Takes the label embeddings and context embeddings as input and returns a tensor of label scores.
Examples
label_dictionary = corpus.make_label_dictionary("ner")
label_encoder = TransformerWordEmbeddings('bert-base-uncased')
label_verbalizer_decoder = LabelVerbalizerDecoder(label_encoder, label_dictionary)
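The bi-encoder idea mentioned above (scoring context embeddings against label embeddings) can be sketched without any deep-learning library. The example assumes that similarity is a plain dot product and that both inputs are lists of equal-length vectors; the function name `label_scores` is hypothetical and the actual scoring used by LabelVerbalizerDecoder may differ.

```python
def label_scores(context_embeddings, label_embeddings):
    """Return a [num_contexts][num_labels] matrix of dot-product scores."""
    return [
        [sum(c * l for c, l in zip(context, label)) for label in label_embeddings]
        for context in context_embeddings
    ]


def best_labels(context_embeddings, label_embeddings):
    """Return, for each context vector, the index of the highest-scoring label."""
    return [
        max(range(len(row)), key=row.__getitem__)
        for row in label_scores(context_embeddings, label_embeddings)
    ]
```

Since labels are embedded from their verbalized text rather than stored as fixed output units, this style of decoder can in principle score labels whose embeddings are computed on the fly.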
- static verbalize_labels(label_dictionary)
Takes a label dictionary and returns a list of sentences with verbalized labels.
- Parameters:
label_dictionary (flair.data.Dictionary) – The label dictionary to verbalize.
- Return type:
List[Sentence]
- Returns:
A list of sentences with verbalized labels.
Examples
label_dictionary = corpus.make_label_dictionary("ner")
verbalized_labels = LabelVerbalizerDecoder.verbalize_labels(label_dictionary)
print(verbalized_labels)
[Sentence: "begin person", Sentence: "inside person", Sentence: "end person", Sentence: "single org", …]
- forward(inputs)
Forward pass of the label verbalizer decoder.
- Parameters:
inputs (torch.Tensor) – The input tensor.
- Return type:
Tensor
- Returns:
The scores of the decoder.
- Raises:
RuntimeError – If an unknown decoding type is specified.