flair.data

BoundingBox

Represents a bounding box with left, top, right, and bottom coordinates.

ConcatFlairDataset

Concatenates multiple datasets, adding a multitask_id label to each sentence.

Corpus

The main container for holding train, dev, and test datasets for a task.

DT

Type variable.

DT2

Type variable.

DT3

Type variable.

DataPair

Represents a pair of DataPoints, often used for sentence-pair tasks.

DataPoint

Abstract base class for all data points in Flair (e.g., Token, Sentence, Image).

DataTriple

Represents a triplet of DataPoints.

Dictionary

Holds a mapping from strings to unique integer IDs.
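
To make the string-to-ID idea behind Dictionary concrete, here is a minimal plain-Python sketch. It is not Flair's implementation: the class name `TinyDictionary`, its attributes, and the `<unk>` fallback behaviour are assumptions chosen for illustration only.

```python
from typing import Dict, List


class TinyDictionary:
    """Hypothetical stand-in that maps strings to consecutive integer IDs."""

    def __init__(self, add_unk: bool = True) -> None:
        self.item2idx: Dict[str, int] = {}
        self.idx2item: List[str] = []
        if add_unk:
            # Assumption: reserve ID 0 for unknown items.
            self.add_item("<unk>")

    def add_item(self, item: str) -> int:
        """Add an item if it is new and return its ID."""
        if item not in self.item2idx:
            self.item2idx[item] = len(self.idx2item)
            self.idx2item.append(item)
        return self.item2idx[item]

    def get_idx_for_item(self, item: str) -> int:
        """Return the item's ID, falling back to the <unk> ID if present."""
        if item in self.item2idx:
            return self.item2idx[item]
        if "<unk>" in self.item2idx:
            return self.item2idx["<unk>"]
        raise KeyError(item)


labels = TinyDictionary()
for tag in ["PER", "LOC", "PER"]:
    labels.add_item(tag)
print(labels.item2idx)
```

Adding the same string twice returns the existing ID, so each ID stays unique per string.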

EntityCandidate

Represents a potential candidate entity from a knowledge base for entity linking.

FlairDataset

Abstract base class for Flair datasets, adding an in-memory check.

Image

Represents an image as a data point, holding image data or a URL.

Label

Represents a label assigned to a DataPoint (e.g., Token, Span, Sentence).

MultiCorpus

A Corpus composed of multiple individual Corpus objects, often for multi-task learning.

Relation

Represents a directed relationship between two Spans in the same Sentence.

Sentence

A central data structure representing a sentence or text passage as Tokens.

Span

Represents a contiguous sequence of Tokens within a Sentence.

T_co

Type variable.

TextPair

Type alias for a DataPair consisting of two Sentences.

TextTriple

Type alias for a DataTriple consisting of three Sentences.

Token

Represents a single token (word, punctuation) within a Sentence.

_PartOfSentence

Abstract base for data points within a Sentence (Token, Span, Relation).

_iter_dataset

Iterates over a Dataset yielding single data points.

_len_dataset

Calculates the length (number of data points) in a Dataset.

get_spans_from_bio

Decodes a sequence of BIOES/BIO tags into labeled spans with scores.
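
The decoding step that get_spans_from_bio describes can be sketched in plain Python. This is an illustrative approximation, not the library's algorithm: the function name `decode_bio` is hypothetical, it returns `(start, end_exclusive, label)` tuples, and it ignores scores entirely.

```python
from typing import List, Optional, Tuple


def decode_bio(tags: List[str]) -> List[Tuple[int, int, str]]:
    """Return (start, end_exclusive, label) spans from BIO or BIOES tags."""
    spans: List[Tuple[int, int, str]] = []
    start: Optional[int] = None
    label: Optional[str] = None
    for i, tag in enumerate(tags):
        prefix, _, value = tag.partition("-")
        # B/S always open a span; I/E open one only if the label changes
        # (a lenient choice for malformed sequences).
        starts_new = prefix in ("B", "S") or (
            prefix in ("I", "E") and value != label
        )
        if prefix == "O" or starts_new:
            # Close any span that is still open before this token.
            if start is not None:
                spans.append((start, i, label))
            start, label = (i, value) if starts_new else (None, None)
        # E and S mark the last token of a span.
        if prefix in ("E", "S") and start is not None:
            spans.append((start, i + 1, label))
            start, label = None, None
    # Close a span that runs to the end of the sequence.
    if start is not None:
        spans.append((start, len(tags), label))
    return spans


print(decode_bio(["B-PER", "I-PER", "O", "S-LOC", "B-ORG", "E-ORG"]))
# [(0, 2, 'PER'), (3, 4, 'LOC'), (4, 6, 'ORG')]
```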

randomly_split_into_two_datasets

Shuffles and splits a dataset into two Subsets.
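
The shuffle-and-split idea behind randomly_split_into_two_datasets can be illustrated on index lists alone. The sketch below is an assumption-laden stand-in, not the library function: `split_indices`, its parameters, and the seeding scheme are hypothetical, and it returns plain index lists rather than Subset objects.

```python
import random
from typing import List, Tuple


def split_indices(
    n: int, length_of_first: int, seed: int = 0
) -> Tuple[List[int], List[int]]:
    """Shuffle the indices 0..n-1 and cut them into two disjoint lists."""
    indices = list(range(n))
    # A private Random instance keeps the split reproducible without
    # touching the global random state.
    random.Random(seed).shuffle(indices)
    first = sorted(indices[:length_of_first])
    second = sorted(indices[length_of_first:])
    return first, second


first, second = split_indices(10, 3, seed=42)
print(len(first), len(second))
# 3 7
```

With PyTorch available, each index list could be wrapped as `torch.utils.data.Subset(dataset, indices)` to obtain two views over the same underlying dataset.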