flair.data.DataPair

class flair.data.DataPair(first, second)

Bases: DataPoint, Generic[DT, DT2]

Represents a pair of DataPoints, often used for sentence-pair tasks.

__init__(first, second)

Initializes a DataPair.

Parameters:
  • first (DT) – The first data point.

  • second (DT2) – The second data point.
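
The two type parameters let the first and second elements have different types. The following is a minimal sketch of that pattern only: `PairSketch` is a hypothetical stand-in written for illustration, not flair's `DataPair`, and plain strings stand in for the data points.

```python
from dataclasses import dataclass
from typing import Generic, TypeVar

# Type variables named after the ones in the class signature above.
DT = TypeVar("DT")
DT2 = TypeVar("DT2")


@dataclass
class PairSketch(Generic[DT, DT2]):
    """Hypothetical stand-in for illustration; not flair's DataPair."""

    first: DT
    second: DT2


# A sentence-pair style example; plain strings are an assumption made here
# so the sketch runs without any third-party dependency.
pair: PairSketch[str, str] = PairSketch(
    "A man is playing a guitar.",
    "Someone is making music.",
)

print(pair.first)
print(pair.second)
```

In real use, both arguments would be flair data points rather than strings.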

Methods

__init__(first, second)

Initializes a DataPair.

add_label(typename, value[, score])

Adds a new label to a specific annotation layer.

add_metadata(key, value)

Adds a key-value pair to the data point's metadata.

clear_embeddings([embedding_names])

Removes stored embeddings to free memory.

get_each_embedding([embedding_names])

Retrieves a list of individual embedding tensors.

get_embedding([names])

Retrieves embeddings, concatenating if multiple names are given or if names is None.

get_label([label_type, zero_tag_value])

Retrieves the primary label for a given type, or a default 'O' label.

get_labels([typename])

Retrieves all labels for a specific annotation layer.

get_metadata(key)

Retrieves metadata associated with the given key.

has_label(typename)

Checks if the data point has at least one label for the given annotation type.

has_metadata(key)

Checks if the data point has metadata for the given key.

remove_labels(typename)

Removes all labels associated with a specific annotation layer.

set_embedding(name, vector)

Stores an embedding tensor under a given name.

set_label(typename, value[, score])

Sets the label(s) for an annotation layer, overwriting any existing ones.

to(device[, pin_memory])

Moves all stored embedding tensors to the specified device.

Attributes

embedding

Provides the primary embedding representation of the data point.

end_position

The ending character offset (exclusive) within the original text.

labels

Returns a list of all labels from all annotation layers.

score

Shortcut property for the score of the first label added.

start_position

The starting character offset within the original text.

tag

Shortcut property for the value of the first label added.

text

The textual representation of this data point.

unlabeled_identifier

A string identifier for the data point itself, without label info.

to(device, pin_memory=False)

Moves all stored embedding tensors to the specified device.

Parameters:
  • device (Union[str, torch.device]) – Target device (e.g., 'cpu', 'cuda:0').

  • pin_memory (bool, optional) – If True and moving to CUDA, attempts to pin memory. Defaults to False.

clear_embeddings(embedding_names=None)

Removes stored embeddings to free memory.

Parameters:

embedding_names (Optional[list[str]], optional) – Specific names to remove. If None, removes all embeddings. Defaults to None.
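
The difference between passing a list of names and passing None can be sketched with a small name-to-vector store. This is an illustrative stand-in under stated assumptions, not flair's implementation: `EmbeddingStoreSketch` is a hypothetical class, and plain lists of floats replace the tensors a real data point would hold.

```python
from typing import Optional


class EmbeddingStoreSketch:
    """Hypothetical stand-in; plain float lists replace tensors here."""

    def __init__(self) -> None:
        self._embeddings: dict[str, list[float]] = {}

    def set_embedding(self, name: str, vector: list[float]) -> None:
        # Store the vector under the given name, replacing any previous one.
        self._embeddings[name] = vector

    def clear_embeddings(self, embedding_names: Optional[list[str]] = None) -> None:
        if embedding_names is None:
            # No names given: drop every stored embedding.
            self._embeddings.clear()
            return
        for name in embedding_names:
            # Unknown names are ignored rather than raising (an assumption).
            self._embeddings.pop(name, None)


store = EmbeddingStoreSketch()
store.set_embedding("glove", [0.1, 0.2])
store.set_embedding("bert", [0.3, 0.4, 0.5])

# Selective removal keeps everything that was not named.
store.clear_embeddings(["glove"])
remaining_after_selective = sorted(store._embeddings)

# Calling with no argument removes whatever is left.
store.clear_embeddings()
remaining_after_full = sorted(store._embeddings)

print(remaining_after_selective, remaining_after_full)
```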

property embedding

Provides the primary embedding representation of the data point.

property unlabeled_identifier

A string identifier for the data point itself, without label info.

property start_position: int

The starting character offset within the original text.

property end_position: int

The ending character offset (exclusive) within the original text.
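
An exclusive end offset follows the same convention as Python slicing, so a start/end pair can index the original text directly. The snippet below is a small illustration on a plain string; the text and offset values are made up for the example and do not come from flair.

```python
# Hypothetical original text and offsets, chosen for illustration only.
original_text = "Berlin is a city."
start_position = 0
end_position = 6  # exclusive: points one past the last character of the span

# Because the end offset is exclusive, the pair works as a slice as-is.
span_text = original_text[start_position:end_position]

# The span length is simply the difference of the two offsets.
span_length = end_position - start_position

print(span_text, span_length)
```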

property text

The textual representation of this data point.