flair.data.DataPoint
- class flair.data.DataPoint
Bases: ABC
Abstract base class for all data points in Flair (e.g., Token, Sentence, Image).
Defines core functionality: holding embeddings, managing labels across annotation layers, and exposing basic positional and textual information.
- __init__()
Initializes a DataPoint with empty annotation/embedding/metadata storage.
Methods
__init__() – Initializes a DataPoint with empty annotation/embedding/metadata storage.
add_label(typename, value[, score]) – Adds a new label to a specific annotation layer.
add_metadata(key, value) – Adds a key-value pair to the data point's metadata.
clear_embeddings([embedding_names]) – Removes stored embeddings to free memory.
get_each_embedding([embedding_names]) – Retrieves a list of individual embedding tensors.
get_embedding([names]) – Retrieves embeddings, concatenating if multiple names are given or if names is None.
get_label([label_type, zero_tag_value]) – Retrieves the primary label for a given type, or a default 'O' label.
get_labels([typename]) – Retrieves all labels for a specific annotation layer.
get_metadata(key) – Retrieves metadata associated with the given key.
has_label(typename) – Checks if the data point has at least one label for the given annotation type.
has_metadata(key) – Checks if the data point has metadata for the given key.
remove_labels(typename) – Removes all labels associated with a specific annotation layer.
set_embedding(name, vector) – Stores an embedding tensor under a given name.
set_label(typename, value[, score]) – Sets the label(s) for an annotation layer, overwriting any existing ones.
to(device[, pin_memory]) – Moves all stored embedding tensors to the specified device.
Attributes
embedding – Provides the primary embedding representation of the data point.
end_position – The ending character offset (exclusive) within the original text.
labels – Returns a list of all labels from all annotation layers.
score – Shortcut property for the score of the first label added.
start_position – The starting character offset within the original text.
tag – Shortcut property for the value of the first label added.
text – The textual representation of this data point.
unlabeled_identifier – A string identifier for the data point itself, without label info.
- abstract property embedding: Tensor
Provides the primary embedding representation of the data point.
- set_embedding(name, vector)
Stores an embedding tensor under a given name.
- Parameters:
name (str) – The name to identify this embedding (e.g., “word”, “flair”).
vector (torch.Tensor) – The embedding tensor.
- get_embedding(names=None)
Retrieves embeddings, concatenating if multiple names are given or if names is None.
- Parameters:
names (Optional[list[str]], optional) – Specific embedding names to retrieve. If None, concatenates all stored embeddings sorted by name. Defaults to None.
- Returns:
A single tensor representing the requested embedding(s), or an empty tensor if no relevant embeddings are found.
- Return type:
torch.Tensor
- get_each_embedding(embedding_names=None)
Retrieves a list of individual embedding tensors.
- Parameters:
embedding_names (Optional[list[str]], optional) – If provided, filters by these names. Otherwise, returns all stored embeddings. Defaults to None.
- Returns:
List of embedding tensors, sorted by name.
- Return type:
list[torch.Tensor]
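The name-based storage and sorted concatenation described above can be illustrated with a small sketch. This is not Flair's implementation: `ToyDataPoint` is a hypothetical stand-in, and plain Python lists take the place of `torch.Tensor` so the example runs without Flair or PyTorch installed.

```python
from typing import Optional


class ToyDataPoint:
    """Hypothetical stand-in for a DataPoint; not Flair's implementation."""

    def __init__(self) -> None:
        # Embeddings are stored by name; plain lists stand in for torch.Tensor.
        self._embeddings: dict[str, list[float]] = {}

    def set_embedding(self, name: str, vector: list[float]) -> None:
        self._embeddings[name] = vector

    def get_each_embedding(
        self, embedding_names: Optional[list[str]] = None
    ) -> list[list[float]]:
        # Walk the stored names in sorted order, optionally filtering by name.
        return [
            self._embeddings[name]
            for name in sorted(self._embeddings)
            if embedding_names is None or name in embedding_names
        ]

    def get_embedding(self, names: Optional[list[str]] = None) -> list[float]:
        # Concatenate the selected vectors; an empty list models the empty tensor.
        return [x for vec in self.get_each_embedding(names) for x in vec]


point = ToyDataPoint()
point.set_embedding("word", [1.0, 2.0])
point.set_embedding("flair", [3.0])

each = point.get_each_embedding()            # one vector per name, sorted by name
all_embeddings = point.get_embedding()       # "flair" sorts before "word"
word_only = point.get_embedding(["word"])    # a single named embedding
missing = point.get_embedding(["unknown"])   # nothing matches: empty result
```

Because concatenation follows the sorted names rather than insertion order, the layout of the combined vector stays stable no matter which embedding was stored first.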
- to(device, pin_memory=False)
Moves all stored embedding tensors to the specified device.
- Parameters:
device (Union[str, torch.device]) – Target device (e.g., ‘cpu’, ‘cuda:0’).
pin_memory (bool, optional) – If True and moving to CUDA, attempts to pin memory. Defaults to False.
- Return type:
None
- clear_embeddings(embedding_names=None)
Removes stored embeddings to free memory.
- Parameters:
embedding_names (Optional[list[str]], optional) – Specific names to remove. If None, removes all embeddings. Defaults to None.
- Return type:
None
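The difference between selective and full clearing can be sketched on a plain dictionary. The free function `clear_embeddings` below is a hypothetical helper that mirrors the documented semantics, not Flair's method. Silently ignoring unknown names is an assumption of this sketch.

```python
from typing import Optional


def clear_embeddings(
    store: dict[str, list[float]], embedding_names: Optional[list[str]] = None
) -> None:
    """Hypothetical helper mirroring the documented semantics on a plain dict."""
    if embedding_names is None:
        store.clear()  # no names given: drop every stored embedding
        return
    for name in embedding_names:
        store.pop(name, None)  # assumption: names that are not stored are ignored


store = {"word": [1.0, 2.0], "flair": [3.0], "bert": [4.0]}

clear_embeddings(store, ["flair"])       # remove one named embedding
remaining_after_selective = sorted(store)

clear_embeddings(store)                  # remove everything that is left
remaining_after_full = sorted(store)
```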
- has_label(typename)
Checks if the data point has at least one label for the given annotation type.
- Return type:
bool
- add_metadata(key, value)
Adds a key-value pair to the data point’s metadata.
- Return type:
None
- get_metadata(key)
Retrieves metadata associated with the given key.
- Parameters:
key (str) – The metadata key.
- Returns:
The metadata value.
- Return type:
Any
- Raises:
KeyError – If the key is not found.
- has_metadata(key)
Checks if the data point has metadata for the given key.
- Return type:
bool
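Since get_metadata raises KeyError for an unknown key, callers typically guard the lookup with has_metadata. The sketch below models the three metadata methods with a hypothetical `ToyMetadata` class backed by a dictionary. It illustrates the documented contract only and is not Flair's code.

```python
from typing import Any


class ToyMetadata:
    """Hypothetical stand-in for the metadata part of a DataPoint."""

    def __init__(self) -> None:
        self._metadata: dict[str, Any] = {}

    def add_metadata(self, key: str, value: Any) -> None:
        self._metadata[key] = value

    def get_metadata(self, key: str) -> Any:
        return self._metadata[key]  # raises KeyError for unknown keys

    def has_metadata(self, key: str) -> bool:
        return key in self._metadata


point = ToyMetadata()
point.add_metadata("source", "newswire")

source = point.get_metadata("source")

# Guard the lookup with has_metadata to avoid the KeyError.
if point.has_metadata("language"):
    language = point.get_metadata("language")
else:
    language = "unknown"
```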
- add_label(typename, value, score=1.0, **metadata)
Adds a new label to a specific annotation layer.
- Parameters:
typename (str) – Name of the annotation layer (e.g., “ner”, “sentiment”).
value (str) – String value of the label (e.g., “PERSON”, “POSITIVE”).
score (float, optional) – Confidence score (0.0-1.0). Defaults to 1.0.
**metadata – Additional keyword arguments stored as metadata on the Label.
- Returns:
Returns self for chaining.
- Return type:
DataPoint
- set_label(typename, value, score=1.0, **metadata)
Sets the label(s) for an annotation layer, overwriting any existing ones.
- Parameters:
typename (str) – The name of the annotation layer.
value (str) – The string value of the new label.
score (float, optional) – Confidence score (0.0-1.0). Defaults to 1.0.
**metadata – Additional keyword arguments for the new Label’s metadata.
- Returns:
Returns self for chaining.
- Return type:
DataPoint
- remove_labels(typename)
Removes all labels associated with a specific annotation layer.
- Parameters:
typename (str) – The name of the annotation layer to clear.
- Return type:
None
- get_label(label_type=None, zero_tag_value='O')
Retrieves the primary label for a given type, or a default label with value zero_tag_value (‘O’ unless overridden) if none exists.
- Parameters:
label_type (Optional[str], optional) – The annotation layer name. Defaults to None (uses first overall label).
zero_tag_value (str, optional) – Value for the default label if none found. Defaults to “O”.
- Returns:
The primary label, or a default label with score 0.0.
- Return type:
Label
- get_labels(typename=None)
Retrieves all labels for a specific annotation layer.
- Parameters:
typename (Optional[str], optional) – The layer name. If None, returns all labels from all layers. Defaults to None.
- Returns:
List of Label objects, or empty list if none found.
- Return type:
list[Label]
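The label methods differ mainly in whether they append to or replace a layer's labels. The sketch below models those semantics with hypothetical `ToyLabel` and `ToyLabeled` classes. It omits the `**metadata` keyword arguments and is an illustration of the documented behavior, not Flair's implementation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ToyLabel:
    """Hypothetical stand-in for a Label: a value plus a confidence score."""
    value: str
    score: float = 1.0


class ToyLabeled:
    """Hypothetical stand-in for the label handling of a DataPoint."""

    def __init__(self) -> None:
        # One list of labels per annotation layer, kept in insertion order.
        self._layers: dict[str, list[ToyLabel]] = {}

    def add_label(self, typename: str, value: str, score: float = 1.0) -> "ToyLabeled":
        self._layers.setdefault(typename, []).append(ToyLabel(value, score))
        return self  # returning self allows chaining

    def set_label(self, typename: str, value: str, score: float = 1.0) -> "ToyLabeled":
        self._layers[typename] = [ToyLabel(value, score)]  # overwrite the layer
        return self

    def remove_labels(self, typename: str) -> None:
        self._layers.pop(typename, None)

    def has_label(self, typename: str) -> bool:
        return bool(self._layers.get(typename))

    def get_labels(self, typename: Optional[str] = None) -> list[ToyLabel]:
        if typename is None:
            return [lab for labels in self._layers.values() for lab in labels]
        return list(self._layers.get(typename, []))

    def get_label(self, label_type: Optional[str] = None, zero_tag_value: str = "O") -> ToyLabel:
        labels = self.get_labels(label_type)
        # Fall back to a default label with score 0.0 when nothing is found.
        return labels[0] if labels else ToyLabel(zero_tag_value, 0.0)


point = ToyLabeled()
point.add_label("ner", "PERSON", 0.9).add_label("ner", "ORG", 0.4)
ner_values = [lab.value for lab in point.get_labels("ner")]

point.set_label("ner", "LOC")            # replaces both earlier labels
after_set = [lab.value for lab in point.get_labels("ner")]

point.remove_labels("ner")
fallback = point.get_label("ner")        # default label, since the layer is empty
```

Use add_label when a layer may legitimately carry several labels (for example, multi-label classification) and set_label when a new prediction should replace earlier ones.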
- abstract property unlabeled_identifier: str
A string identifier for the data point itself, without label info.
- abstract property start_position: int
The starting character offset within the original text.
- abstract property end_position: int
The ending character offset (exclusive) within the original text.
- abstract property text: str
The textual representation of this data point.
- property tag: str
Shortcut property for the value of the first label added.
- property score: float
Shortcut property for the score of the first label added.
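As a rough illustration of the two shortcut properties, the sketch below reads the value and score of the first label in a list. `ToyLabel` and `ToyPoint` are hypothetical stand-ins. How the real properties behave on a data point without labels is not stated above, so that case is left out.

```python
from dataclasses import dataclass, field


@dataclass
class ToyLabel:
    """Hypothetical stand-in for a Label."""
    value: str
    score: float = 1.0


@dataclass
class ToyPoint:
    """Hypothetical stand-in exposing tag/score shortcuts over its labels."""
    labels: list[ToyLabel] = field(default_factory=list)

    @property
    def tag(self) -> str:
        return self.labels[0].value  # value of the first label added

    @property
    def score(self) -> float:
        return self.labels[0].score  # score of the first label added


point = ToyPoint()
point.labels.append(ToyLabel("POSITIVE", 0.87))
point.labels.append(ToyLabel("NEUTRAL", 0.10))

first_tag = point.tag
first_score = point.score
```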