flair.data.Token
- class flair.data.Token(text, head_id=None, whitespace_after=1, start_position=0, sentence=None)
Bases:
_PartOfSentence
Represents a single token (word, punctuation) within a Sentence.
- form
The textual content of the token.
- Type:
str
- idx
The 1-based index within the sentence (-1 if not attached).
- Type:
int
- head_id
1-based index of the dependency head.
- Type:
Optional[int]
- whitespace_after
Number of spaces following this token.
- Type:
int
- start_position
Character offset where this token begins.
- Type:
int
- __init__(text, head_id=None, whitespace_after=1, start_position=0, sentence=None)
Initializes a Token.
- Parameters:
text (str) – The token text.
head_id (Optional[int], optional) – 1-based index of dependency head. Defaults to None.
whitespace_after (int, optional) – Spaces after token. Defaults to 1.
start_position (int, optional) – Character start offset. Defaults to 0.
sentence (Optional[Sentence], optional) – Parent sentence. Defaults to None.
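The way `start_position`, `whitespace_after`, and the exclusive end offset fit together can be sketched without importing flair. The `TokenSketch` dataclass and the `tokenize_with_offsets` helper below are hypothetical stand-ins that only mirror the documented fields; they are not flair's implementation.

```python
import re
from dataclasses import dataclass


@dataclass
class TokenSketch:
    """Hypothetical stand-in mirroring the documented Token fields."""

    text: str
    start_position: int = 0
    whitespace_after: int = 1

    @property
    def end_position(self) -> int:
        # Exclusive end offset: start plus the length of the token text.
        return self.start_position + len(self.text)


def tokenize_with_offsets(raw: str) -> list[TokenSketch]:
    """Split on whitespace, recording offsets and trailing space counts."""
    tokens = []
    for match in re.finditer(r"\S+", raw):
        rest = raw[match.end():]
        # Count the spaces that directly follow this token.
        trailing = len(rest) - len(rest.lstrip(" "))
        tokens.append(TokenSketch(match.group(), match.start(), trailing))
    return tokens


raw = "The grass  is green"
tokens = tokenize_with_offsets(raw)

for tok in tokens:
    # Slicing the raw text with the offsets recovers the token text.
    assert raw[tok.start_position:tok.end_position] == tok.text

# Token text plus its trailing spaces rebuilds the original string.
rebuilt = "".join(t.text + " " * t.whitespace_after for t in tokens)
```

Storing the space count rather than a yes/no flag is what lets the double space after "grass" survive the round trip in this sketch.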
Methods
__init__(text[, head_id, whitespace_after, ...]) – Initializes a Token.
add_label(typename, value[, score]) – Adds a label, propagating it to the parent Sentence's layer.
add_metadata(key, value) – Adds a key-value pair to the data point's metadata.
add_tags_proba_dist(tag_type, tags) – Stores a list of Labels representing a probability distribution for a tag type.
clear_embeddings([embedding_names]) – Removes stored embeddings to free memory.
get_each_embedding([embedding_names]) – Retrieves a list of individual embedding tensors.
get_embedding([names]) – Retrieves embeddings, concatenating if multiple names are given or if names is None.
get_head() – Returns the head Token in the dependency parse, if available.
get_label([label_type, zero_tag_value]) – Retrieves the primary label for a given type, or a default 'O' label.
get_labels([typename]) – Retrieves all labels for a specific annotation layer.
get_metadata(key) – Retrieves metadata associated with the given key.
get_tags_proba_dist(tag_type) – Retrieves the stored probability distribution for a given tag type.
has_label(typename) – Checks if the data point has at least one label for the given annotation type.
has_metadata(key) – Checks if the data point has metadata for the given key.
remove_labels(typename) – Removes labels of a type, also removing them from the parent Sentence layer.
set_embedding(name, vector) – Stores an embedding tensor under a given name.
set_label(typename, value[, score]) – Sets a label (overwriting), propagating the change to the parent Sentence.
to(device[, pin_memory]) – Moves all stored embedding tensors to the specified device.
to_dict([tag_type])
Attributes
embedding – Returns the concatenated embeddings stored for this token.
end_position – Character offset where the token ends (exclusive).
idx – The 1-based index within the sentence (-1 if not attached).
labels – Returns a list of all labels from all annotation layers.
score – Shortcut property for the score of the first label added.
start_position – Character offset where the token begins in the Sentence text.
tag – Shortcut property for the value of the first label added.
text – The text content of the token.
unlabeled_identifier – String identifier: 'Token[<idx>]: "<text>"'.
- property idx: int
The 1-based index within the sentence (-1 if not attached).
- property text: str
The text content of the token.
- property unlabeled_identifier: str
String identifier: 'Token[<idx>]: "<text>"'.
- add_tags_proba_dist(tag_type, tags)
Stores a list of Labels representing a probability distribution for a tag type.
- Parameters:
tag_type (str) – The annotation layer name (e.g., “pos”).
tags (list[Label]) – List of Labels, each with a tag value and probability score.
- Return type:
None
- get_tags_proba_dist(tag_type)
Retrieves the stored probability distribution for a given tag type.
- Parameters:
tag_type (str) – The annotation layer name.
- Returns:
List of Labels representing the distribution, or an empty list if none is stored.
- Return type:
list[Label]
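The store-and-retrieve contract of the two distribution methods can be illustrated with a small sketch. `LabelSketch` and `TokenSketch` below are hypothetical stand-ins written for this example, not flair's classes; only the method names and the empty-list fallback are taken from the descriptions above.

```python
from dataclasses import dataclass, field


@dataclass
class LabelSketch:
    """Hypothetical stand-in for a label: a tag value plus a probability."""

    value: str
    score: float


@dataclass
class TokenSketch:
    """Hypothetical stand-in that keeps one distribution per tag type."""

    text: str
    _tags_proba_dist: dict[str, list[LabelSketch]] = field(default_factory=dict)

    def add_tags_proba_dist(self, tag_type: str, tags: list[LabelSketch]) -> None:
        # Keep the whole list under the annotation layer name.
        self._tags_proba_dist[tag_type] = tags

    def get_tags_proba_dist(self, tag_type: str) -> list[LabelSketch]:
        # Documented behaviour: an empty list when nothing is stored.
        return self._tags_proba_dist.get(tag_type, [])


token = TokenSketch("run")
token.add_tags_proba_dist("pos", [LabelSketch("VERB", 0.7), LabelSketch("NOUN", 0.3)])

dist = token.get_tags_proba_dist("pos")
# The most probable tag is the label with the highest score.
best = max(dist, key=lambda label: label.score)
# A layer that was never stored yields an empty list, not an error.
missing = token.get_tags_proba_dist("ner")
```

Because the fallback is an empty list, callers in this sketch can iterate over the result without a separate existence check.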
- get_head()
Returns the head Token in the dependency parse, if available.
- Return type:
Optional[Token]
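Since `head_id` is 1-based while Python lists are 0-based, resolving a head involves an off-by-one shift. The sketch below uses a hypothetical `TokenSketch` dataclass and a free function `get_head`; it is not flair's implementation. Treating `head_id == 0` as the root follows the CoNLL-U convention and is an assumption of this example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TokenSketch:
    """Hypothetical stand-in carrying only the fields needed here."""

    text: str
    head_id: Optional[int] = None  # 1-based index of the head; None if unset


def get_head(tokens: list[TokenSketch], token: TokenSketch) -> Optional[TokenSketch]:
    """Resolve a 1-based head_id against a 0-based list of tokens."""
    if token.head_id is None or token.head_id < 1:
        # No head recorded, or (by assumption) 0 marks the root.
        return None
    return tokens[token.head_id - 1]


# "She reads books": the verb is the root, both nouns attach to it.
tokens = [
    TokenSketch("She", head_id=2),
    TokenSketch("reads", head_id=0),
    TokenSketch("books", head_id=2),
]

head_of_she = get_head(tokens, tokens[0])
head_of_root = get_head(tokens, tokens[1])
```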
- property start_position: int
Character offset where the token begins in the Sentence text.
- property end_position: int
Character offset where the token ends (exclusive).
- property embedding: Tensor
Returns the concatenated embeddings stored for this token.
- add_label(typename, value, score=1.0, **metadata)
Adds a label, propagating it to the parent Sentence’s layer.
- set_label(typename, value, score=1.0, **metadata)
Sets a label (overwriting), propagating the change to the parent Sentence.
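The difference between adding and setting a label can be sketched as append versus overwrite on a per-layer list. `LabeledSketch` below is a hypothetical stand-in, not flair's class; it leaves out the propagation to the parent Sentence and the `**metadata` arguments.

```python
from collections import defaultdict


class LabeledSketch:
    """Hypothetical stand-in for a data point with annotation layers."""

    def __init__(self) -> None:
        # Maps a layer name such as "ner" to its list of (value, score) pairs.
        self.annotation_layers = defaultdict(list)

    def add_label(self, typename: str, value: str, score: float = 1.0) -> None:
        # Append: earlier labels in the same layer are kept.
        self.annotation_layers[typename].append((value, score))

    def set_label(self, typename: str, value: str, score: float = 1.0) -> None:
        # Overwrite: the layer ends up holding only this label.
        self.annotation_layers[typename] = [(value, score)]


point = LabeledSketch()
point.add_label("ner", "ORG", 0.6)
point.add_label("ner", "LOC", 0.4)
after_add = list(point.annotation_layers["ner"])

point.set_label("ner", "PER")
after_set = list(point.annotation_layers["ner"])
```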
- to_dict(tag_type=None)
- Return type:
dict[str, Any]