flair.tokenization.Tokenizer
- class flair.tokenization.Tokenizer
Bases: ABC
An abstract class representing a Tokenizer. Tokenizers are used to represent algorithms and models to split plain text into individual tokens / words. All subclasses should override tokenize(), which splits the given plain text into tokens. Moreover, subclasses may override the name property, returning a unique identifier representing the tokenizer’s configuration.
- __init__()
Methods
__init__()
from_dict(config): Instantiates the tokenizer from a configuration dictionary.
to_dict(): Serializes the tokenizer's configuration to a dictionary.
tokenize(text)
Attributes
- abstract tokenize(text)
- Return type: list[str]
- property name: str
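As a rough illustration of the tokenize() and name members described above, the sketch below defines a small subclass. So that it runs without flair installed, it uses a local stand-in base class that only mirrors the documented signatures; the stand-in, its default name value, and ToyWhitespaceTokenizer with its lowercase option are all hypothetical and not part of flair. The stand-in also leaves out to_dict() and from_dict(), which the real class lists as abstract and a real subclass would need to provide.

```python
from __future__ import annotations

from abc import ABC, abstractmethod


class Tokenizer(ABC):
    """Local stand-in mirroring the documented interface (NOT flair's class)."""

    @abstractmethod
    def tokenize(self, text: str) -> list[str]:
        """Split plain text into a list of token strings."""

    @property
    def name(self) -> str:
        # Assumption: fall back to the class name. flair's default may differ.
        return self.__class__.__name__


class ToyWhitespaceTokenizer(Tokenizer):
    """Hypothetical subclass that splits on runs of whitespace."""

    def __init__(self, lowercase: bool = False) -> None:
        self.lowercase = lowercase

    def tokenize(self, text: str) -> list[str]:
        # str.split() with no arguments collapses consecutive whitespace.
        tokens = text.split()
        return [t.lower() for t in tokens] if self.lowercase else tokens

    @property
    def name(self) -> str:
        # Include the configuration so differently configured instances
        # report different identifiers.
        return f"{self.__class__.__name__}(lowercase={self.lowercase})"


tokenizer = ToyWhitespaceTokenizer(lowercase=True)
print(tokenizer.tokenize("Hello  World !"))
print(tokenizer.name)
```

Encoding the configuration in name is one possible way to make the identifier unique per configuration; other schemes would satisfy the description equally well.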
- abstract to_dict()
Serializes the tokenizer’s configuration to a dictionary.
- Return type: dict[str, Any]
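The to_dict() method and its counterpart from_dict(config), both listed for this class, suggest a round trip: serialize the configuration, then rebuild an equivalent tokenizer from it. The sketch below shows one way that could look. It again relies on a local stand-in base class rather than flair itself, and ToyDelimiterTokenizer, its delimiter parameter, and the "delimiter" dictionary key are hypothetical choices made for this example.

```python
from __future__ import annotations

from abc import ABC, abstractmethod
from typing import Any


class Tokenizer(ABC):
    """Local stand-in mirroring the documented interface (NOT flair's class)."""

    @abstractmethod
    def tokenize(self, text: str) -> list[str]:
        """Split plain text into a list of token strings."""

    @abstractmethod
    def to_dict(self) -> dict[str, Any]:
        """Serialize the tokenizer's configuration to a dictionary."""

    @classmethod
    @abstractmethod
    def from_dict(cls, config: dict[str, Any]):
        """Instantiate the tokenizer from a configuration dictionary."""


class ToyDelimiterTokenizer(Tokenizer):
    """Hypothetical subclass that splits text on a configurable delimiter."""

    def __init__(self, delimiter: str = ",") -> None:
        self.delimiter = delimiter

    def tokenize(self, text: str) -> list[str]:
        # Strip surrounding whitespace and drop empty pieces.
        parts = (part.strip() for part in text.split(self.delimiter))
        return [part for part in parts if part]

    def to_dict(self) -> dict[str, Any]:
        # Only plain, serializable configuration values go in here.
        return {"delimiter": self.delimiter}

    @classmethod
    def from_dict(cls, config: dict[str, Any]):
        # Rebuild an instance from the dictionary produced by to_dict().
        return cls(delimiter=config["delimiter"])


original = ToyDelimiterTokenizer(delimiter=";")
config = original.to_dict()
restored = ToyDelimiterTokenizer.from_dict(config)
print(config)
print(restored.tokenize("a; b ;c"))
```

Keeping to_dict() limited to constructor arguments makes from_dict() a thin wrapper around the constructor, which is a design choice of this sketch rather than something the documentation prescribes.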
- abstract classmethod from_dict(config)
Instantiates the tokenizer from a configuration dictionary.
- Return type: