flair.tokenization.SciSpacyTokenizer
- class flair.tokenization.SciSpacyTokenizer
Bases: Tokenizer
Tokenizer that uses the en_core_sci_sm spaCy model and some special heuristics.
Implementation of Tokenizer which uses the en_core_sci_sm spaCy model, extended by special heuristics that treat characters such as “(”, “)”, and “-” as additional token separators. These heuristics distinguish this implementation from SpacyTokenizer.
Note: if you want to use the “normal” SciSpacy tokenization, just use SpacyTokenizer.
- __init__()
Methods
__init__()
tokenize(text)
Attributes
- tokenize(text)
- Return type: list[str]
- property name: str
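The separator heuristic described for SciSpacyTokenizer can be illustrated with a small standard-library sketch. This is not Flair's implementation: a plain whitespace split stands in for the en_core_sci_sm model, and the names `EXTRA_SEPARATORS`, `split_on_extra_separators`, and `sketch_tokenize` are hypothetical. It also assumes that each separator character is kept as its own token, which the description above does not state explicitly.

```python
import re

# Hypothetical set of extra separators, taken from the examples in the
# class description: "(", ")" and "-".
EXTRA_SEPARATORS = "()-"

# The capturing group makes re.split keep the separators in its output.
_SEPARATOR_PATTERN = re.compile("([" + re.escape(EXTRA_SEPARATORS) + "])")


def split_on_extra_separators(tokens):
    """Split each token further so that every extra separator is its own token."""
    result = []
    for token in tokens:
        parts = _SEPARATOR_PATTERN.split(token)
        # re.split yields empty strings next to leading/trailing separators.
        result.extend(part for part in parts if part)
    return result


def sketch_tokenize(text):
    """Return a list[str] of tokens, mirroring the documented return type.

    Assumption: whitespace splitting replaces the spaCy model used by the
    real tokenizer, so results will differ from Flair on real text.
    """
    return split_on_extra_separators(text.split())


if __name__ == "__main__":
    print(sketch_tokenize("IL-2 (interleukin-2) binds"))
    # ['IL', '-', '2', '(', 'interleukin', '-', '2', ')', 'binds']
```

A tokenizer without the extra separators would leave "IL-2" and "(interleukin-2)" as single whitespace-delimited chunks here, which is the kind of difference from SpacyTokenizer that the description points to.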