flair.tokenization.JapaneseTokenizer

class flair.tokenization.JapaneseTokenizer(tokenizer, sudachi_mode='A')

Bases: Tokenizer

Tokenizer using konoha, a third-party library that supports multiple Japanese tokenizers such as MeCab, Janome, and SudachiPy.

For further details see:

himkt/konoha
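A minimal usage sketch, assuming flair, konoha, and the Janome backend are all installed. The backend string "janome", the helper name tokenize_japanese, and the sample sentence are assumptions made for illustration and are not taken from this page. The dependency check lets the snippet return None instead of failing when a package is missing.

```python
import importlib.util

# Packages this sketch assumes; "janome" is an assumed backend, not confirmed by this page.
REQUIRED = ("flair", "konoha", "janome")


def tokenize_japanese(text):
    """Tokenize text with JapaneseTokenizer, or return None if a dependency is missing."""
    if any(importlib.util.find_spec(name) is None for name in REQUIRED):
        return None
    from flair.tokenization import JapaneseTokenizer

    # sudachi_mode keeps its documented default of 'A' here.
    tokenizer = JapaneseTokenizer("janome")
    return tokenizer.tokenize(text)  # documented return type: list[str]


tokens = tokenize_japanese("私はベルリンが好きです。")
print(tokens)
```

The exact token boundaries depend on the chosen backend and its dictionary, so no expected output is shown.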

__init__(tokenizer, sudachi_mode='A')

Methods

__init__(tokenizer[, sudachi_mode])

from_dict(config)

Instantiate the tokenizer from a configuration dictionary.

to_dict()

Serialize the tokenizer's configuration to a dictionary.

tokenize(text)

Attributes

name

tokenize(text)

Return type:

list[str]

property name: str

to_dict()

Serialize the tokenizer’s configuration to a dictionary.

Return type:

dict

classmethod from_dict(config)

Instantiate the tokenizer from a configuration dictionary.

Return type:

JapaneseTokenizer
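The to_dict() and from_dict() pair can be sketched as a round trip: serialize a tokenizer's configuration, then rebuild an instance from it. This is an illustration only. The helper name round_trip and the "janome" backend string are assumptions, and the keys of the returned dictionary are not documented on this page, so the sketch does not rely on them.

```python
import importlib.util

# Packages this sketch assumes; "janome" is an assumed backend, not confirmed by this page.
REQUIRED = ("flair", "konoha", "janome")


def round_trip():
    """Serialize a JapaneseTokenizer and rebuild it, or return None if that is not possible."""
    if any(importlib.util.find_spec(name) is None for name in REQUIRED):
        return None
    from flair.tokenization import JapaneseTokenizer

    # Installed versions that predate these methods are skipped.
    if not hasattr(JapaneseTokenizer, "to_dict"):
        return None

    original = JapaneseTokenizer("janome")
    config = original.to_dict()                     # documented return type: dict
    restored = JapaneseTokenizer.from_dict(config)  # documented return type: JapaneseTokenizer
    return original.name, restored.name, config


result = round_trip()
if result is not None:
    original_name, restored_name, config = result
    print(original_name, restored_name)
    print(config)
```

If the round trip preserves the configuration, the two printed names would be expected to match, though that is not stated on this page.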