flair.datasets.base.StringDataset#

class flair.datasets.base.StringDataset(texts, use_tokenizer=<flair.tokenization.SpaceTokenizer object>)View on GitHub#

Bases: FlairDataset

A Dataset taking string as input and returning Sentence during iteration.

__init__(texts, use_tokenizer=<flair.tokenization.SpaceTokenizer object>)View on GitHub#

Instantiate StringDataset.

Parameters:
  • texts (Union[str, list[str]]) – a string or List of string that make up StringDataset

  • use_tokenizer (Union[bool, Tokenizer]) – Custom tokenizer to use. If instead of providing a function, this parameter is just set to True, flair.tokenization.SegTokTokenizer will be used.

Methods

__init__(texts[, use_tokenizer])

Instantiate StringDataset.

is_in_memory()

abstract is_in_memory()View on GitHub#
Return type:

bool