flair.datasets.relation_extraction

flair.datasets.relation_extraction.convert_ptb_token(token)

Convert a Penn Treebank (PTB) token to its normal, plain-text form.

Return type:

str
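
Penn Treebank tokenization escapes brackets as tokens such as `-LRB-` and `-RRB-`. The following is a minimal sketch of the kind of mapping such a conversion performs. The helper name `ptb_to_plain` and the table `PTB_BRACKETS` are assumptions made for illustration and are not taken from the flair source.

```python
# Illustrative sketch only: `ptb_to_plain` and `PTB_BRACKETS` are hypothetical
# names, not part of flair.
PTB_BRACKETS = {
    "-LRB-": "(", "-RRB-": ")",   # round brackets
    "-LSB-": "[", "-RSB-": "]",   # square brackets
    "-LCB-": "{", "-RCB-": "}",   # curly brackets
}


def ptb_to_plain(token: str) -> str:
    """Map a PTB bracket escape to its literal character; pass other tokens through."""
    return PTB_BRACKETS.get(token.upper(), token)


tokens = ["He", "said", "-LRB-", "quietly", "-RRB-", "."]
print([ptb_to_plain(t) for t in tokens])
```

Tokens that are not bracket escapes are returned unchanged, so the helper can be applied to every token of a sentence.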

class flair.datasets.relation_extraction.RE_ENGLISH_SEMEVAL2010(base_path=None, in_memory=True, augment_train=False, **corpusargs)

Bases: ColumnCorpus

__init__(base_path=None, in_memory=True, augment_train=False, **corpusargs)

SemEval-2010 Task 8 on Multi-Way Classification of Semantic Relations Between Pairs of Nominals.

See https://aclanthology.org/S10-1006.pdf.
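
The raw SemEval-2010 Task 8 files mark the two nominals of each sentence with inline `<e1>`/`<e2>` tags. As a rough sketch of what a conversion step has to do with such input, the snippet below strips the tags from a pre-tokenized sentence and records the token spans of both nominals. The function name `parse_tagged_sentence` and the example sentence are assumptions for illustration; this is not flair's conversion code.

```python
import re

# Illustrative sketch only: `parse_tagged_sentence` is a hypothetical helper.
TAG = re.compile(r"<(/?)(e[12])>")


def parse_tagged_sentence(raw: str):
    """Return (tokens, e1_span, e2_span); spans are half-open token index pairs."""
    # Pad every tag with spaces so whitespace splitting isolates it.
    padded = re.sub(r"(</?e[12]>)", r" \1 ", raw)
    tokens, open_at, spans = [], {}, {}
    for piece in padded.split():
        match = TAG.fullmatch(piece)
        if match is None:
            tokens.append(piece)                  # ordinary token
        elif match.group(1) == "":
            open_at[match.group(2)] = len(tokens)  # opening tag: remember start
        else:
            name = match.group(2)                  # closing tag: span is complete
            spans[name] = (open_at[name], len(tokens))
    return tokens, spans["e1"], spans["e2"]


tokens, e1, e2 = parse_tagged_sentence("The <e1>engine</e1> of the <e2>car</e2> failed .")
print(tokens, e1, e2)
```

The sketch splits on whitespace only, so it assumes the input is already tokenized; a real converter would also need a tokenizer and the relation label that accompanies each sentence.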

extract_and_convert_to_conllu(data_file, data_folder, augment_train)

class flair.datasets.relation_extraction.RE_ENGLISH_TACRED(base_path=None, in_memory=True, **corpusargs)

Bases: ColumnCorpus

__init__(base_path=None, in_memory=True, **corpusargs)

TAC Relation Extraction Dataset (TACRED), with 41 relations, from https://nlp.stanford.edu/projects/tacred/. Manual download is required for this dataset.

extract_and_convert_to_conllu(data_file, data_folder)

class flair.datasets.relation_extraction.RE_ENGLISH_CONLL04(base_path=None, in_memory=True, **corpusargs)

Bases: ColumnCorpus

convert_to_conllu(source_data_folder, data_folder)

class flair.datasets.relation_extraction.RE_ENGLISH_DRUGPROT(base_path=None, in_memory=True, sentence_splitter=<flair.splitter.SegtokSentenceSplitter object>, **corpusargs)

Bases: ColumnCorpus

__init__(base_path=None, in_memory=True, sentence_splitter=<flair.splitter.SegtokSentenceSplitter object>, **corpusargs)

Initialize the DrugProt corpus.

BioCreative VII Track 1 on drug and chemical-protein interactions, from https://zenodo.org/record/5119892#.YSdSaVuxU5k/.

extract_and_convert_to_conllu(data_file, data_folder)

char_spans_to_token_spans(char_spans, token_offsets)
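
The page gives no description for this method, so the following is only a sketch of one common way to map character spans onto token spans, using binary search over token offsets. It assumes half-open `(start, end)` character offsets and tokens sorted by position; the name `char_spans_to_token_spans_sketch` and the example data are hypothetical.

```python
from bisect import bisect_left, bisect_right


# Illustrative sketch only; assumes half-open offsets and sorted,
# non-overlapping tokens. Not taken from the flair source.
def char_spans_to_token_spans_sketch(char_spans, token_offsets):
    """Map half-open character spans to half-open token index spans."""
    token_starts = [start for start, _ in token_offsets]
    token_ends = [end for _, end in token_offsets]
    token_spans = []
    for char_start, char_end in char_spans:
        # Index of the first token whose end lies strictly after the span start.
        first = bisect_right(token_ends, char_start)
        # One past the last token whose start lies before the span end.
        last = bisect_left(token_starts, char_end)
        token_spans.append((first, last))
    return token_spans


# "Aspirin inhibits COX1" -> three tokens with their character offsets.
token_offsets = [(0, 7), (8, 16), (17, 21)]
entity_spans = [(0, 7), (17, 21)]
print(char_spans_to_token_spans_sketch(entity_spans, token_offsets))
```

With half-open spans, an entity that covers a single token maps to `(i, i + 1)`, and an entity whose characters start or end inside a token still includes that token.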

has_overlap(a, b)
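
No description is given for this method either. Assuming `a` and `b` are half-open `(start, end)` spans, an overlap test can be sketched as below. The name `spans_overlap` and the handling of `None` are assumptions for illustration, not flair's implementation.

```python
# Illustrative sketch only: `spans_overlap` is a hypothetical helper that
# treats spans as half-open (start, end) pairs.
def spans_overlap(a, b):
    """Return True if the two spans share at least one position."""
    if a is None or b is None:
        return False
    # The intersection is [max of starts, min of ends); it is non-empty
    # exactly when the smaller end lies beyond the larger start.
    return min(a[1], b[1]) > max(a[0], b[0])


print(spans_overlap((0, 5), (3, 8)))   # spans share positions 3 and 4
print(spans_overlap((0, 5), (5, 8)))   # spans only touch at the boundary
```

Under the half-open convention, spans that merely touch at a boundary do not count as overlapping.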

drugprot_document_to_tokenlists(pmid, title_sentences, abstract_sentences, abstract_offset, entities, relations)

Return type:

list[TokenList]