flair.datasets.relation_extraction
- flair.datasets.relation_extraction.convert_ptb_token(token)
Convert PTB tokens to normal tokens.
- Return type:
str
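As a rough illustration of what such a conversion involves: Penn Treebank tokenization escapes brackets as placeholders such as -LRB- and -RRB-. The sketch below is a standalone stand-in, not the library's code; the helper name `ptb_to_plain` and the exact mapping table are assumptions made for illustration.

```python
# Hypothetical stand-in for a PTB-to-plain token converter. The table covers
# the standard PTB bracket escapes; it is an assumption, not a copy of the
# mapping used by convert_ptb_token.
PTB_BRACKETS = {
    "-LRB-": "(",
    "-RRB-": ")",
    "-LSB-": "[",
    "-RSB-": "]",
    "-LCB-": "{",
    "-RCB-": "}",
}


def ptb_to_plain(token: str) -> str:
    """Return the plain form of a PTB-escaped token, or the token unchanged."""
    return PTB_BRACKETS.get(token.upper(), token)


tokens = ["-LRB-", "see", "Table", "1", "-RRB-"]
plain = [ptb_to_plain(t) for t in tokens]
print(plain)  # ['(', 'see', 'Table', '1', ')']
```

Tokens that are not in the table pass through untouched, so the helper can be applied to every token of a sentence.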
- class flair.datasets.relation_extraction.RE_ENGLISH_SEMEVAL2010(base_path=None, in_memory=True, augment_train=False, **corpusargs)
Bases:
ColumnCorpus
- __init__(base_path=None, in_memory=True, augment_train=False, **corpusargs)
SemEval-2010 Task 8 on Multi-Way Classification of Semantic Relations Between Pairs of Nominals.
- extract_and_convert_to_conllu(data_file, data_folder, augment_train)
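A minimal usage sketch for the class above, assuming flair is installed and that the class is importable from `flair.datasets`. The wrapper name `load_semeval_corpus` is hypothetical, and the assumption that the data is fetched on first use is not confirmed by this page.

```python
def load_semeval_corpus(augment_train: bool = False):
    """Build the SemEval-2010 Task 8 corpus and report the size of each split."""
    # Imported lazily so that defining this helper does not require flair.
    from flair.datasets import RE_ENGLISH_SEMEVAL2010

    # base_path=None and in_memory=True are the defaults shown in the signature.
    corpus = RE_ENGLISH_SEMEVAL2010(augment_train=augment_train)

    # A split may be absent, so guard against None before taking its length.
    sizes = {
        name: (len(split) if split is not None else 0)
        for name, split in (
            ("train", corpus.train),
            ("dev", corpus.dev),
            ("test", corpus.test),
        )
    }
    return corpus, sizes
```

Calling `corpus, sizes = load_semeval_corpus()` returns the corpus object together with a dictionary of split sizes. Passing `augment_train=True` forwards the flag from the signature unchanged.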
- class flair.datasets.relation_extraction.RE_ENGLISH_TACRED(base_path=None, in_memory=True, **corpusargs)
Bases:
ColumnCorpus
- __init__(base_path=None, in_memory=True, **corpusargs)
TAC Relation Extraction Dataset (TACRED).
The dataset has 41 relations and is available from https://nlp.stanford.edu/projects/tacred/. Manual download is required for this dataset.
- extract_and_convert_to_conllu(data_file, data_folder)
- class flair.datasets.relation_extraction.RE_ENGLISH_CONLL04(base_path=None, in_memory=True, **corpusargs)
Bases:
ColumnCorpus
- convert_to_conllu(source_data_folder, data_folder)
- class flair.datasets.relation_extraction.RE_ENGLISH_DRUGPROT(base_path=None, in_memory=True, sentence_splitter=<flair.splitter.SegtokSentenceSplitter object>, **corpusargs)
Bases:
ColumnCorpus
- __init__(base_path=None, in_memory=True, sentence_splitter=<flair.splitter.SegtokSentenceSplitter object>, **corpusargs)
Initialize the DrugProt corpus.
BioCreative VII Track 1 corpus on drug and chemical-protein interactions, from https://zenodo.org/record/5119892#.YSdSaVuxU5k/.
- extract_and_convert_to_conllu(data_file, data_folder)
- char_spans_to_token_spans(char_spans, token_offsets)
- has_overlap(a, b)
- drugprot_document_to_tokenlists(pmid, title_sentences, abstract_sentences, abstract_offset, entities, relations)
- Return type:
list[TokenList]
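The DrugProt helpers `char_spans_to_token_spans` and `has_overlap` are listed above without descriptions. The sketch below shows one plausible way to do this kind of span bookkeeping; it is written from the method names alone, so the function names `spans_overlap` and `char_to_token_spans`, the half-open span convention, and the bisect-based lookup are all assumptions rather than the library's documented behavior.

```python
import bisect
from typing import List, Optional, Tuple

Span = Tuple[int, int]  # half-open (start, end) offsets; an assumed convention


def spans_overlap(a: Optional[Span], b: Optional[Span]) -> bool:
    """Return True if two half-open spans share at least one position."""
    if a is None or b is None:
        return False
    return max(a[0], b[0]) < min(a[1], b[1])


def char_to_token_spans(char_spans: List[Span], token_offsets: List[Span]) -> List[Span]:
    """Map character spans to half-open token-index spans.

    Assumes token_offsets is sorted by position and non-overlapping.
    """
    starts = [start for start, _ in token_offsets]
    ends = [end for _, end in token_offsets]
    result = []
    for char_start, char_end in char_spans:
        # Index of the first token that ends after the span starts.
        first = bisect.bisect_right(ends, char_start)
        # Index of the first token that starts at or after the span ends.
        last = bisect.bisect_left(starts, char_end)
        result.append((first, last))
    return result


# Character offsets of the tokens in "Aspirin inhibits COX-1 ."
token_offsets = [(0, 7), (8, 16), (17, 22), (23, 24)]
entity_char_spans = [(0, 7), (17, 22)]  # "Aspirin" and "COX-1"

token_spans = char_to_token_spans(entity_char_spans, token_offsets)
print(token_spans)                      # [(0, 1), (2, 3)]
print(spans_overlap((0, 7), (5, 10)))   # True
print(spans_overlap((0, 7), (7, 10)))   # False
```

With half-open spans, two entities that merely touch at a boundary do not count as overlapping, which keeps adjacent mentions distinct.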