flair.datasets.biomedical

ANAT_EM

Corpus for anatomical named entity mention recognition.

AZDZ

Arizona Disease Corpus from the Biomedical Informatics Lab at Arizona State University.

BC2GM

Original BioCreative-II-GM corpus containing gene annotations.

BIGBIO_NER_CORPUS

This class implements an adapter to data sets implemented in the BigBio framework.

BIOBERT_CHEMICAL_BC4CHEMD

BC4CHEMD corpus with chemical annotations as used in the evaluation of BioBERT.

BIOBERT_CHEMICAL_BC5CDR

BC5CDR corpus with chemical annotations as used in the evaluation of BioBERT.

BIOBERT_DISEASE_BC5CDR

BC5CDR corpus with disease annotations as used in the evaluation of BioBERT.

BIOBERT_DISEASE_NCBI

NCBI disease corpus as used in the evaluation of BioBERT.

BIOBERT_GENE_BC2GM

BC2GM corpus with gene annotations as used in the evaluation of BioBERT.

BIOBERT_GENE_JNLPBA

JNLPBA corpus with gene annotations as used in the evaluation of BioBERT.

BIOBERT_SPECIES_LINNAEUS

Linnaeus corpus with species annotations as used in the evaluation of BioBERT.

BIOBERT_SPECIES_S800

S800 corpus with species annotations as used in the evaluation of BioBERT.

BIONLP2013_CG

Corpus of the BioNLP'2013 Cancer Genetics shared task.

BIONLP2013_PC

Corpus of the BioNLP'2013 Pathway Curation shared task.

BIOSEMANTICS

Original Biosemantics corpus.

BIO_INFER

Original BioInfer corpus.

BioBertHelper

Helper class to convert the corpora used by BioBERT, together with their respective train, dev and test splits.

BioNLPCorpus

Base class for corpora from BioNLP event extraction shared tasks.

CDR

CDR corpus as provided by JHnlp/BioCreative-V-CDR-Corpus.

CELL_FINDER

Original CellFinder corpus containing cell line, species and gene annotations.

CELL_LINE_TAG

String constant: the label used for cell line annotations.

CEMP

Original CEMP corpus containing chemical annotations.

CHEBI

Original CHEBI corpus containing all annotations.

CHEMDNER

Original corpus of the CHEMDNER shared task.

CHEMICAL_TAG

String constant: the label used for chemical annotations.

CLL

Original CLL corpus containing cell line annotations.

CRAFT

Original CRAFT corpus (version 2.0) containing all but the coreference and sections/typography annotations.

CRAFT_V4

Version 4.0.1 of the CRAFT corpus containing all but the co-reference and structural annotations.

CoNLLWriter

Utility class for writing InternalBioNerDataset to CoNLL files.
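Writing a corpus to CoNLL format amounts to tokenizing each text and turning character-offset annotations into per-token tags. The following is a minimal sketch of that idea, assuming whitespace tokenization, non-overlapping spans and BIO tags. The function `to_conll` is hypothetical and is not the interface of `CoNLLWriter`.

```python
import re
from typing import List, Tuple


def to_conll(text: str, spans: List[Tuple[int, int, str]]) -> str:
    """Convert (start, end, type) character spans into "token tag" lines.

    Assumes whitespace tokenization and non-overlapping spans.
    """
    lines = []
    for match in re.finditer(r"\S+", text):
        tag = "O"
        for start, end, label in spans:
            if start <= match.start() < end:
                # A token opens the entity if it begins exactly at the span start
                prefix = "B" if match.start() == start else "I"
                tag = f"{prefix}-{label}"
                break
        lines.append(f"{match.group()} {tag}")
    return "\n".join(lines)


conll = to_conll("Mutations cause breast cancer", [(16, 29, "Disease")])
print(conll)
```

A real writer would also need a proper tokenizer and sentence splitter, since biomedical text rarely tokenizes cleanly on whitespace.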

DECA

Original DECA corpus containing gene annotations.

DISEASE_TAG

String constant: the label used for disease annotations.

DpEntry

Entity

Internal class to represent entities while converting biomedical NER corpora to a standardized format.
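As a rough illustration of such a standardized entity representation, the sketch below models an entity as a character span plus a type label. The class `SpanEntity` and its methods are hypothetical stand-ins and do not reproduce the actual `Entity` class.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SpanEntity:
    """Hypothetical entity: a half-open character span plus a type label."""

    start: int  # inclusive character offset
    end: int  # exclusive character offset
    type: str  # e.g. "Gene" or "Disease"

    def overlaps(self, other: "SpanEntity") -> bool:
        # Two half-open spans overlap if each starts before the other ends
        return self.start < other.end and other.start < self.end

    def contains(self, other: "SpanEntity") -> bool:
        # True if `other` lies completely inside this span
        return self.start <= other.start and other.end <= self.end


text = "BRCA1 mutations increase breast cancer risk."
gene = SpanEntity(0, 5, "Gene")
disease = SpanEntity(25, 38, "Disease")
print(text[gene.start:gene.end], "|", text[disease.start:disease.end])
```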

FSU

Original FSU corpus containing protein and derived annotations.

GELLUS

Original Gellus corpus containing cell line annotations.

GENE_TAG

String constant: the label used for gene annotations.

GPRO

Original GPRO corpus containing gene annotations.

HUNER_ALL_BIOID

HUNER_ALL_BIONLP2013_CG

HUNER_ALL_BIONLP_ST_2011_ID

HUNER_ALL_BIONLP_ST_2013_PC

HUNER_ALL_BIORED

HUNER_ALL_CDR

HUNER version of the CDR corpus containing disease and chemical annotations.

HUNER_ALL_CELL_FINDER

HUNER version of the CellFinder corpus containing cell line, gene and species annotations.

HUNER_ALL_CHEBI

HUNER version of the CHEBI corpus containing chemical, gene and species annotations.

HUNER_ALL_CPI

HUNER_ALL_CRAFT_V4

HUNER version of the CRAFT corpus containing chemical, gene and species annotations.

HUNER_ALL_DRUGPROT

HUNER_ALL_JNLPBA

HUNER version of the JNLPBA corpus containing gene and cell line annotations.

HUNER_ALL_LOCTEXT

HUNER version of the Loctext corpus containing species and protein annotations.

HUNER_ALL_MIRNA

HUNER version of the miRNA corpus containing gene, species and disease annotations.

HUNER_ALL_SCAI

HUNER version of the SCAI chemicals corpus containing chemical and disease annotations.

HUNER_ALL_VARIOME

HUNER version of the Variome corpus containing gene, disease and species annotations.

HUNER_BIONLP2013_CG

HUNER_CELL_LINE

Union of all HUNER cell line data sets.

HUNER_CELL_LINE_BIOID

HUNER_CELL_LINE_BIORED

HUNER_CELL_LINE_CELL_FINDER

HUNER version of the CellFinder corpus containing only cell line annotations.

HUNER_CELL_LINE_CLL

HUNER version of the CLL corpus containing cell line annotations.

HUNER_CELL_LINE_GELLUS

HUNER version of the Gellus corpus containing cell line annotations.

HUNER_CELL_LINE_JNLPBA

HUNER version of the JNLPBA corpus containing cell line annotations.

HUNER_CHEBI

HUNER version of the CHEBI corpus.

HUNER_CHEMICAL

Union of all HUNER chemical data sets.

HUNER_CHEMICAL_BIOID

HUNER_CHEMICAL_BIONLP2013_CG

HUNER_CHEMICAL_BIONLP_ST_2011_ID

HUNER_CHEMICAL_BIONLP_ST_2013_PC

HUNER_CHEMICAL_BIORED

HUNER_CHEMICAL_CDR

HUNER version of the CDR corpus containing chemical annotations.

HUNER_CHEMICAL_CEMP

HUNER version of the CEMP corpus containing chemical annotations.

HUNER_CHEMICAL_CHEBI

HUNER version of the CHEBI corpus containing chemical annotations.

HUNER_CHEMICAL_CHEMDNER

HUNER version of the CHEMDNER corpus containing chemical annotations.

HUNER_CHEMICAL_CPI

HUNER_CHEMICAL_CRAFT_V4

HUNER version of the CRAFT corpus containing (only) chemical annotations.

HUNER_CHEMICAL_DRUGPROT

HUNER_CHEMICAL_NLM_CHEM

HUNER_CHEMICAL_SCAI

HUNER version of the SCAI chemicals corpus containing chemical annotations.

HUNER_CRAFT_V4

HUNER version of the CRAFT corpus.

HUNER_DISEASE

Union of all HUNER disease data sets.

HUNER_DISEASE_BIONLP2013_CG

HUNER_DISEASE_BIORED

HUNER_DISEASE_CDR

HUNER version of the CDR corpus containing disease annotations.

HUNER_DISEASE_MIRNA

HUNER version of the miRNA corpus containing disease annotations.

HUNER_DISEASE_NCBI

HUNER version of the NCBI corpus containing disease annotations.

HUNER_DISEASE_PDR

HUNER version of the PDR corpus containing disease annotations.

HUNER_DISEASE_SCAI

HUNER version of the SCAI chemicals corpus containing disease annotations.

HUNER_DISEASE_VARIOME

HUNER version of the Variome corpus containing disease annotations.

HUNER_GENE

Union of all HUNER gene data sets.

HUNER_GENE_BC2GM

HUNER version of the BioCreative-II-GM corpus containing gene annotations.

HUNER_GENE_BIOID

HUNER_GENE_BIONLP2013_CG

HUNER_GENE_BIONLP_ST_2011_EPI

HUNER_GENE_BIONLP_ST_2011_GE

HUNER_GENE_BIONLP_ST_2011_ID

HUNER_GENE_BIONLP_ST_2011_REL

HUNER_GENE_BIONLP_ST_2013_GE

HUNER_GENE_BIONLP_ST_2013_PC

HUNER_GENE_BIORED

HUNER_GENE_BIO_INFER

HUNER version of the BioInfer corpus containing only gene/protein annotations.

HUNER_GENE_CELL_FINDER

HUNER version of the CellFinder corpus containing only gene annotations.

HUNER_GENE_CHEBI

HUNER version of the CHEBI corpus containing gene annotations.

HUNER_GENE_CPI

HUNER_GENE_CRAFT_V4

HUNER version of the CRAFT corpus containing (only) gene annotations.

HUNER_GENE_DECA

HUNER version of the DECA corpus containing gene annotations.

HUNER_GENE_DRUGPROT

HUNER_GENE_FSU

HUNER version of the FSU corpus containing (only) gene annotations.

HUNER_GENE_GNORMPLUS

HUNER_GENE_GPRO

HUNER version of the GPRO corpus containing gene annotations.

HUNER_GENE_IEPA

HUNER version of the IEPA corpus containing gene annotations.

HUNER_GENE_JNLPBA

HUNER version of the JNLPBA corpus containing gene annotations.

HUNER_GENE_LOCTEXT

HUNER version of the Loctext corpus containing protein annotations.

HUNER_GENE_MIRNA

HUNER version of the miRNA corpus containing protein / gene annotations.

HUNER_GENE_NLM_GENE

HUNER_GENE_OSIRIS

HUNER version of the OSIRIS corpus containing (only) gene annotations.

HUNER_GENE_PROGENE

HUNER_GENE_SETH_CORPUS

HUNER_GENE_TMVAR_V3

HUNER_GENE_VARIOME

HUNER version of the Variome corpus containing gene annotations.

HUNER_JNLPBA

HUNER version of the JNLPBA corpus.

HUNER_LOCTEXT

HUNER version of the Loctext corpus.

HUNER_MIRNA

HUNER version of the miRNA corpus.

HUNER_SPECIES

Union of all HUNER species data sets.

HUNER_SPECIES_BIOID

HUNER_SPECIES_BIONLP2013_CG

HUNER_SPECIES_BIONLP_ST_2011_ID

HUNER_SPECIES_BIONLP_ST_2019_BB

HUNER_SPECIES_BIORED

HUNER_SPECIES_CELL_FINDER

HUNER version of the CellFinder corpus containing only species annotations.

HUNER_SPECIES_CHEBI

HUNER version of the CHEBI corpus containing species annotations.

HUNER_SPECIES_CRAFT_V4

HUNER version of the CRAFT corpus containing (only) species annotations.

HUNER_SPECIES_LINNEAUS

HUNER version of the LINNAEUS corpus containing species annotations.

HUNER_SPECIES_LOCTEXT

HUNER version of the Loctext corpus containing species annotations.

HUNER_SPECIES_MIRNA

HUNER version of the miRNA corpus containing species annotations.

HUNER_SPECIES_S800

HUNER version of the S800 corpus containing species annotations.

HUNER_SPECIES_VARIOME

HUNER version of the Variome corpus containing species annotations.

HunerDataset

Base class for HUNER datasets.

HunerJNLPBA

HunerMiRNAHelper

HunerMultiCorpus

Base class to build the union of all HUNER data sets considering a particular entity type.

IEPA

IEPA corpus as provided by http://corpora.informatik.hu-berlin.de/.

InternalBioNerDataset

Internal class to represent a corpus and its entities.
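One plausible way to hold a corpus and its entities is a pair of dictionaries keyed by document id, one for the raw texts and one for the annotated spans. The sketch below illustrates that layout. `MiniNerDataset` and its field names are assumptions made for illustration and are not the actual definition of `InternalBioNerDataset`.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

# (start, end, type) with character offsets into the document text
Span = Tuple[int, int, str]


@dataclass
class MiniNerDataset:
    """Hypothetical container: texts and entity spans, both keyed by document id."""

    documents: Dict[str, str] = field(default_factory=dict)
    entities_per_document: Dict[str, List[Span]] = field(default_factory=dict)

    def add(self, doc_id: str, text: str, spans: List[Span]) -> None:
        # Reject spans that are empty or fall outside the text
        for start, end, _ in spans:
            if not 0 <= start < end <= len(text):
                raise ValueError(f"span ({start}, {end}) outside document {doc_id!r}")
        self.documents[doc_id] = text
        self.entities_per_document[doc_id] = sorted(spans)


dataset = MiniNerDataset()
dataset.add("doc1", "Aspirin reduces fever.", [(16, 21, "Disease"), (0, 7, "Chemical")])
mentions = [
    (dataset.documents["doc1"][s:e], t) for s, e, t in dataset.entities_per_document["doc1"]
]
print(mentions)
```

Keeping offsets rather than tokens makes the representation independent of any particular tokenizer, which matters when many corpora with different annotation conventions are converted to one format.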

JNLPBA

Original corpus of the JNLPBA shared task.

KaewphanCorpusHelper

Helper class for the corpora from Kaewphan et al., i.e. CLL and Gellus.

LINNEAUS

Original LINNAEUS corpus containing species annotations.

LOCTEXT

Original LocText corpus containing species and protein annotations.

MIRNA

Original miRNA corpus.

NCBI_DISEASE

Original NCBI disease corpus containing disease annotations.

OSIRIS

Original OSIRIS corpus containing variation and gene annotations.

PDR

Corpus of plant-disease relations.

S800

S800 corpus.

SCAI_CHEMICALS

Original SCAI chemicals corpus containing chemical annotations.

SCAI_DISEASE

Original SCAI disease corpus containing disease annotations.

SENTENCE_TAG

String constant used to mark sentence boundaries.

SPECIES_TAG

String constant: the label used for species annotations.

ScaiCorpus

Base class to support the SCAI chemicals and disease corpora.

VARIOME

Variome corpus as provided by http://corpora.informatik.hu-berlin.de/corpora/brat2bioc/hvp_bioc.xml.zip.

bioc_to_internal

Helper function to parse corpora that are given in BIOC format.
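To give an idea of what parsing BioC involves, here is a minimal sketch that reads annotation spans from a BioC XML string using the standard library. The function `parse_bioc` and the tiny sample document are illustrative assumptions. The sketch does not reproduce the signature or behavior of `bioc_to_internal`.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List, Tuple

BIOC_XML = """<collection>
  <document>
    <id>doc1</id>
    <passage>
      <offset>0</offset>
      <text>Aspirin reduces fever.</text>
      <annotation id="1">
        <infon key="type">Chemical</infon>
        <location offset="0" length="7"/>
        <text>Aspirin</text>
      </annotation>
    </passage>
  </document>
</collection>"""


def parse_bioc(xml_text: str) -> Dict[str, List[Tuple[int, int, str]]]:
    """Collect (start, end, type) spans per document id from a BioC XML string."""
    result = {}
    for document in ET.fromstring(xml_text).iter("document"):
        spans = []
        for annotation in document.iter("annotation"):
            label = annotation.find("infon[@key='type']").text
            location = annotation.find("location")
            start = int(location.get("offset"))
            spans.append((start, start + int(location.get("length")), label))
        result[document.findtext("id")] = spans
    return result


bioc_spans = parse_bioc(BIOC_XML)
print(bioc_spans)
```

BioC stores a start offset and a length per annotation, so the end offset has to be computed. Offsets are relative to the whole document, not to the enclosing passage.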

brat_to_internal

Helper function to parse corpora that are annotated using BRAT.
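BRAT stores annotations in standoff `.ann` files, where each text-bound annotation is a tab-separated line holding an id, a type with character offsets, and the covered text. The sketch below extracts spans from such lines. `parse_brat_ann` is a hypothetical helper, not the signature of `brat_to_internal`, and it skips discontinuous annotations for brevity.

```python
from typing import List, Tuple


def parse_brat_ann(ann_text: str) -> List[Tuple[int, int, str]]:
    """Extract (start, end, type) from the text-bound ("T") lines of a BRAT .ann file."""
    spans = []
    for line in ann_text.splitlines():
        if not line.startswith("T"):
            continue  # skip relations, events, attributes and notes
        _id, info, _mention = line.split("\t", 2)
        if ";" in info:
            continue  # discontinuous annotation; ignored in this sketch
        label, start, end = info.split(" ")
        spans.append((int(start), int(end), label))
    return spans


ann = "T1\tGene 0 5\tBRCA1\nT2\tDisease 25 38\tbreast cancer\nR1\tCauses Arg1:T1 Arg2:T2"
brat_spans = parse_brat_ann(ann)
print(brat_spans)
```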

filter_and_map_entities
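The name suggests a step that keeps only selected entity types and renames them to a common label set. The sketch below shows one way this could work, under that assumption. The function `filter_and_map` and its `type_mapping` parameter are hypothetical and are not taken from the library.

```python
from typing import Dict, List, Tuple

Span = Tuple[int, int, str]  # (start, end, type)


def filter_and_map(spans: List[Span], type_mapping: Dict[str, str]) -> List[Span]:
    """Drop spans whose type is not in `type_mapping` and rename the types of the rest."""
    return [
        (start, end, type_mapping[label])
        for start, end, label in spans
        if label in type_mapping
    ]


mapped = filter_and_map(
    [(0, 5, "protein"), (10, 14, "DNA"), (20, 27, "cell_type")],
    {"protein": "Gene", "DNA": "Gene"},
)
print(mapped)
```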

filter_nested_entities
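Judging by its name, this function deals with annotations that overlap or sit inside one another, which flat sequence tagging cannot represent. One plausible strategy, assumed here and not taken from the library's implementation, is to keep the longest span in each overlapping group. `drop_nested` is a hypothetical helper.

```python
from typing import List, Tuple

Span = Tuple[int, int, str]  # (start, end, type)


def drop_nested(spans: List[Span]) -> List[Span]:
    """Visit spans longest first and drop any span that overlaps one already kept."""
    kept: List[Span] = []
    # Longest first; ties are broken by start offset
    for span in sorted(spans, key=lambda s: (-(s[1] - s[0]), s[0])):
        if all(span[1] <= k[0] or k[1] <= span[0] for k in kept):
            kept.append(span)
    return sorted(kept)


flat = drop_nested([(0, 13, "Disease"), (0, 6, "Anatomy"), (20, 25, "Gene")])
print(flat)
```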

merge_datasets
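Merging presumably combines several converted datasets into one. The sketch below shows the idea on plain dictionaries, assuming that document ids must be unique across the inputs. The function `merge` and the `Dataset` alias are hypothetical and do not describe the actual `merge_datasets` signature.

```python
from typing import Dict, Iterable, List, Tuple

Span = Tuple[int, int, str]  # (start, end, type)
Dataset = Tuple[Dict[str, str], Dict[str, List[Span]]]  # (documents, entities per document)


def merge(datasets: Iterable[Dataset]) -> Dataset:
    """Combine datasets into one, refusing duplicate document ids."""
    documents: Dict[str, str] = {}
    entities: Dict[str, List[Span]] = {}
    for docs, ents in datasets:
        clash = documents.keys() & docs.keys()
        if clash:
            raise ValueError(f"duplicate document ids: {sorted(clash)}")
        documents.update(docs)
        # Documents without annotations get an empty entity list
        entities.update({doc_id: list(ents.get(doc_id, [])) for doc_id in docs})
    return documents, entities


train = ({"a": "BRCA1 gene"}, {"a": [(0, 5, "Gene")]})
dev = ({"b": "no entities here"}, {})
merged_docs, merged_entities = merge([train, dev])
print(sorted(merged_docs), merged_entities)
```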