Relationship extraction


A relationship extraction task requires the detection and classification of semantic relationship mentions within a set of artifacts, typically text or XML documents. The task is closely related to information extraction (IE), but IE additionally requires the removal of repeated relations (disambiguation) and generally refers to the extraction of many different relationships.
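As an illustration of the two steps named above (detecting mentions, then removing repeats), the following is a minimal rule-based sketch. The surface patterns, relation labels and sample sentences are all hypothetical and chosen only for illustration; practical systems generally rely on parsing, entity recognition and learned classifiers rather than a handful of regular expressions.

```python
import re

# Hypothetical surface patterns: relation label -> regex with two capture groups.
# These are illustrative assumptions, not patterns taken from any cited system.
PATTERNS = {
    "interacts_with": re.compile(r"(\w+) (?:interacts with|binds to) (\w+)"),
    "associated_with": re.compile(r"(\w+) is (?:associated with|linked to) (\w+)"),
}

def extract_relations(text):
    """Return a (subject, relation, object) triple for every pattern match."""
    triples = []
    for label, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            triples.append((match.group(1), label, match.group(2)))
    return triples

def deduplicate(triples):
    """Drop repeated triples while keeping first-seen order."""
    return list(dict.fromkeys(triples))

# Made-up sample text; two sentences express the same relation in different words.
text = ("BRCA1 is associated with cancer. RAD51 binds to BRCA2. "
        "RAD51 interacts with BRCA2.")
mentions = extract_relations(text)   # every mention, repeats included
relations = deduplicate(mentions)    # distinct relations only
print(relations)
```

Here `mentions` keeps both sentences about RAD51 and BRCA2, while `relations` keeps the triple once, which mirrors the distinction drawn between extracting mentions and removing repeated relations.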


Application domains where relationship extraction is useful include the extraction of gene-disease relationships[1] and protein-protein interactions.[2]

Never-Ending Language Learning is a semantic machine learning system developed by a research team at Carnegie Mellon University that extracts relationships from the open web.


One approach to this problem involves the use of domain ontologies.[3][4] Another approach involves the visual detection of meaningful relationships among the parametric values of objects listed in a data table that shift positions as the table is permuted automatically under the control of the software user. The poor coverage, rarity and development cost of structured resources such as semantic lexicons (e.g. WordNet, UMLS) and domain ontologies (e.g. the Gene Ontology) have given rise to new approaches based on broad, dynamic background knowledge on the Web. For instance, the ARCHILES technique[5] uses only Wikipedia and search engine page counts to acquire coarse-grained relations for constructing lightweight ontologies.

The relationships can be represented using a variety of formalisms/languages. One such representation language for data on the Web is RDF.
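To make the RDF representation concrete, the sketch below writes extracted (subject, relation, object) triples as N-Triples, one of the standard line-based RDF serializations. The `http://example.org/` namespace, the helper names and the sample relations are assumptions made for illustration; a real application would use a vocabulary suited to its domain and most likely an RDF library rather than string formatting.

```python
from urllib.parse import quote

# Hypothetical namespace used to mint IRIs; an assumption for illustration only.
BASE = "http://example.org/"

def iri(name):
    """Wrap a local name as an N-Triples IRI, percent-encoding unsafe characters."""
    return f"<{BASE}{quote(name, safe='')}>"

def to_ntriples(triples):
    """Serialize (subject, relation, object) tuples, one N-Triples statement per line."""
    return "\n".join(f"{iri(s)} {iri(p)} {iri(o)} ." for s, p, o in triples)

# Made-up relations in the spirit of the application domains mentioned earlier.
relations = [
    ("RAD51", "interacts_with", "BRCA2"),
    ("BRCA1", "associated_with", "cancer"),
]
ntriples = to_ntriples(relations)
print(ntriples)
```

Each output line has the form `<subject> <predicate> <object> .`, so every extracted relation becomes a single RDF statement.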

References


  1. Hong-Woo Chun; Yoshimasa Tsuruoka; Jin-Dong Kim; Rie Shiba; Naoki Nagata; Teruyoshi Hishiki; Jun-ichi Tsujii (2006). "Extraction of Gene-Disease Relations from Medline Using Domain Dictionaries and Machine Learning". Proc. Pacific Symposium on Biocomputing.
  2. Minlie Huang; Xiaoyan Zhu; Yu Hao; Donald G. Payan; Kunbin Qu; Ming Li (2004). "Discovering patterns to extract protein-protein interactions from full texts". Bioinformatics. 20 (18): 3604–3612. doi:10.1093/bioinformatics/bth451.
  3. T. C. Rindflesch; L. Tanabe; J. N. Weinstein; L. Hunter (2000). "EDGAR: Extraction of drugs, genes, and relations from the biomedical literature". Proc. Pacific Symposium on Biocomputing. pp. 514–525.
  4. C. Ramakrishnan; K. J. Kochut; A. P. Sheth (2006). "A Framework for Schema-Driven Relationship Discovery from Unstructured Text". Proc. International Semantic Web Conference. pp. 583–596.
  5. W. Wong; W. Liu; M. Bennamoun (2009). "Acquiring Semantic Relations using the Web for Constructing Lightweight Ontologies". Proc. 13th Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD). doi:10.1007/978-3-642-01307-2_26.