Natural language understanding and question answering can benefit considerably from commonsense knowledge. However, building broad and diverse collections of commonsense knowledge is still an open problem. A recent paper on arXiv.org explores dictionary term definitions as a possible source of such knowledge.
Phrases are extracted from term definitions in the English version of Wiktionary. For each relation, candidate knowledge triples are then constructed with the term as the subject and an extracted phrase as the object. Finally, three state-of-the-art machine learning models are used to score the candidate triples.
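The candidate-construction step can be illustrated with a minimal sketch. It is not the authors' implementation: the relation names, the noun-phrase pattern (optional adjectives followed by nouns), the Penn Treebank style tags, and the hand-tagged example definition are all assumptions chosen for illustration, and the scoring models are left out entirely.

```python
from typing import List, Tuple

Tagged = Tuple[str, str]            # (token, part-of-speech tag)
Triple = Tuple[str, str, str]       # (subject, relation, object)

# Hypothetical relation names; the paper's actual relation set may differ.
RELATIONS = ["IsA", "UsedFor"]


def extract_noun_phrases(tagged: List[Tagged]) -> List[str]:
    """Collect maximal runs matching the assumed pattern (JJ)* (NN|NNS)+."""
    phrases = []
    i, n = 0, len(tagged)
    while i < n:
        j = i
        while j < n and tagged[j][1] == "JJ":            # optional adjectives
            j += 1
        k = j
        while k < n and tagged[k][1] in ("NN", "NNS"):   # one or more nouns
            k += 1
        if k > j:                                        # at least one noun matched
            phrases.append(" ".join(tok for tok, _ in tagged[i:k]))
            i = k
        else:                                            # no match here, move on
            i = max(j, i + 1)
    return phrases


def build_candidates(term: str, tagged: List[Tagged],
                     relations: List[str] = RELATIONS) -> List[Triple]:
    """Pair the term (subject) with every extracted phrase (object), per relation."""
    phrases = extract_noun_phrases(tagged)
    return [(term, rel, phrase) for rel in relations for phrase in phrases]


# Hand-tagged example definition (an assumption, not taken from Wiktionary):
# "a tool with a heavy head used for driving nails"
definition_tags = [
    ("a", "DT"), ("tool", "NN"), ("with", "IN"), ("a", "DT"),
    ("heavy", "JJ"), ("head", "NN"), ("used", "VBN"), ("for", "IN"),
    ("driving", "VBG"), ("nails", "NNS"),
]
candidates = build_candidates("hammer", definition_tags)
for triple in candidates:
    print(triple)
```

Every phrase is paired with every relation, so most candidates, such as ("hammer", "IsA", "nails"), are invalid. This is why a scoring step is needed before any triple is accepted.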
The validity and novelty of the extracted triples are evaluated to gauge the potential of mining commonsense knowledge from definitions. The results show that some valid and novel triples can be mined. Nevertheless, all three scoring models have weaknesses, so they need careful evaluation before the approach is applied in practice.
Commonsense knowledge has proven to be beneficial to a variety of application areas, including question answering and natural language understanding. Previous work explored collecting commonsense knowledge triples automatically from text to increase the coverage of current commonsense knowledge graphs. We investigate a few machine learning approaches to mining commonsense knowledge triples using dictionary term definitions as inputs and provide some initial evaluation of the results. We start from extracting candidate triples using part-of-speech tag patterns from text, and then compare the performance of three existing models for triple scoring. Our experiments show that term definitions contain some valid and novel commonsense knowledge triples for some semantic relations, and also indicate some challenges with using existing triple scoring models.
Research paper: Liang, Z. and McGuinness, D. L., “Commonsense Knowledge Mining from Term Definitions”, 2021. Link: https://arxiv.org/abs/2102.00651