Structured Knowledge Bases (KBs) are extremely useful for applications such as question answering and dialog, but are difficult to populate and maintain. People prefer expressing information in natural language, and hence text corpora, such as Wikipedia, contain more detailed up-to-date information. This raises the question -- can we directly treat text corpora as knowledge bases for extracting information on demand?
In this talk I will focus on two problems related to this question. First, I will look at augmenting incomplete KBs with textual knowledge for question answering. I will describe a graph neural network model for processing heterogeneous data from the two sources. Next, I will describe a scalable approach for compositional reasoning over the contents of the text corpus, analogous to following a path of relations in a structured KB to answer multi-hop queries. I will conclude by discussing interesting future research directions in this domain.
Bhuwan Dhingra is a final year PhD student at Carnegie Mellon University, advised by William Cohen and Ruslan Salakhutdinov. His research uses natural language processing and machine learning to build an interface between AI applications and world knowledge (facts about people, places and things). His work is supported by the Siemens FutureMakers PhD fellowship. Prior to joining CMU, Bhuwan completed his undergraduate studies at IIT Kanpur in 2013, and spent two years at Qualcomm Research in the beautiful city of San Diego.