The Stanford Natural Language Processing Group

This talk is part of the NLP Seminar Series.

Towards Democratizing Data Science with AI-powered Knowledge Engines

Yu Su, Microsoft Semantic Machines
Date: 11:00 am - 12:00 pm, Feb 7 2019
Venue: Room 104, Gates Computer Science Building

Abstract

Data-driven problem solving and decision making are ubiquitous in daily life. For example, doctors make diagnostic decisions by gathering information from patient inquiry and examination. The rise of big data, such as electronic medical records and digitized scientific literature, promises unprecedented opportunities for better-informed decision making. However, as data becomes more massive and heterogeneous, the gap between users and data is growing rapidly, in stark contrast to this promise: accessing and analyzing even very simple data requires extensive training, which is not economical (or feasible) for casual users who use data only on an occasional, on-demand basis.

In this talk, I will discuss a possible solution, the AI-powered knowledge engine, for bridging the gap between users and data. Three capabilities are essential: (1) extracting structured, actionable knowledge from raw data, (2) querying the knowledge with natural language, and (3) reasoning to derive new knowledge. I will discuss the key role of AI techniques, in particular deep learning with weak supervision, in supporting these capabilities. I will conclude the talk with a discussion of research frontiers in this space.

Bio

Yu Su is currently a researcher at Microsoft Semantic Machines and will join the faculty of The Ohio State University. He received his PhD from the University of California, Santa Barbara, and his bachelor's degree from Tsinghua University. His research lies at the intersection of data mining and natural language processing, with the overarching goal of democratizing data science, i.e., enabling non-technical users to enjoy data science capabilities with the help of advanced AI techniques. His recent research interests include semantic parsing, dialogue systems, and knowledge bases. He has regularly served at premier data mining and natural language processing conferences in various roles, including co-organizer of the first workshop on Knowledge Base Construction, Reasoning and Mining. He has interned at Microsoft Research Redmond, the IBM T.J. Watson Research Center, and the U.S. Army Research Laboratory.