Category: Information Retrieval
Time: Thursday, December 27, 10:00-12:00 PM
Title: TextScope: Enhance Human Perception via Intelligent Text Retrieval and Mining
Speaker: Professor ChengXiang Zhai
Recent years have seen a dramatic growth of natural language text data, including web pages, news articles, scientific literature, emails, enterprise documents, and social media such as blog articles, forum posts, product reviews, and tweets. Text data contain all kinds of knowledge about the world and human opinions and preferences, thus offering great opportunities for mining actionable knowledge from vast amounts of text data (“big text data”) to support user tasks and optimize decision making in all application domains. However, computers cannot yet accurately understand unrestricted natural language; as such, analyzing and mining big text data effectively and efficiently is a difficult challenge, and involving humans in a loop of interactive retrieval and mining of text data is essential. In this talk, I will present the vision of TextScope, an interactive software tool that enables users to perform intelligent information retrieval and text analysis in a unified task-support framework. Just as a microscope allows us to see things in the “micro world,” and a telescope allows us to see things far away, the envisioned TextScope would allow us to “see” useful hidden knowledge buried in large amounts of text data that would otherwise be unknown to us. As examples of techniques that can be used to build a TextScope, I will present some of our recent work on formal models for optimizing interactive information retrieval and general algorithms for analyzing text and non-text data jointly to discover interesting patterns and knowledge. At the end, I will discuss the major challenges in developing a TextScope and some important directions for future research.
ChengXiang Zhai is a Donald Biggar Willett Professor in Engineering in the Department of Computer Science at the University of Illinois at Urbana-Champaign (UIUC), where he is also affiliated with the Carl R. Woese Institute for Genomic Biology, the Department of Statistics, and the School of Information Sciences. He received a Ph.D. in Computer Science from Nanjing University in 1990, and a Ph.D. in Language and Information Technologies from Carnegie Mellon University in 2002. He worked at Clairvoyance Corp. as a Research Scientist and a Senior Research Scientist from 1997 to 2000. His research interests are in the general area of intelligent information systems, specifically information retrieval, data mining, and their applications in many areas, especially biomedical and health informatics and intelligent education systems. He has published over 300 highly cited papers in these areas and a textbook on text data management and analysis, which is used worldwide by many learners of the two MOOCs that he offered on Coursera. He has served as an Associate Editor for major journals in multiple areas, including information retrieval (ACM TOIS, IPM), data mining (ACM TKDD), and medical informatics (BMC MIDM), and as a Program Co-Chair of ACM SIGIR 2009 and WWW 2015. He is an ACM Fellow and has received a number of awards, including the ACM SIGIR Test of Time Paper Award (three times), the 2004 Presidential Early Career Award for Scientists and Engineers (PECASE), an Alfred P. Sloan Research Fellowship, an IBM Faculty Award, an HP Innovation Research Award, and the UIUC Campus Award for Excellence in Graduate Student Mentoring. More information about him and his work can be found on his homepage at http://czhai.cs.illinois.edu/.