Monthly Archives: December 2018


Seminar Announcement: TextScope: Enhance Human Perception via Intelligent Text Retrieval and Mining

Time: Thursday, December 27, afternoon, 10:00-12:00
Venue: Conference Room 123, Information Building
Title: TextScope: Enhance Human Perception via Intelligent Text Retrieval and Mining
Speaker: Professor ChengXiang Zhai

ABSTRACT:

Recent years have seen a dramatic growth of natural language text data, including, e.g., web pages, news articles, scientific literature, emails, enterprise documents, and social media such as blog articles, forum posts, product reviews, and tweets. Text data contain all kinds of knowledge about the world and human opinions and preferences, thus offering great opportunities for mining actionable knowledge from vast amounts of text data (“big text data”) to support user tasks and optimize decision making in all application domains. However, computers cannot yet accurately understand unrestricted natural language; as such, how to analyze and mine big text data effectively and efficiently is a difficult challenge, and involving humans in a loop of interactive retrieval and mining of text data is essential. In this talk, I will present the vision of TextScope, an interactive software tool to enable users to perform intelligent information retrieval and text analysis in a unified task-support framework. Just as a microscope allows us to see things in the “micro world,” and a telescope allows us to see things far away, the envisioned TextScope would allow us to “see” useful hidden knowledge buried in large amounts of text data that would otherwise be unknown to us. As examples of techniques that can be used to build a TextScope, I will present some of our recent work on formal models for optimizing interactive information retrieval and general algorithms for analyzing text and non-text data jointly to discover interesting patterns and knowledge. At the end, I will discuss the major challenges in developing a TextScope and some important directions for future research.

BIO:

ChengXiang Zhai is a Donald Biggar Willett Professor in Engineering in the Department of Computer Science at the University of Illinois at Urbana-Champaign (UIUC), where he is also affiliated with the Carl R. Woese Institute for Genomic Biology, the Department of Statistics, and the School of Information Sciences. He received a Ph.D. in Computer Science from Nanjing University in 1990 and a Ph.D. in Language and Information Technologies from Carnegie Mellon University in 2002. He worked at Clairvoyance Corp. as a Research Scientist and a Senior Research Scientist from 1997 to 2000. His research interests are in the general area of intelligent information systems, specifically information retrieval, data mining, and their applications in many areas, especially biomedical and health informatics and intelligent education systems. He has published over 300 highly cited papers in these areas, as well as a textbook on text data management and analysis, which is used worldwide by many learners of the two MOOCs that he has offered on Coursera. He has served as an Associate Editor for major journals in multiple areas, including information retrieval (ACM TOIS, IPM), data mining (ACM TKDD), and medical informatics (BMC MIDM), and as a Program Co-Chair of ACM SIGIR 2009 and WWW 2015. He is an ACM Fellow and has received a number of awards, including the ACM SIGIR Test of Time Paper Award (three times), the 2004 Presidential Early Career Award for Scientists and Engineers (PECASE), an Alfred P. Sloan Research Fellowship, an IBM Faculty Award, an HP Innovation Research Award, and the UIUC Campus Award for Excellence in Graduate Student Mentoring. More information about him and his work can be found on his homepage at http://czhai.cs.illinois.edu/.

 



Seminar Announcement: Fast Euclidean OPTICS with Bounded Precision in Low Dimensional Space

Category: Visualization

Time: Thursday, December 20, 14:00-16:00
Venue: Conference Room 123, Information Building
Title: Fast Euclidean OPTICS with Bounded Precision in Low Dimensional Space
Speaker: Dr. Junhao Gan

ABSTRACT:

OPTICS is a popular method for visualizing multidimensional clusters. Despite the popularity of this method, somewhat surprisingly, the term “valley”, which is used to capture clusters in the resulting visualizations, has been used only on an intuitive basis throughout the literature. Moreover, all existing implementations of OPTICS have a time complexity of O(n^2), where n is the size of the input dataset, and thus may not be suitable for large datasets.

In this talk, we will first formalize the concept of “valley”, which makes it possible to rigorously measure the resemblance of two valleys and lays a foundation for easing the cost of computing OPTICS visualizations by resorting to approximation with guarantees. Then, we will show an algorithm that runs in O(n log n) time under any fixed dimensionality and computes a visualization that has provably small discrepancies from that of OPTICS.
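
For readers unfamiliar with the visualization discussed in the abstract, the following is a minimal sketch of the classical, brute-force OPTICS procedure that produces a reachability plot in which clusters show up as "valleys". It is not the speaker's O(n log n) algorithm. The function name `optics_reachability`, its parameters, and the sample data are assumptions made for illustration only, and the convention that a point counts as its own neighbor is one common choice among several.

```python
import heapq
import math


def optics_reachability(points, min_pts, eps=math.inf):
    """Brute-force OPTICS sketch (quadratic time for fixed min_pts).

    Returns the visiting order and the reachability value of each
    visited point. Hypothetical helper, not the speaker's method.
    """
    n = len(points)
    # All pairwise Euclidean distances: this alone costs O(n^2).
    dist = [[math.dist(p, q) for q in points] for p in points]

    def core_distance(i):
        # Distance to the min_pts-th nearest neighbor within eps.
        # Assumption: the point itself (distance 0) counts as a neighbor.
        nearest = heapq.nsmallest(min_pts, (d for d in dist[i] if d <= eps))
        return nearest[-1] if len(nearest) == min_pts else math.inf

    core = [core_distance(i) for i in range(n)]
    reach = [math.inf] * n        # best known reachability per point
    processed = [False] * n
    order = []

    for start in range(n):
        if processed[start]:
            continue
        current = start
        while current is not None:
            processed[current] = True
            order.append(current)
            if core[current] != math.inf:
                # A core point may lower the reachability of its neighbors.
                for j in range(n):
                    if not processed[j] and dist[current][j] <= eps:
                        candidate = max(core[current], dist[current][j])
                        reach[j] = min(reach[j], candidate)
            # Visit next the unprocessed point with the smallest finite
            # reachability; if there is none, start a new component.
            seeds = [j for j in range(n)
                     if not processed[j] and reach[j] < math.inf]
            current = min(seeds, key=lambda j: reach[j]) if seeds else None

    return order, [reach[i] for i in order]


# Two well-separated groups on a line (made-up data).
sample = [(0,), (1,), (2,), (10,), (11,), (12,)]
order, plot = optics_reachability(sample, min_pts=2)
```

Plotting `plot` as a bar chart in the visiting order `order` gives the reachability plot: runs of low bars correspond to dense groups, and a tall bar marks the jump between groups.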

BIO:

Dr. Junhao Gan joined the School of Computing and Information Systems (CIS) at the University of Melbourne (UoM) as a lecturer (equivalent to Assistant Professor in the US) in August 2018. Prior to that, he worked as a post-doctoral research fellow in the School of Information Technologies and Electrical Engineering (ITEE) at the University of Queensland (UQ) from 2017 to 2018, and obtained his PhD degree under the supervision of Prof. Yufei Tao in the same school at UQ in 2017. His research interests are in practical algorithms with non-trivial theoretical guarantees, especially algorithms for solving problems on massive data. Dr. Gan has published five papers at SIGMOD, one journal article in TODS, and one article in JGAA. One of his papers won the Best Paper Award at SIGMOD 2015, which is considered one of the highest honours in the database area, and his PhD thesis received the 2018 CORE John Makepeace Bennett Award (Australasian Distinguished Doctoral Dissertation), which is presented to the best PhD thesis across all areas of computer science finalised during the year in Australasia. In addition, Dr. Gan won the 2019 Discovery Early Career Research Award (DECRA) from the Australian Research Council (ARC), one of the most competitive fellowships for early career researchers in Australia.