Invited Speeches

Prof. Yixin Chen
Title: Interplay between machine learning and AI.
Abstract: In the era of big data, we need novel algorithms on top of the supporting platform. In this talk, I will focus on the interaction between machine learning algorithms for big data and traditional artificial intelligence techniques including graph search and planning. In particular, I will discuss two works. The first applies manifold learning and dimensionality reduction algorithms to speed up graph search and automated planning. The second applies graph search to solve submodular optimization problems arising from machine learning contexts. These works offer new insights into the deep connection between machine learning and AI.

Bio.
Yixin Chen is an Associate Professor of Computer Science at Washington University in St. Louis. His research interests include data mining, machine learning, artificial intelligence, and optimization. He received a Ph.D. in Computing Science from the University of Illinois at Urbana-Champaign in 2005. He received the Best Student Paper Runner-up Award at the ACM SIGKDD Conference (2014) and Best Paper Awards at the AAAI Conference on Artificial Intelligence (2010) and the International Conference on Tools for AI (2005). His work on planning has won First Prizes in the International Planning Competitions (2004 & 2006).

He received an Early Career Principal Investigator Award from the Department of Energy (2006) and a Microsoft Research New Faculty Fellowship (2007). His research has been funded by NSF, NIH, DOE, Microsoft, and Memorial Sloan-Kettering Cancer Center. He is an Associate Editor for ACM Transactions on Intelligent Systems and Technology and serves on the Editorial Board of the Journal of Artificial Intelligence Research.

Prof. Huan Liu
Title: Discovering Negative Links on Social Networking Sites.
Abstract: Social networking sites make it easy for users to connect with, follow, or "like" each other. Such a mechanism promotes positive connections and helps a social networking site to grow without direct belligerent or negative encounters. This type of one-way connection makes no distinction between indifference and dislike; in other words, two users have, by default, only positive connections. However, it is apparent that as one's network grows, some users might not be benevolent toward each other, or negative links could form, though not explicitly stated. In this talk, we assess the need for discovering such hidden negative links, explore ways of finding negative links, and show the significance of negative links in social media applications like data classification and clustering, recommendation systems, link prediction, and tie-strength estimation. *This presentation is based on Dr. Jiliang Tang's Doctoral Dissertation at ASU.

Bio.
Dr. Huan Liu is a professor of Computer Science and Engineering at Arizona State University. He obtained his Ph.D. in Computer Science at the University of Southern California and his B.Eng. in EECS at Shanghai Jiao Tong University. He was recognized for excellence in teaching and research in Computer Science and Engineering at Arizona State University. His research interests are in data mining, machine learning, social computing, social media mining, and artificial intelligence, investigating problems that arise in real-world applications with high-dimensional data of disparate forms. His well-cited publications include books, book chapters, and encyclopedia entries as well as conference and journal papers. He serves on journal editorial/advisory boards and numerous conference program committees. He is a Fellow of IEEE.

Prof. Ramamohanarao Kotagiri
Title: Large Scale Metric Learning using Locality Sensitive Hashing.
Abstract: Metric learning tries to discover a mapping of features such that objects belonging to a particular class are close to each other in the new space. However, the current methods of discovering such metric mappings are computationally infeasible when the data set is huge with a large number of features. My talk will describe the state-of-the-art algorithms for metric learning. I will present our recent work on an efficient approach for discovering metric-learning-based mappings using Locality Sensitive Hashing (LSH). Our generic approach can accelerate state-of-the-art metric learning while achieving competitive classification accuracy, expanding feasibility by an order of magnitude. Our approach can accelerate Large Margin Nearest Neighbour (LMNN) to learn metrics on 1,000,000 samples in 3.6 minutes, down from 5.8 hours.

Bio. Professor Ramamohanarao (Rao) Kotagiri received his PhD from Monash University. He was awarded the Alexander von Humboldt Fellowship in 1983. He has been at the University of Melbourne since 1980 and was appointed as a professor in computer science in 1989. Rao has held several senior positions including Head of Computer Science and Software Engineering, Head of the School of Electrical Engineering and Computer Science at the University of Melbourne, and Research Director for the Cooperative Research Centre for Intelligent Decision Systems. He served on the Editorial Boards of the Computer Journal, Universal Computer Science, IEEE TKDE, and the VLDB (Very Large Data Bases) Journal. He was the Program Co-Chair for the VLDB, PAKDD, DASFAA, and DOOD conferences. He is a steering committee member of IEEE ICDM, PAKDD, and DASFAA. He received the Distinguished Contribution Award from PAKDD for Data Mining; the Distinguished Contribution Award in 2009 from the Computing Research and Education Association of Australasia; the Distinguished Contribution Award from DASFAA for Database Research; and the Distinguished Service Award from IEEE ICDM for Data Mining. Rao is a Fellow of the Institute of Engineers Australia, a Fellow of the Australian Academy of Technological Sciences and Engineering, and a Fellow of the Australian Academy of Science.

Prof. Jian Pei
Title: Big Data for Everyone.
Abstract: Big Data poses grand opportunities and challenges for egocentric analytics. In this talk, I will discuss several interesting problems centered on egocentric queries and analysis on Big Data. We want to answer a series of natural questions imperative in several killer applications, such as "How is this patient similar to or different from the other Type II diabetes patients in the database?", "How is University X distinct from the other universities?", and "How is this residential property distinct from the others available in the market?" To answer such questions on Big Data, we have to search data of high dimensionality and high volume, and possibly of high dynamics as well. I will present some preliminary research results and some application case studies we obtained recently, as well as further challenges we identified.

Bio.
Jian Pei is currently the Canada Research Chair (Tier 1) in Big Data Science and a professor at the School of Computing Science and the Department of Statistics and Actuarial Science at Simon Fraser University, Canada. He received his Ph.D. degree at the same school in 2002 under Dr. Jiawei Han's supervision. His research interests lie in developing effective and efficient data analysis techniques for novel data-intensive applications. He has published prolifically and is one of the top-cited authors in data mining. He has received a series of prestigious awards. He is also active in providing consulting services to industry and in transferring his group's research outcomes to industry and applications. He is an editor of several esteemed journals in his areas and a passionate organizer of the premier academic conferences defining the frontiers of the areas. He is an IEEE Fellow.

Prof. Yong Shi
Title: Big Data Mining and Data Science.
Abstract: Big Data has become a reality that no one can ignore. Big Data is our environment whenever we need to make a decision. Big Data is a buzzword that makes everyone understand how important it is. Big Data shows a big opportunity for academia, industry and government. Big Data then is a big challenge for all parties. This talk will discuss some fundamental issues of Big Data problems, such as data heterogeneity vs. decision heterogeneity, data stream research and data-driven decision management. Furthermore, this talk will provide a number of real-life Big Data applications. In conclusion, the talk suggests a number of open research problems in Data Science, which is a growing field beyond Big Data.

Bio.
Yong Shi serves as the Executive Deputy Director of the Chinese Academy of Sciences Research Center on Fictitious Economy & Data Science and the Director of the Key Lab of Big Data Mining and Knowledge Management, Chinese Academy of Sciences. He has been Union Pacific Chair of Information Science and Technology, University of Nebraska at Omaha, USA. Dr. Shi's research interests include business intelligence, data mining, and multiple criteria decision making. He has published more than 23 books, over 300 papers in various journals, and numerous conference proceedings papers. He is the Editor-in-Chief of the International Journal of Information Technology and Decision Making (SCI), Editor-in-Chief of Annals of Data Science (Springer), and a member of the Editorial Board of a number of academic journals. Dr. Shi has received many distinguished awards including the Georg Cantor Award of the International Society on Multiple Criteria Decision Making (MCDM), 2009; Fudan Prize of Distinguished Contribution in Management, Fudan Premium Fund of Management, China, 2009; Outstanding Young Scientist Award, National Natural Science Foundation of China, 2001; and Speaker of the Distinguished Visitors Program (DVP) for 1997-2000, IEEE Computer Society. He has consulted or worked on business projects for a number of international companies in data mining and knowledge management.

Prof. Geoff Webb
Title: Scaling log-linear analysis to datasets with thousands of variables.
Abstract: Association discovery is a fundamental data mining task. The primary statistical approach to association discovery between variables is log-linear analysis. Classical approaches to log-linear analysis do not scale beyond about ten variables. By melding the state-of-the-art in statistics, graphical modeling, and data mining research, we have developed efficient and effective algorithms for log-linear analysis, performing in seconds log-linear analysis of datasets with thousands of variables and providing a powerful statistically-sound method for creating compact models of complex high-dimensional multivariate distributions.

Bio.
Geoff Webb is a leading data scientist. He was editor in chief of the premier data mining journal, Data Mining and Knowledge Discovery, from 2005 to 2014. He has been Program Committee Chair of the two top data mining conferences, ACM SIGKDD and IEEE ICDM, as well as General Chair of ICDM. He is the Director of the Monash University Center for Data Science. He is a Technical Advisor to BigML Inc., which is incorporating his best-of-class association discovery software, Magnum Opus, into its cloud-based Machine Learning service. He developed many of the key mechanisms of support-confidence association discovery in the late 1980s. His OPUS search algorithm remains the state-of-the-art in rule search. He pioneered multiple research areas as diverse as black-box user modelling, interactive data analytics and statistically-sound pattern discovery. He has developed many useful machine learning algorithms that are widely deployed. He received the 2013 IEEE Outstanding Service Award and a 2014 Australian Research Council Discovery Outstanding Researcher Award, and was elevated to IEEE Fellow earlier this year.


Prof. Wei Wang
Title: Algorithm acceleration for high-throughput biology.
Abstract: High-throughput sequencing has proven to be a revolutionary tool for modern biology because it provides deep coverage and base pair-level resolution. It produces vast amounts of data, which pose new computational challenges because subsequent analyses often rely on a sequence alignment step that re-establishes the origin of each read, a process that is both time consuming and error prone. In this talk, we will present our latest algorithmic advances that dramatically accelerate the analysis by removing the necessity of sequence alignment. We will demonstrate this through a concrete example of RNA-Seq quantification, in which we are able to achieve two orders of magnitude speedup and deliver competitive accuracy.

Bio.
Wei Wang is a professor in the Department of Computer Science at the University of California, Los Angeles and the director of the Scalable Analytics Institute (ScAi). She received her PhD degree in Computer Science from the University of California, Los Angeles in 1999. Dr. Wang's research interests include big data, data mining, bioinformatics and computational biology, and databases. She has filed seven patents, and has published one monograph and more than one hundred research papers in international journals and major peer-reviewed conference proceedings. Dr. Wang received the IBM Invention Achievement Awards in 2000 and 2001. She was the recipient of a UNC Junior Faculty Development Award in 2003 and an NSF Faculty Early Career Development (CAREER) Award in 2005. She was named a Microsoft Research New Faculty Fellow in 2005. She was honored with the 2007 Phillip and Ruth Hettleman Prize for Artistic and Scholarly Achievement at UNC. She was recognized with an IEEE ICDM Outstanding Service Award in 2012 and an Okawa Foundation Research Award in 2013. Dr. Wang has been an associate editor of the IEEE Transactions on Knowledge and Data Engineering, ACM Transactions on Knowledge Discovery from Data, Journal of Knowledge and Information Systems, Journal of Data Mining and Knowledge Discovery, and International Journal of Knowledge Discovery in Bioinformatics, and an editorial board member of the International Journal of Data Mining and Bioinformatics and the Open Artificial Intelligence Journal. She serves on the organization and program committees of international conferences including ACM SIGMOD, ACM SIGKDD, ACM BCB, VLDB, ICDE, EDBT, ACM CIKM, IEEE ICDM, SIAM DM, SSDBM, and BIBM.

Prof. Hui Xiong
Title: Big Data Analytics in Business Environments.
Abstract: Recent years have witnessed the big data movement throughout all business sectors. As a result, awareness of the importance of data mining for business is becoming widespread. However, big data are usually immense, fine-grained, diversified, dynamic, and sufficiently information-rich in nature, and thus demand a radical change in the philosophy of data analytics. In this talk, we introduce a set of scenarios for understanding and mining business data in various business sectors. In particular, we will discuss the technical and domain challenges of big data analytics in business environments. The themes to be covered include (1) the data mining problem formulation in different business applications; (2) the challenging issues of data pre-processing and post-processing in business analytics; and (3) how the underlying computational models can be adapted for managing the uncertainties in relation to big data processing in a huge nebulous business environment. Finally, we will also show some promising research directions.

Bio.
Dr. Hui Xiong is currently a Full Professor and the Director of the Rutgers Center for Information Assurance at Rutgers, the State University of New Jersey, where he received a two-year early promotion/tenure (2009), the Rutgers University Board of Trustees Research Fellowship for Scholarly Excellence (2009), and the ICDM-2011 Best Research Paper Award (2011). Dr. Xiong received his Ph.D. in Computer Science from the University of Minnesota (UMN), USA, in 2005, the B.E. degree in Automation from the University of Science and Technology of China (USTC), China, and the M.S. degree in Computer Science from the National University of Singapore (NUS), Singapore. His general area of research is data and knowledge engineering, with a focus on developing effective and efficient data analysis techniques for emerging data-intensive applications. He has published prolifically in refereed journals and conference proceedings (3 books, 60+ journal papers, and 80+ conference papers). He is the co-Editor-in-Chief of the Encyclopedia of GIS by Springer, and an Associate Editor of IEEE Transactions on Knowledge and Data Engineering (TKDE), ACM Transactions on Management Information Systems, and IEEE Transactions on Big Data. He has served regularly on the organization and program committees of numerous conferences, including as a Program Co-Chair of the Industrial and Government Track for the 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD), a Program Co-Chair for the IEEE 2013 International Conference on Data Mining (ICDM-2013), and a General Co-Chair for the IEEE 2015 International Conference on Data Mining (ICDM-2015). He is an ACM Distinguished Scientist.

Prof. Philip Yu
Title: On Mining Heterogeneous Information Networks.
Abstract: The problem of big data has become increasingly important in recent years. On the one hand, big data is an asset that can potentially offer tremendous value or reward to the data owner. On the other hand, it poses tremendous challenges to distil the value out of big data. The very nature of big data poses challenges not only due to its volume and the velocity at which it is generated, but also its variety, where variety means the data can be collected from various sources with different formats, from structured data to text to network/graph data, etc. In this talk, we focus on the variety issue and discuss recent developments in the mining of heterogeneous information networks, which can be applied to multiple disciplines, including social network analysis, the World Wide Web, database systems, data mining, machine learning, and networked communication and information systems. We will examine the problem of integrating multiple data sources using heterogeneous information network models. Fusion of multiple social networks will also be considered.

Bio.
Dr. Philip S. Yu is a Distinguished Professor and the Wexler Chair in Information Technology at the Department of Computer Science, University of Illinois at Chicago. Before joining UIC, he was at the IBM Watson Research Center, where he built a world-renowned data mining and database department. He is a Fellow of ACM and IEEE. Dr. Yu is the recipient of the IEEE Computer Society’s 2013 Technical Achievement Award for “pioneering and fundamentally innovative contributions to the scalable indexing, querying, searching, mining and anonymization of big data”. Dr. Yu has published more than 870 refereed conference and journal papers, cited more than 62,000 times, with an H-index of 117. He has applied for more than 300 patents.

Dr. Yu is the Editor-in-Chief of ACM Transactions on Knowledge Discovery from Data. He is on the steering committee of the IEEE Conference on Data Mining and the ACM Conference on Information and Knowledge Management and was a member of the IEEE Data Engineering steering committee. He was the Editor-in-Chief of IEEE Transactions on Knowledge and Data Engineering (2001-2004). He received a Research Contributions Award from the IEEE Intl. Conference on Data Mining (ICDM) in 2003, the ICDM 2013 10-year Highest-Impact Paper Award, and the EDBT Test of Time Award (2014). Dr. Yu received his PhD from Stanford University.

Prof. Albert Zomaya
Title: Resource Management in Cloud Computing Systems
Abstract: The cloud is well known for its elasticity, achieved by leveraging abundant resources. Cloud data centres easily host thousands or even millions of multicore servers. Further, these servers are increasingly virtualized for the sake of data centre efficiency. However, the reality is that these resources are often relentlessly exploited, particularly to improve application performance. Although elasticity facilitates achieving cost efficiency (or the performance-to-cost ratio), the ultimate efficiency in resource usage (or more broadly data centres) lies in scheduling and resource allocation strategies that explicitly take into account actual resource consumption. The optimization of resource efficiency in clouds is of great practical importance considering its numerous benefits for economic and environmental sustainability. In this talk, we will discuss resource efficiency in cloud data centres with examples of large-scale distributed processing applications including scientific workflows and MapReduce jobs.

Bio.
Albert Y. Zomaya is the Chair Professor of High Performance Computing & Networking and Australian Research Council Professorial Fellow in the School of Information Technologies, Sydney University. He is also the Director of the Centre for Distributed and High Performance Computing, which was established in late 2009.

Dr. Zomaya has published more than 500 scientific papers and articles and is author, co-author or editor of more than 20 books. He served as the Editor in Chief of the IEEE Transactions on Computers (2011-2014) and currently serves as Editor in Chief of Springer's Scalable Computing. He also serves as an associate editor for 22 leading journals and is the Founding Editor of the Wiley Book Series on Parallel and Distributed Computing.

Dr. Zomaya was the Chair of the IEEE Technical Committee on Parallel Processing (1999–2003) and currently serves on its executive committee. He is the Vice-Chair of the IEEE Task Force on Computational Intelligence for Cloud Computing and serves on the advisory board of the IEEE Technical Committee on Scalable Computing and the steering committee of the IEEE Technical Area in Green Computing.

Dr. Zomaya has delivered more than 150 keynote addresses, invited seminars, and media briefings and has been actively involved, in a variety of capacities, in the organization of more than 600 conferences.

Dr. Zomaya is the recipient of the IEEE Technical Committee on Parallel Processing Outstanding Service Award (2011), the IEEE Technical Committee on Scalable Computing Medal for Excellence in Scalable Computing (2011), and the IEEE Computer Society Technical Achievement Award (2014). He is a Chartered Engineer and a Fellow of AAAS, IEEE, and IET (UK). Professor Zomaya's research interests are in the areas of parallel and distributed computing and complex systems.

Prof. Yangyong Zhu
Title: Defining Data Science.
Abstract: In the age of big data, data science has become a hot occupation, supplanting traditional information science and big data engineering. This may indicate that data science has become its own branch of research. The term “data science” first appeared in CODATA Data Science Journal in 1990. So far, it has had several different interpretations. This talk aims to address what goals data science should seek to meet, and what data science itself is. We will present key connotations of data science: the first is the study of data itself. Its goal is to explore datanature and scientific issues related to datanature. The second is the study of the rules of the natural world as reflected by data, i.e., the study of the natural world performed through the study of data.

Bio.
Yangyong Zhu is a Professor of Computer Science at Fudan University, Shanghai, China, and Director of the Shanghai Key Laboratory of Data Science. He received a Ph.D. degree in Computer and Software Theory from Fudan University, China, in 1994. His research interests include data science and big data. Since 1989, he has been engaged in data research and became a pioneer in data mining research. He is also a leading exponent of data science. In 2009, he published the paper titled Data Explosion, Data Nature and Dataology and the monograph Dataology and Data Science.

Prof. Zhi-Hua Zhou
Title: Learning with Big Data by Incremental Optimization of Performance Measures
Abstract: A popular approach to achieving a strong learning system is to take the performance measure that will be used for evaluation as an optimization target, and then accomplish the learning task by an optimization procedure. Many performance measures in machine learning, however, are unfortunately non-linear, non-smooth and non-convex, leading to difficult optimization problems. With big data, the optimization becomes even more challenging because of concerns about computational, storage, and communication costs. In particular, it becomes almost impossible to collect all the data first and then perform optimization, and it is desirable to be able to optimize performance measures incrementally, without accessing the whole data. In this talk we will introduce some studies along this direction.

Bio. Zhi-Hua Zhou is a Professor at Nanjing University. His research interests are mainly in machine learning, data mining and pattern recognition. He authored the book "Ensemble Methods: Foundations and Algorithms", and has published more than 100 papers in top-tier journals and conference proceedings. According to Google Scholar, his papers have received more than 16,000 citations. He has received various awards, including the National Natural Science Award of China, the IEEE CIS Outstanding Early Career Award, the Microsoft Professorship Award, etc. He serves as the Executive Editor-in-Chief of Frontiers of Computer Science, Associate Editor-in-Chief of Science China: Information Science, and Associate Editor of the ACM TIST, IEEE TNNLS, etc. He is the founder of ACML, an Advisory Committee member and machine learning track co-chair of IJCAI 2015, Program co-chair of ICDM 2015, etc. He is an ACM Distinguished Scientist, IEEE Fellow, IAPR Fellow and CCF Fellow.