Academic Seminar
Title: Randomized Algorithms with Sparse Kernel Weights for Large-scale Multiple Kernel Dimensionality Reduction and Clustering
Speaker: Prof. Gang Wu (Host: Wen Li)
China University of Mining and Technology
Time: June 26, 09:00-10:00
Venue: Lecture Hall 111, West Building, School of Mathematical Sciences
About the Speaker:
Gang Wu, Ph.D., is a professor and doctoral supervisor at the School of Mathematics, China University of Mining and Technology. He is a young and middle-aged science and technology leader of the Jiangsu Province "333 Project", a young and middle-aged academic leader of the Jiangsu Province "Qing Lan Project", and vice president of the Jiangsu Computational Mathematics Society. His main research interests are numerical algebra, machine learning and data mining, and fast algorithms in big data and artificial intelligence. He has been the principal investigator of four National Natural Science Foundation of China projects, three provincial natural science foundation projects, and one municipal-level key R&D program, and has published many papers in well-known international journals such as SIAM Journal on Numerical Analysis, SIAM Journal on Matrix Analysis and Applications, SIAM Journal on Scientific Computing, IMA Journal of Numerical Analysis, IEEE Transactions on Knowledge and Data Engineering, Pattern Recognition, Machine Learning, and ACM Transactions on Information Systems.
Abstract:
The kernel method is a popular nonlinear machine learning technique. The most suitable kernel function for a particular learning task is often unknown in advance, and choosing or constructing a proper kernel function is very challenging. Multiple kernel learning is a commonly used technique for dealing with this problem. However, to the best of our knowledge, most existing multiple kernel learning methods need to form and store the base and ensemble kernel matrices explicitly, which may incur huge computation and storage costs, especially for large-sample problems. The main contributions of this work are as follows. First, we propose randomized multiple kernel learning methods in which there is no need to form or store the kernel matrices explicitly. Second, to further reduce data transfer and redundant information, we introduce relatively sparse kernel weight coefficients by evaluating the correlation between the base kernel matrices; based on this strategy, we propose a new approach for selecting representative kernels in multiple kernel learning. Third, we apply the proposed strategies to two representative multiple kernel learning tasks, namely multiple kernel dimensionality reduction and multiple kernel clustering; the strategies are also applicable to other multiple kernel learning approaches. Comprehensive numerical experiments on benchmark data sets demonstrate the numerical behavior of the proposed methods on supervised, unsupervised, and semi-supervised multiple kernel learning problems. The results show that our methods are often superior to state-of-the-art multiple kernel learning methods for dimensionality reduction and clustering.
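The abstract names two computational ingredients: avoiding explicit formation of the kernel matrices via randomization, and sparsifying the kernel weights by measuring correlation between base kernels. The following Python snippet is a minimal illustrative sketch of these two ideas, not the speaker's actual algorithm: it assumes RBF base kernels approximated by random Fourier features (one standard randomized technique), so that no n-by-n kernel matrix is ever formed, and uses centered kernel alignment as one possible correlation measure for pruning redundant base kernels. The function names, the 0.9 alignment threshold, and the toy data are all assumptions made for illustration.

import numpy as np

rng = np.random.default_rng(0)

def rff_features(X, gamma, n_features, rng):
    # Random Fourier features approximating the RBF kernel
    # k(x, y) = exp(-gamma * ||x - y||^2), so K ~= Z @ Z.T with
    # Z of size n-by-n_features; the n-by-n matrix K is never formed.
    d = X.shape[1]
    W = rng.normal(scale=np.sqrt(2.0 * gamma), size=(d, n_features))
    b = rng.uniform(0.0, 2.0 * np.pi, size=n_features)
    return np.sqrt(2.0 / n_features) * np.cos(X @ W + b)

def centered_alignment(Z1, Z2):
    # Centered kernel alignment (a correlation between two base kernels),
    # computed from the feature maps alone: when Ki = Zi @ Zi.T,
    # <K1c, K2c>_F = ||Z1c.T @ Z2c||_F^2 and ||Kic||_F = ||Zic.T @ Zic||_F.
    Z1c = Z1 - Z1.mean(axis=0)
    Z2c = Z2 - Z2.mean(axis=0)
    cross = np.linalg.norm(Z1c.T @ Z2c, "fro") ** 2
    return cross / (np.linalg.norm(Z1c.T @ Z1c, "fro")
                    * np.linalg.norm(Z2c.T @ Z2c, "fro"))

# Toy usage: four candidate RBF base kernels on synthetic data.
X = rng.normal(size=(500, 10))
gammas = [0.01, 0.1, 1.0, 10.0]
Zs = [rff_features(X, g, 200, rng) for g in gammas]

# Greedy selection: a kernel highly aligned with an already selected one
# is treated as redundant and gets weight zero, yielding sparse weights.
selected, weights = [], np.zeros(len(Zs))
for i, Zi in enumerate(Zs):
    if all(centered_alignment(Zi, Zs[j]) < 0.9 for j in selected):
        selected.append(i)
weights[selected] = 1.0 / len(selected)
print("selected base kernels:", selected, "weights:", weights)

With n samples and D random features per kernel, each alignment in this sketch costs O(nD^2), whereas forming and comparing the full kernel matrices would cost O(n^2) time and storage per kernel; this is the kind of saving the talk targets for large-sample problems.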