Academic Seminar
Title: Towards Gradient-based Bilevel Optimization in Machine Learning
Speaker: Associate Professor 张进 (Host: 陈艳男)
Department of Mathematics, Southern University of Science and Technology, and Shenzhen National Center for Applied Mathematics
Time: March 10, 14:00-15:00
Venue: Room 401, East Building, School of Mathematical Sciences
About the Speaker:
He is an Associate Professor in the Department of Mathematics at the Southern University of Science and Technology (SUSTech) and Associate Deputy Director of the Shenzhen National Center for Applied Mathematics, and a recipient of the National Science Fund for Excellent Young Scholars as well as the Guangdong and Shenzhen Excellent Young Scholar awards. He received his B.S. (2007) and M.S. (2010) from Dalian University of Technology and his Ph.D. (2014) from the University of Victoria, Canada. He worked at Hong Kong Baptist University from 2015 to 2018 and joined SUSTech in early 2019. His research focuses on optimization theory and its applications, with representative results published in influential optimization, computational mathematics, and machine learning venues, including Math Program, SIAM J Optim, Math Oper Res, SIAM J Numer Anal, J Mach Learn Res, IEEE Trans Pattern Anal Mach Intell, and the ICML and NeurIPS conferences. His work received the 2020 Youth Science and Technology Award of the Operations Research Society of China and the 2022 Guangdong Youth Science and Technology Innovation Award. He has been principal investigator on general-program grants from the National Natural Science Foundation of China, the Guangdong Natural Science Foundation, the Shenzhen Science and Technology Innovation Commission, and the Hong Kong Research Grants Council.
Abstract:
Recently, Bi-Level Optimization (BLO) techniques have received extensive attention from the machine learning community. In this talk, we will discuss some recent advances in the applications of BLO. First, we study a gradient-based bilevel optimization method for learning tasks with a convex lower-level problem. In particular, by formulating bilevel models from the optimistic viewpoint and aggregating hierarchical objective information, we establish Bi-level Descent Aggregation (BDA), a flexible and modularized algorithmic framework for bilevel programming. Second, we turn to BLO models arising in complex practical tasks, whose lower-level (follower) problems are non-convex in nature. In particular, we propose a new algorithmic framework, named Initialization Auxiliary and Pessimistic Trajectory Truncated Gradient Method (IAPTT-GM), to partially address lower-level non-convexity. By introducing an auxiliary variable as initialization to guide the optimization dynamics and designing a pessimistic trajectory-truncation operation, we construct a reliable approximation to the original BLO in the absence of the lower-level convexity hypothesis. Extensive experiments support our theoretical results and demonstrate the superiority of the proposed BDA and IAPTT-GM on different tasks, including hyperparameter optimization and meta-learning.
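To give a feel for the gradient-based bilevel setting discussed above, the following minimal Python sketch solves a toy bilevel problem. It is our own illustration, not the BDA or IAPTT-GM method from the talk: the inner loop descends an aggregation of lower- and upper-level gradients (loosely in the spirit of descent aggregation), and the outer loop descends the upper-level objective using a crude finite-difference hypergradient. All objectives, step sizes, and the aggregation schedule are assumed for demonstration only.

```python
# Toy bilevel problem (all choices below are illustrative assumptions):
#   upper level: min_x F(x, y*(x)),  F(x, y) = 0.5*(y - 1)^2 + 0.5*x^2
#   lower level: y*(x) = argmin_y f(x, y),  f(x, y) = 0.5*(y - x)^2  (convex in y)
# Since y*(x) = x exactly, the bilevel solution is x* = 0.5, y* = 0.5.

def grad_F_y(x, y):   # partial derivative of the upper-level objective in y
    return y - 1.0

def grad_f_y(x, y):   # partial derivative of the lower-level objective in y
    return y - x

def inner_solve(x, steps=50, eta=0.5):
    """Approximate y*(x) by descending an aggregation of lower- and
    upper-level gradients; the upper-level weight mu decays to zero,
    so the iterates settle near the lower-level minimizer."""
    y = 0.0
    for k in range(steps):
        mu = 1.0 / (k + 2)  # vanishing weight on the upper-level gradient
        y -= eta * ((1.0 - mu) * grad_f_y(x, y) + mu * grad_F_y(x, y))
    return y

def upper_value(x):
    y = inner_solve(x)
    return 0.5 * (y - 1.0) ** 2 + 0.5 * x ** 2

# Outer loop: gradient descent on x with a central finite-difference
# hypergradient (a simple stand-in for the implicit or unrolled
# differentiation used by practical BLO solvers).
x, delta, lr = 0.0, 1e-4, 0.2
for _ in range(100):
    hypergrad = (upper_value(x + delta) - upper_value(x - delta)) / (2 * delta)
    x -= lr * hypergrad

x_final, y_final = x, inner_solve(x)
```

Because the inner loop only approximates y*(x) in finitely many steps, the computed solution carries a small bias toward the upper-level target; this approximation gap is exactly the kind of issue that convergence analyses of aggregation-style methods address.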