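Academic Seminar
Title: Randomized Optimal Stopping Problem in Continuous Time and Reinforcement Learning Algorithm
Speaker: Associate Professor Dong Yuchao (Host: Yang Zhou)
Tongji University
Time: June 20, 10:00-11:00
Venue: Conference Room, West Building, School of Mathematical Sciences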
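About the speaker:
Dr. Dong Yuchao received a Ph.D. from the School of Mathematical Sciences at Fudan University, followed by postdoctoral research at Fudan University, the University of Angers (France), and the National University of Singapore. Dr. Dong joined the School of Mathematical Sciences at Tongji University in January 2021. Dr. Dong's research focuses on stochastic optimal control theory and its applications in mathematical finance, with work published in internationally recognized journals including AMO, SICON, SIAP, and MaFi.
Abstract: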
In this paper, we study the optimal stopping problem in the so-called exploratory framework, in which the agent takes actions randomly, conditional on the current state, and an entropy-regularization term is added to the reward functional. This transformation reduces the optimal stopping problem to a standard optimal control problem. For the American put option model, we derive the related HJB equation and prove its solvability. Furthermore, we give a convergence rate for policy iteration and compare our solution with that of the classical American put option problem. Our results indicate a trade-off between convergence rate and bias in the choice of the temperature constant. Based on the theoretical analysis, a reinforcement learning algorithm is designed, and numerical results are presented for several models.