Details of Grant

EPSRC Reference:

EP/V008331/1

Title:

Deep Learning for Time-Inconsistent Dynamic Optimization

Principal Investigator:

Zheng, Professor H

Other Investigators:

Researcher Co-Investigators:

Project Partners:

Department:

Mathematics

Organisation:

Imperial College London

Scheme:

Standard Research

Starts:

01 January 2021

Ends:

31 December 2023

Value (£):

464,268

EPSRC Research Topic Classifications:

Mathematical Aspects of OR

EPSRC Industrial Sector Classifications:

Information Technologies

Related Grants:

Panel History:

Panel Date	Panel Name	Outcome
31 Aug 2020	EPSRC Mathematical Sciences Prioritisation Panel September 2020	Announced

Summary on Grant Application Form

The proposed research aims to solve so-called time-inconsistent (TI) dynamic optimization problems, which address decision making in the presence of inconsistent and often conflicting human behaviour, e.g., the long-term health benefit of stopping smoking versus the instant pleasure of satisfying nicotine cravings. Solving TI dynamic optimization problems can have far-reaching impact, from consumer behaviour to social welfare policy. Decision making under the TI framework is completely different from that in standard optimization and economic theory under the rational behaviour assumption. Results for TI dynamic optimization are few and far between. The main bottleneck is computation, owing to the need to solve systems of high-dimensional nonlinear partial differential equations and forward-backward stochastic differential equations. The project will develop the fundamental theory and novel methodology to solve TI dynamic optimization problems by integrating deep reinforcement learning (DRL) from data science with advanced mathematical theories such as convex analysis and dual stochastic control. A breakthrough in solving TI dynamic optimization can have great impact in applications. One example is asset allocation: many financial institutions use a one-period mean-variance (MV) model, which is simple to use but has many drawbacks. A multi-period or continuous-time model is more realistic for stochastic asset price processes and better fits the dynamic nature of the economy, but is TI and difficult to solve. The findings of the project can help solve continuous-time MV problems, which would improve financial asset-liability management and performance and, in turn, have great impact on societal prosperity and individual well-being.
In short, progress in TI dynamic optimization and DRL computation can help industry and government agencies improve decision making and design more efficient and powerful computational software for real-world TI problems, building on the DRL solver developed in the project.

Key Findings

This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk

Potential use in non-academic contexts

This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk

Impacts

Description	This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk
Summary
Date Materialised

Sectors submitted by the Researcher

This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk

Project URL:

Further Information:

Organisation Website:

http://www.imperial.ac.uk