EPSRC Reference: |
EP/V008331/1 |
Title: |
Deep Learning for Time-Inconsistent Dynamic Optimization |
Principal Investigator: |
Zheng, Professor H |
Other Investigators: |
|
Researcher Co-Investigators: |
|
Project Partners: |
|
Department: |
Mathematics |
Organisation: |
Imperial College London |
Scheme: |
Standard Research |
Starts: |
01 January 2021 |
Ends: |
31 December 2023 |
Value (£): |
464,268
|
EPSRC Research Topic Classifications: |
Mathematical Aspects of OR |
|
|
EPSRC Industrial Sector Classifications: |
|
Related Grants: |
|
Panel History: |
|
Summary on Grant Application Form |
The proposed research aims to solve so-called time-inconsistent (TI) dynamic optimization problems, which address decision making in the presence of inconsistent and often conflicting human behaviour, e.g., the long-term health benefit of stopping smoking versus the instant pleasure of satisfying a nicotine craving. Solving TI dynamic optimization problems can have far-reaching impact, from consumer behaviour to social welfare policy. Decision making under the TI framework is completely different from that in standard optimization and economic theory, which assume rational behaviour. Results for TI dynamic optimization are few and far between. The main bottleneck is computation, because one must solve systems of high-dimensional nonlinear partial differential equations and forward-backward stochastic differential equations. The project will develop fundamental theory and novel methodology for solving TI dynamic optimization problems by integrating deep reinforcement learning (DRL) from data science with advanced mathematical theories such as convex analysis and dual stochastic control. A breakthrough in solving TI dynamic optimization could have great impact in applications. One example is asset allocation: many financial institutions use a one-period mean-variance (MV) model, which is simple to use but has many drawbacks. A multi-period or continuous-time model is more realistic for stochastic asset price processes and better fits the dynamic nature of the economy, but it is TI and difficult to solve. The findings of the project can help solve continuous-time MV problems, which would improve financial asset-liability management and performance and in turn have great impact on societal prosperity and individual well-being.
In short, progress in TI dynamic optimization and DRL computation can greatly help industry and government agencies to improve decision making and design more efficient and powerful computational software for real-world TI problems, based on the DRL solver developed in the project.
|
Key Findings |
This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk
|
Potential use in non-academic contexts |
This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk
|
Impacts |
Description |
This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk |
Summary |
|
Date Materialised |
|
|
Sectors submitted by the Researcher |
This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk
|
Project URL: |
|
Further Information: |
|
Organisation Website: |
http://www.imperial.ac.uk |