Details of Grant

EPSRC Reference:

EP/T000783/1

Title:

MIMIc: Multimodal Imitation Learning in MultI-Agent Environments

Principal Investigator:

De Silva, Dr VD

Other Investigators:

Researcher Co-Investigators:

Project Partners:

Chelsea Football Club Academy

Department:

Loughborough University in London

Organisation:

Loughborough University

Scheme:

New Investigator Award

Starts:

02 December 2019

Ends:

01 December 2021

Value (£):

258,875

EPSRC Research Topic Classifications:

Artificial Intelligence

EPSRC Industrial Sector Classifications:

Sports and Recreation

Related Grants:

Panel History:

Panel Date	Panel Name	Outcome
02 May 2019	EPSRC ICT Prioritisation Panel May 2019	Announced

Summary on Grant Application Form

In the UK, we are not allowed to drive a vehicle until we are 17. This is because driving is a complex, safety-critical activity that requires many advanced cognitive skills, such as recognising possible threats, anticipating the behaviour of other road users and reacting quickly to emerging situations. Think about a football player making decisions on the field. A good player can sense opportunities by anticipating what other players will do, and select an action that will increase the odds of scoring. It takes humans a long time to develop these advanced cognitive skills and to become expert at such complex real-world tasks. Artificial Intelligence (AI) has made significant progress during the last decade, demonstrated by breakthroughs in cancer detection, computers beating 'Go' masters and intelligent robotics. However, if AI is to live up to its science-fiction promise of assisting humanity, or even superseding human intelligence, it should at least be equipped with cognitive skills such as those possessed by humans. This project aims to develop groundbreaking algorithms that equip autonomous systems with the human-like cognitive skills required to thrive in real-world environments.

We focus on applications that require an autonomous agent (e.g. a robot or a driverless car) to interact with multiple intelligent agents in its environment to accomplish a task; such settings are known as Multi-Agent Environments (MAEs). These applications require an agent to anticipate the behaviour of other agents and to select the most appropriate course of action. Equipping agents with this autonomous decision-making capability is known as policy learning. Compared to policy learning in single-agent domains (teaching a robot to walk or a computer to play a video game), recent progress in policy learning in MAEs has been quite modest. There are several reasons for this: 1) the environment is dynamic because of the agents' own actions; 2) multi-agent policy learning suffers from a theoretical limitation known as the curse of dimensionality (CoD); 3) utility functions that capture agent objectives are difficult to define; and 4) there is a significant lack of adequate multi-agent datasets that allow meaningful research. This project proposes to undertake research into policy learning in MAEs by addressing the above limitations.
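As a rough illustration of the curse of dimensionality mentioned above, the sketch below counts the joint actions available when every agent independently picks from the same discrete action set. The function name and the example figures (5 actions per agent, up to 22 agents) are assumptions chosen for illustration; they are not taken from the project.

```python
def joint_action_space_size(actions_per_agent: int, num_agents: int) -> int:
    """Count joint actions when each agent independently chooses one of
    `actions_per_agent` discrete actions (an illustrative assumption)."""
    if actions_per_agent < 1 or num_agents < 0:
        raise ValueError("need at least one action and a non-negative agent count")
    # Each additional agent multiplies the number of joint actions.
    return actions_per_agent ** num_agents


if __name__ == "__main__":
    # Hypothetical setting: 5 actions per agent, growing numbers of agents.
    for n in (1, 2, 11, 22):
        print(n, joint_action_space_size(5, n))
```

Under these assumed numbers, the count grows from 5 for a single agent to roughly 2.4e15 for 22 agents, which is why a policy defined over the full joint space quickly becomes impractical to learn.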

Our unique approach to policy learning in MAEs is motivated by how humans thrive in similar settings. Firstly, we perceive the world through multiple senses (i.e. vision, audition, touch), which gives us a rich perception of the world. Secondly, when acting in an MAE, humans do not pay attention to all stimuli but only to key stimuli; e.g. when a football player is attacking the ball, the player pays attention only to the teammates capable of effecting a goal and to the key defenders. Finally, the learning paradigm we employ, known as imitation learning, is an emerging methodology for learning by observing experts, which is a productive approach that humans use to learn new skills. Accordingly, we propose to learn realistic policies in MAEs through imitation learning by leveraging multimodal data fusion and selective-attention modelling. Multimodal data fusion allows the high-dimensional context of the real world to be captured, and selective-attention modelling helps to allay the CoD issue. An elite football club is facilitating this ambitious research project by providing a unique multimodal multi-agent dataset and access to state-of-the-art data-capture facilities.
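The selective-attention idea described above can be sketched as follows. This is a minimal illustration, not the project's model: the relevance scores are assumed to be given (in practice they would be learned), and the names `softmax` and `attend` and the top-k rule are hypothetical choices. The point it shows is that the agent's context vector keeps a fixed size however many other agents are present.

```python
import math
from typing import List, Sequence


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a non-empty list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(relevance: Sequence[float],
           features: Sequence[Sequence[float]],
           top_k: int) -> List[float]:
    """Keep the `top_k` most relevant agents and return the
    softmax-weighted sum of their feature vectors."""
    if len(relevance) != len(features):
        raise ValueError("one relevance score is needed per agent")
    if not 1 <= top_k <= len(relevance):
        raise ValueError("top_k must be between 1 and the number of agents")
    # Indices of the agents with the highest relevance scores.
    keep = sorted(range(len(relevance)),
                  key=lambda i: relevance[i], reverse=True)[:top_k]
    weights = softmax([relevance[i] for i in keep])
    dim = len(features[keep[0]])
    # Weighted sum over the kept agents, one feature dimension at a time.
    return [sum(w * features[i][d] for w, i in zip(weights, keep))
            for d in range(dim)]
```

For example, `attend([2.0, 0.1, 1.5], feats, top_k=2)` would ignore the second agent entirely and blend the features of the first and third, so a downstream imitation policy sees an input of the same dimension whether there are 3 or 21 other agents.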

The project outputs will be subjectively validated as a tool for answering "what-if" questions related to game play in football, assisting coaching staff to visualize speculative game strategies, and as a computational benchmark for quantifying the cognitive skills of football players. The planned impact activities will ensure that the project leaves a legacy in AI development, benefiting UK PLC through significant contributions in multiple high-growth areas such as driverless vehicles, video gaming and assistive robots.

Key Findings

This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk

Potential use in non-academic contexts

This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk

Impacts

Description	This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk
Summary
Date Materialised

Sectors submitted by the Researcher

This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk

Project URL:

Further Information:

Organisation Website:

http://www.lboro.ac.uk