Details of Grant

EPSRC Reference:

EP/J012157/1

Title:

Automated Plan-Based Policy-Learning for Surveillance Problems

Principal Investigator:

Fox, Professor M

Other Investigators:

Long, Professor D

Coles, Dr AI

Researcher Co-Investigators:

Project Partners:

Department:

Informatics

Organisation:

King's College London

Scheme:

Standard Research

Starts:

01 September 2012

Ends:

29 February 2016

Value (£):

370,786

EPSRC Research Topic Classifications:

Artificial Intelligence

Robotics & Autonomy

EPSRC Industrial Sector Classifications:

Information Technologies

Related Grants:

Panel History:

Panel Date	Panel Name	Outcome
15 Dec 2011	Autonomous and Intelligent Systems Meeting	Announced
18 Aug 2011	Autonomous and Intelligent Systems Sift	Announced

Summary on Grant Application Form

Surveillance problems give rise to many challenges, including the management of uncertainty in an unpredictable environment, the management of restricted resources and the communication of commitments and requests between multiple heterogeneous agent "observers". At the heart of surveillance problems lies the need to plan complex sequences of behaviour that achieve surveillance goals. These goals are typically expressed in terms of gathering as much information as possible given constraints, and communicating findings to a human operator. Planning is combinatorially hard, and planning problems involving metric resources, continuous time and concurrency, as would be required in the solution of non-trivial surveillance problems, are time-consuming to solve. This complexity is greatly exacerbated if uncertainty is captured explicitly within the planning domain models. Although online planning, and plan repair in the case of failure, are feasible in stable situations, they take too long in situations that are changing rapidly. Online planning also requires significant on-board computational resources, which are often not available in surveillance vehicles. Planning under uncertainty cannot therefore be done online in situations typical of many surveillance problems, where computational resources are limited and rapid responses are frequently required. On the other hand, forward planning is certainly required in order to avoid the observers behaving in a purely reactive (and therefore easily distracted) manner.

Since online planning, and planning under uncertainty, are both unrealistic for large-scale, fast-moving surveillance problems, we propose an alternative approach based on plan-based policy-learning. We assume that time and resources are available offline to train effective policies. Our approach is based on Monte Carlo sampling: we sample many instances of the stochastic problem, each instance being a challenging temporal and metric planning problem. We solve each instance using a high-performing planner and then apply a classifier to learn a policy as a mapping from states to actions, using the set of solutions as input. We have already demonstrated the effectiveness of this approach in two single-agent cases: management of the loading of multiple batteries, and the control of an autonomous underwater vehicle following the edge of a patch (distinguished by high chlorophyll or high temperature readings) in the coastal waters of Monterey Bay. We know from our work in both cases that the resulting policies can be very high-performing in terms of robustness to the high degree of uncertainty that often occurs in the physical execution environment. We are now proposing to scale up the approach we took in the batteries and patch-following cases to the multi-agent coordination problem, addressing the challenges that arise when many agents are coordinating in solving a surveillance problem that requires the integration of multiple policies.
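
The sample, solve and classify pipeline described above can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the project: the one-dimensional corridor domain, the names sample_instance, toy_planner, learn_policy and act, and the use of a majority vote over planner decisions in place of a real classifier. The toy planner simply walks towards a target, standing in for a temporal and metric planner.

```python
import random
from collections import Counter, defaultdict

GRID = 6  # hypothetical corridor length, chosen only for this toy example


def sample_instance(rng):
    """Monte Carlo step: draw one deterministic instance (start, target)."""
    return rng.randrange(GRID), rng.randrange(GRID)


def toy_planner(start, target):
    """Stand-in for a real planner: returns the plan as (state, action) pairs."""
    steps, pos = [], start
    while pos != target:
        action = "right" if target > pos else "left"
        steps.append(((pos, target), action))
        pos += 1 if action == "right" else -1
    steps.append(((pos, target), "observe"))  # goal reached: take the observation
    return steps


def learn_policy(n_samples=500, seed=0):
    """Solve many sampled instances offline and learn a state -> action mapping."""
    rng = random.Random(seed)
    votes = defaultdict(Counter)
    for _ in range(n_samples):
        for state, action in toy_planner(*sample_instance(rng)):
            votes[state][action] += 1
    # Majority vote per state: a deliberately simple substitute for a classifier.
    return {state: counts.most_common(1)[0][0] for state, counts in votes.items()}


def act(policy, state):
    """Online step: a cheap lookup, falling back to the nearest trained state."""
    if state in policy:
        return policy[state]
    nearest = min(policy, key=lambda s: abs(s[0] - state[0]) + abs(s[1] - state[1]))
    return policy[nearest]


if __name__ == "__main__":
    policy = learn_policy()
    print(act(policy, (0, 3)), act(policy, (5, 2)), act(policy, (4, 4)))
```

The point of the sketch is the division of labour: the expensive planning happens offline inside learn_policy, while act needs only a table lookup at execution time, which is the property that makes the approach attractive when on-board computation is limited.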

Key Findings

This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk

Potential use in non-academic contexts

This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk

Impacts

Description	This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk
Summary
Date Materialised

Sectors submitted by the Researcher

This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk

Project URL:

Further Information:

Organisation Website: