This project will develop a fundamentally different approach to visual perception and autonomy, in which the concept of an image itself is replaced with a stream of independently firing pixels, similar to the unsynchronised biological cells of the retina. Recent advances in computer vision and machine learning have enabled robots that can perceive, understand, and intelligently interact with their environments. However, this "interpretive" behaviour is just one of the fundamental models of autonomy found in nature. The techniques developed in this project will exploit recent breakthroughs in instantaneous, non-image-based visual sensing to enable entirely new types of autonomous system. The corresponding step change in robotic capabilities will impact the manufacturing, space, autonomous-vehicle and medical sectors.
If we perceive an object approaching at high speed, we instinctively try to avoid it without taking the time to interpret the scene. It is not important to understand what the object is or why it is approaching us. This "reflexive" behavioural model is vital for reacting to time-critical events; in such cases, the situation has often already been resolved by the time we become consciously aware of it. Reflexive behaviour is also a vital component of continuous control problems. We are reluctant to take our eyes off the road while driving, because we know that we will rapidly begin to veer off course without a constant cycle of perception and correction. We also find it far easier to pick up and manipulate objects while looking at them, rather than relying entirely on tactile sensing.
Unfortunately, conventional visual sensing hardware requires enormous bandwidth: a megapixel camera produces millions of bytes per frame. As a result, the temporal sampling rate is low, reaction times are long, and reflexive adjustments based on visual data become impractical.
Thanks to recent advances in visual sensor technology, we finally have the opportunity to overturn the paradigm that vision is impractical for low-latency problems, and to facilitate a step change in robotic capabilities. Asynchronous visual sensors (also known as event cameras) eschew regular, sensor-wide updates (i.e. images). Instead, every pixel independently and asynchronously transmits a packet of information as soon as it detects an intensity change relative to its previous transmission. This drastically reduces data bandwidth by avoiding the redundant transmission of unchanged pixels. More importantly, because these packets are transmitted immediately, the sensor typically reduces the latency between an event occurring and it being perceived by three orders of magnitude (from roughly 30 ms to 30 µs).
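To make the data model concrete, the sketch below illustrates, in simplified Python, the per-pixel behaviour described above: a pixel transmits a small packet (its coordinates, a timestamp, and the sign of the change) only when its intensity has changed sufficiently since its last transmission. The field names, threshold value, and use of log-intensity are illustrative assumptions rather than a specification of any particular sensor.

```python
# Illustrative sketch only: a minimal model of the per-pixel, change-driven
# transmission described above. Names and the threshold are assumptions.
from dataclasses import dataclass

@dataclass
class Event:
    x: int          # pixel column
    y: int          # pixel row
    t_us: int       # timestamp in microseconds, assigned per pixel, not per frame
    polarity: int   # +1 if intensity increased, -1 if it decreased

def pixel_update(x, y, t_us, log_intensity, last_logged, threshold=0.2):
    """Emit an event only when this pixel's log-intensity has changed by more
    than `threshold` since the pixel's last transmission; otherwise send nothing."""
    if (x, y) not in last_logged:             # first observation: just record it
        last_logged[(x, y)] = log_intensity
        return None
    delta = log_intensity - last_logged[(x, y)]
    if abs(delta) < threshold:
        return None                           # unchanged pixel -> no data sent
    last_logged[(x, y)] = log_intensity       # remember the level just reported
    return Event(x, y, t_us, +1 if delta > 0 else -1)
```

Because only changed pixels produce packets, a largely static scene generates almost no data, while a fast-moving edge is reported within microseconds of the change.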
This advancement in visual sensing is dramatic, but it demands a commensurate revolution in robotic perception research. Without the concepts of the image or of synchronous sampling, decades of computer vision and machine learning research are rendered unusable with these sensors. This project will provide the theoretical foundations for that revolution by developing novel asynchronous paradigms for both perception and understanding. Mirroring biological systems, this will comprise a hierarchical perception framework encompassing both low-level reflexes and high-level understanding, in a manner reminiscent of modern deep learning. Unlike deep learning, however, pixel-update events will occur asynchronously and propagate independently through the system, thereby maintaining extremely low latency.
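As a purely conceptual illustration of this hierarchical, event-driven organisation (not a description of the framework to be developed), the sketch below shows how individual pixel events might propagate: a low-latency reflexive layer reacts to each event immediately and forwards it to a slower interpretive layer, so that no stage ever waits for a complete frame. All class and method names here are hypothetical.

```python
# Conceptual sketch only: one possible shape of a hierarchical, event-driven
# perception pipeline. Events flow upward one at a time, as they arrive.
class ReflexLayer:
    """Low-level layer: reacts to each individual event with minimal latency."""
    def __init__(self, downstream=None):
        self.downstream = downstream

    def on_event(self, event):
        action = self.fast_reaction(event)    # e.g. an immediate avoidance reflex
        if self.downstream is not None:
            self.downstream.on_event(event)   # propagate the same event upwards
        return action

    def fast_reaction(self, event):
        return None   # placeholder: a reflexive control law would go here

class UnderstandingLayer:
    """High-level layer: accumulates events into a slower, richer interpretation."""
    def on_event(self, event):
        pass          # placeholder: incremental scene understanding would go here

# The reflex layer never waits for a frame or for the slower interpretive layer.
pipeline = ReflexLayer(downstream=UnderstandingLayer())
```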
The sensor technology is still in its early trial phase, and few researchers are exploring its implications for perception; no group, nationally or internationally, is currently making a concerted effort in this area. Hence, this project will not only lay the groundwork for a plethora of new biologically inspired "reflexive robotics" applications, but also support the development of a unique new research team, placing the UK at the forefront of this exciting field.