EPSRC Reference: |
EP/T014628/1 |
Title: |
Session Types for Reliable Distributed Systems (STARDUST) |
Principal Investigator: |
Gay, Professor SJ |
Other Investigators: |
|
Researcher Co-Investigators: |
|
Project Partners: |
|
Department: |
School of Computing Science |
Organisation: |
University of Glasgow |
Scheme: |
Standard Research |
Starts: |
01 October 2020 |
Ends: |
30 September 2024 |
Value (£): |
563,806
|
EPSRC Research Topic Classifications: |
Fundamentals of Computing |
Software Engineering |
|
EPSRC Industrial Sector Classifications: |
|
Related Grants: |
|
Panel History: |
|
Summary on Grant Application Form |
VISION
Distributed software systems are an essential part of the infrastructure of modern society. Such systems typically comprise diverse software components deployed across networks of hosts. Ensuring their reliability is challenging as software components must correctly communicate and synchronise with each other, and any of the hardware or software components may fail. Software failure and service outage are estimated to cost the world economy more than a trillion dollars annually.
Failures can occur at all levels of the system stack: hardware, operating system, network, software, and user. Here we focus on using advanced programming language technologies to enable the software level to better handle failures using a combination of fault prevention and fault tolerance. Specifically, we will combine the communication-structuring mechanism of session types with the scalability and fault-tolerance of actor-based software architectures.
Actor languages are based on independent processes (actors) communicating by asynchronous messages. Reliability is facilitated by actors having isolated state, and hence an actor can fail independently. Two key techniques for achieving reliability in actor languages are timeouts and supervision, and these are the main focus of STARDUST. Timeouts allow failures to be identified during execution, and failures are handled by establishing triggers and alternative behaviours within the code. An actor may supervise other actors, detecting failures and taking remedial action like restarting a failed actor.
Session types provide a way to specify and constrain the communication behaviour (protocol) between nodes in a system. A session type system excludes any non-conforming behaviour, perhaps statically (for fault prevention), dynamically, or a mixture of both (for fault tolerance). Several languages now have session-type support via libraries and tools.
By combining the strengths of actor languages and session types, we will develop a well-founded theory of reliable actor programming with clearly defined communication structures. Key aims are to deliver tools that provide lightweight support for developers, e.g. warn of potential issues, and to allow developers to continue to use established idioms. By doing so we aim to deliver a step change in the engineering of reliable distributed software systems.
|
Key Findings |
This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk
|
Potential use in non-academic contexts |
This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk
|
Impacts |
Description |
This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk |
Summary |
|
Date Materialised |
|
|
Sectors submitted by the Researcher |
This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk
|
Project URL: |
|
Further Information: |
|
Organisation Website: |
http://www.gla.ac.uk |