EPSRC Reference: |
EP/L021749/1 |
Title: |
Sublinear Algorithms for Approximating Probability Distributions |
Principal Investigator: |
Diakonikolas, Dr I |
Other Investigators: |
|
Researcher Co-Investigators: |
|
Project Partners: |
|
Department: |
Sch of Informatics |
Organisation: |
University of Edinburgh |
Scheme: |
First Grant - Revised 2009 |
Starts: |
01 September 2014 |
Ends: |
31 August 2015 |
Value (£): |
98,776
|
EPSRC Research Topic Classifications: |
Fundamentals of Computing |
|
|
EPSRC Industrial Sector Classifications: |
|
Related Grants: |
|
Panel History: |
Panel Date | Panel Name | Outcome |
04 Feb 2014 | EPSRC ICT Responsive Mode - Feb 2014 | Announced |
|
Summary on Grant Application Form |
The goal of this proposal is to advance a research program of developing
sublinear-time algorithms for estimating a wide range of natural and
important classes of probability distributions.
We live in an era of "big data," where the amount of data that can be brought to bear
on questions of biology, climate, economics, etc., is vast and expanding rapidly.
Much of this raw data consists of example points without corresponding labels.
The challenge of how to make sense of this unlabeled data has immediate relevance
and has rapidly become a bottleneck in scientific understanding across many disciplines.
An important class of big data is most naturally modeled as samples from a probability
distribution over a very large domain. The challenge of big data is that the sizes
of the domains of the distributions are immense, typically resulting in unacceptably
slow algorithms. Scaling up a computational framework to comfortably deal with
ever-larger data presents a series of challenges in algorithms.
This prompts the basic question: Given samples from some unknown distribution, what can we infer?
While this question has been studied for several decades by several research communities,
both the number of samples and running time required for such estimation tasks
are not yet well understood, even for some surprisingly simple types of discrete distributions.
The proposed research focuses on sublinear-time algorithms, that is,
algorithms whose running time is significantly smaller than the size of the domain of the underlying distribution.
In this project we will develop sublinear-time algorithms for estimating various classes
of discrete distributions over very large domains.
Specific problems we will address include:
(1) Developing sublinear algorithms to estimate probability distributions that satisfy various
natural types of "shape restrictions" on the underlying probability density function.
(2) Developing sublinear algorithms for estimating complex distributions that result
from the aggregation of many independent simple sources of randomness.
We believe that highly efficient algorithms for these estimation tasks
may play an important role in the next generation of large-scale machine learning applications.
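As an illustration of problem (1), the following is a minimal sketch of one classical way to estimate a distribution under a shape restriction, here a non-increasing probability mass function over {0, ..., n-1}. The summary does not name a specific technique, so the choice of method (grouping the domain into geometrically growing intervals, an idea usually attributed to Birgé) is an assumption made for illustration, and the function names `geometric_intervals`, `learn_flattened` and `density`, as well as the parameter `eps`, are hypothetical. The point of the sketch is that the work done depends on the number of samples and on roughly log(n)/eps intervals, not on the domain size n itself.

```python
import bisect
import random
from collections import Counter


def geometric_intervals(n, eps):
    """Split {0, ..., n-1} into consecutive intervals whose lengths grow
    roughly like (1 + eps)^j. Interval j is [bounds[j], bounds[j + 1])."""
    bounds = [0]
    length = 1.0
    while bounds[-1] < n:
        step = max(1, int(length))
        bounds.append(min(n, bounds[-1] + step))
        length *= 1.0 + eps
    return bounds


def learn_flattened(samples, n, eps):
    """Estimate the probability mass of each interval from the samples.

    Returns a list of (start, end, mass) triples. The hypothesis spreads
    each interval's mass uniformly over that interval.
    """
    bounds = geometric_intervals(n, eps)
    # Locate the interval of each sample by binary search over the bounds.
    counts = Counter(bisect.bisect_right(bounds, x) - 1 for x in samples)
    m = len(samples)
    return [
        (bounds[j], bounds[j + 1], counts.get(j, 0) / m)
        for j in range(len(bounds) - 1)
    ]


def density(hypothesis, x):
    """Probability that the flattened hypothesis assigns to the point x."""
    starts = [start for start, _, _ in hypothesis]
    j = bisect.bisect_right(starts, x) - 1
    start, end, mass = hypothesis[j]
    return mass / (end - start)


if __name__ == "__main__":
    # Hypothetical example: a non-increasing distribution over a domain of
    # size one million, estimated from only a few thousand samples.
    rng = random.Random(0)
    n = 1_000_000
    samples = [min(n - 1, int(rng.expovariate(1 / 1000))) for _ in range(5000)]
    hypothesis = learn_flattened(samples, n, eps=0.1)
    print(len(hypothesis), "intervals;", "estimated p(0) =", density(hypothesis, 0))
```

The sketch makes no claim about how many samples are needed for a given accuracy; it only shows the structure of an estimator whose cost does not scale with the domain size.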
|
Key Findings |
This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk
|
Potential use in non-academic contexts |
This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk
|
Impacts |
Description |
This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk |
Summary |
|
Date Materialised |
|
|
Sectors submitted by the Researcher |
This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk
|
Project URL: |
|
Further Information: |
|
Organisation Website: |
http://www.ed.ac.uk |