The central goal of data stream algorithms is to process massive streams of data using sublinear storage space. Motivated by work in the database community on outsourcing database and data stream processing, we ask whether the space usage of such algorithms can be further reduced by enlisting a more powerful "helper" who can annotate the stream as it is read. We do not wish to blindly trust the helper, so we require that the algorithm be convinced of having computed a correct answer. We show upper bounds that achieve a non-trivial tradeoff between the amount of annotation used and the space required to verify it. We also prove lower bounds on such tradeoffs, often nearly matching the upper bounds, via notions related to Merlin-Arthur communication complexity. Our results cover the classic data stream problems of selection and frequency moments, as well as fundamental graph problems such as triangle-freeness and connectivity. Our work is also part of a growing trend -- including recent studies of multi-pass streaming, read/write streams, and randomly ordered streams -- of asking more complexity-theoretic questions about data stream processing. It is a recognition that, in addition to practical relevance, the data stream model raises many interesting theoretical questions in its own right.
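As a toy illustration of the model (not one of the protocols from this paper), consider verifying the maximum of a stream: the helper annotates the stream with a single claimed maximum, and the verifier, using only enough space to hold one item and one flag, rejects unless every stream item is at most the claim and the claim actually appears. The Python sketch below, with the hypothetical function name verify_max, is only meant to convey the flavor of annotation plus cheap verification.

    # Toy illustration of annotated stream verification (not a protocol
    # from the paper): the untrusted helper supplies a claimed maximum
    # up front; the verifier streams over the data once, checking that
    # no item exceeds the claim and that the claim is actually attained.
    def verify_max(claimed_max, stream):
        seen_claimed = False
        for x in stream:
            if x > claimed_max:
                return None              # helper lied: reject
            if x == claimed_max:
                seen_claimed = True
        return claimed_max if seen_claimed else None

    # An honest helper annotates with 9; a dishonest one with 7.
    assert verify_max(9, [3, 9, 1, 4]) == 9
    assert verify_max(7, [3, 9, 1, 4]) is None

Here the annotation is a single value and the verifier's space is constant (up to the bits needed to store one item), whereas the paper's protocols trade larger amounts of annotation against sublinear verification space for harder problems.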