Electronic Colloquium on Computational Complexity

Under the auspices of the Computational Complexity Foundation (CCF)


Guang Yang

Cryptography and Randomness Extraction in the Multi-Stream Model


Tsinghua University, Beijing, 2015


We initiate the study of cryptography and randomness extraction over Very Big Data Objects, developing the area from the ground up. These big objects are handled in the multi-stream model, for which we study both constructions and limitations: we propose new views and concepts, devise new constructions and methods, introduce new lower-bound techniques, and perform extensive experimental evaluations.

The multi-stream model captures the essence of computation over external storage with severely restricted local memory, where the machine makes only a few sequential passes over the storage. Typically, we consider $O(\log n)$ memory and a constant number of read/write streams, where each stream is a big tape that can be scanned sequentially from left to right only a very small number of times, i.e., a constant number of passes or slightly more.
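To make the model concrete, here is a minimal mock-up (all class and function names are ours, purely illustrative, not from the thesis): a stream is a tape scanned left to right, each rewind starts a new counted pass, and a simple computation such as XORing two streams into a third uses $O(1)$ local memory and one pass per stream.

```python
class Stream:
    """A read/write stream: a big tape scanned sequentially left to right.
    Each rewind() starts a new pass; passes are counted so that a
    multi-stream algorithm can be audited for its pass complexity."""

    def __init__(self, data=None):
        self.tape = list(data) if data is not None else []
        self.pos = 0
        self.passes = 0

    def rewind(self):
        self.pos = 0
        self.passes += 1

    def read(self):
        if self.pos >= len(self.tape):
            return None  # end of stream
        sym = self.tape[self.pos]
        self.pos += 1
        return sym

    def write(self, sym):
        self.tape.append(sym)  # streams are written left to right


def xor_streams(a, b, out):
    """XOR two input streams into an output stream using O(1) local
    memory (just the two current symbols) and a single pass on each."""
    a.rewind(); b.rewind(); out.rewind()
    while True:
        x, y = a.read(), b.read()
        if x is None or y is None:
            break
        out.write(x ^ y)
```

Running `xor_streams` on two 4-bit streams touches each tape exactly once, matching the model's "constant many passes" regime.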

For cryptography over Very Big Data Objects, our study focuses on the theoretical foundations. Based on mild intractability assumptions, we construct one-way functions (OWFs) and pseudorandom generators (PRGs) computable in a constant number of passes. From the Learning With Errors (LWE) assumption, we devise a Public-Key Encryption (PKE) scheme in which both encryption and decryption use a constant number of passes. We complement these constructions with two negative results: the impossibility of constant-pass super-linear-stretch pseudorandom generators, and a linear lower bound on the length of private keys in PKE. Both are proved using a new lower-bound technique that we develop.
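To illustrate the LWE-to-PKE path in its simplest form, here is a toy single-bit scheme in the style of Regev's textbook encryption. This is only a hedged sketch: the function names and parameter choices are ours, the parameters are far too small for security, and this is emphatically not the thesis's streaming construction.

```python
import random

def keygen(n=4, m=16, q=257):
    """Secret key: s in Z_q^n.  Public key: (A, b = A*s + e mod q), with
    each error entry in {-1, 0, 1} so the total noise stays far below q/4.
    Toy parameters, chosen by us for illustration only."""
    s = [random.randrange(q) for _ in range(n)]
    A = [[random.randrange(q) for _ in range(n)] for _ in range(m)]
    b = [(sum(A[i][j] * s[j] for j in range(n)) + random.choice((-1, 0, 1))) % q
         for i in range(m)]
    return (A, b, q), s

def encrypt(pk, bit):
    """Sum a random subset of the public-key rows; embed the bit as 0 or q/2."""
    A, b, q = pk
    subset = [i for i in range(len(A)) if random.random() < 0.5]
    c1 = [sum(A[i][j] for i in subset) % q for j in range(len(A[0]))]
    c2 = (sum(b[i] for i in subset) + bit * (q // 2)) % q
    return c1, c2

def decrypt(sk, ct, q=257):
    """c2 - <c1, s> equals (noise) or (q/2 + noise); round to the nearer."""
    c1, c2 = ct
    v = (c2 - sum(c * s for c, s in zip(c1, sk))) % q
    return 1 if q // 4 < v < 3 * q // 4 else 0
```

With at most 16 error terms of magnitude 1, the accumulated noise never exceeds 16 < q/4 = 64, so decryption is correct for every ciphertext at these toy parameters.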

For randomness extraction over Very Big Data Objects, we obtain a streaming randomness extractor that makes $O(\log\log\frac{n}{\varepsilon})$ passes, only slightly more than constant. Interestingly, we show that this bound is tight for every randomness extractor in our streaming model. Beyond the theoretical developments, we implement and empirically validate our extractor as a highly efficient executable program and use it to extract randomness from Big Data, reflecting a novel conceptual view of real-world data as a big source of entropy. Both in theory and in practice, this is the first method able to extract high-quality random bits from big objects.
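As background for what a seeded extractor computes, here is a classic construction via 2-universal Toeplitz hashing, whose output is close to uniform by the leftover hash lemma when the source has enough min-entropy. This is a standard textbook extractor shown only for illustration; it is not the thesis's Random Re-Bucketing (RRB) construction, and the function name is ours.

```python
def toeplitz_extract(x, seed, m):
    """Multiply the n-bit source x by the m-by-n Toeplitz 0/1 matrix
    T[i][j] = seed[i - j + n - 1] over GF(2); seed must have n + m - 1
    bits.  Toeplitz matrices form a 2-universal family, so by the
    leftover hash lemma the m output bits are close to uniform when x
    has sufficient min-entropy."""
    n = len(x)
    assert len(seed) == n + m - 1
    out = []
    for i in range(m):
        bit = 0
        for j in range(n):
            bit ^= seed[i - j + n - 1] & x[j]  # inner product mod 2
        out.append(bit)
    return out
```

Since the map is linear over GF(2) for a fixed seed, extracting from the XOR of two sources gives the XOR of their extractions, a handy sanity check.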

Table of Contents
1 Introduction
	1.1 Modeling the Computation over Big Data 
	1.2 Cryptography in the Multi-Stream Model 
	1.3 Randomness Extraction in the Multi-Stream Model
2 Cryptography in the Multi-Stream Model  
	2.1 Preliminaries: Notation and Background
	2.2 Warm up: Streaming Encoding of Universal Hash Functions
	2.3 Streaming One-way Functions 
	2.4 Streaming Pseudorandom Generators
	2.5 Streaming Public-Key Encryption  
	2.6 Conclusions and Remarks on Practicality
3 Randomness Extraction in the Multi-Stream Model 
	3.1 The Random Re-Bucketing (RRB) Extractor
	3.2 Extraction from Affine and General Sources 
	3.3 Experimental Study 
	3.4 Experimental Data 
4 Limits of the Multi-Stream Model for Random and Pseudorandom Objects
	4.1 Overview of the Lower-Bound Technique 
	4.2 Lower Bounds for Cryptographic Primitives 
	4.3 Lower Bounds for Randomness Extractors 

ISSN 1433-8092