TR17-183 Authors: Sivakanth Gopi, Venkatesan Guruswami, Sergey Yekhanin

Publication: 28th November 2017 15:18

Downloads: 1386

Keywords:

In recent years the explosion in the volumes of data being stored online has resulted in distributed storage systems transitioning to erasure coding based schemes. Local Reconstruction Codes (LRCs) have emerged as the codes of choice for these applications. An $(n,r,h,a,q)$-LRC is a $q$-ary code, where encoding is as a two stage process. In the first stage, $h$ redundant parity symbols are generated from $k$ data symbols. In the second stage, the $k+h$ symbols are partitioned into sets of size $r-a$ and each set is extended with $a$ redundant symbols using an MDS code to form a local group. Local groups ensure that when at most $a$ coordinates are erased, any missing coordinate can be recovered by accessing at most $r-a$ symbols. Also, if a larger number of coordinates is erased, the missing symbols can be recovered by potentially accessing all remaining symbols.

An $(n,r,h,a,q)$-LRC code as above is Maximally Recoverable (MR), if it corrects all erasure patterns which are information theoretically correctable given the presence of local groups. Obtaining MR LRCs over finite fields of minimal size is important in practice and has been the goal of a line of work in coding theory. In this work we make progress towards this goal. In particular:

- We show that when $a$ and $h$ are constant and $r$ may grow, for every maximally recoverable LRC, $q\geq \Omega_{a,h}\left(n\cdot r^{\min\{a,h-2\}}\right).$ Prior to our work, there was no super-linear lower bound known on the field size of MR LRCs for any setting of parameters.

- We obtain a family of MR $(n,r,h=2,a,q)$-LRCs, where $q=O(n)$ for all settings of parameters. Prior to our work the best constructions required $q$ to be quadratic in $n$ for some regimes.

- We obtain a family of MR $(n,r,h=3,a,q)$-LRCs, where $q=O(n^3)$ for all settings of parameters. Prior to our work the best constructions required $q$ to be $n^{\Theta(a)}$ for some regimes.

- Our results in the first two bullets above suggest the setting of $r=3, a=1, h=3$ as the first setting where existence of MR LRCs over fields of near linear size is an open question. We resolve this question in the positive by developing a new approach to LRC constructions based on elliptic curves and arithmetic progression free sets.