Under the auspices of the Computational Complexity Foundation (CCF)


### Revision(s):

Revision #2 to TR19-108 | 2nd September 2019 14:29

#### Beating the probabilistic lower bound on perfect hashing

Revision #2
Authors: Chaoping Xing, Chen Yuan
Accepted on: 2nd September 2019 14:29
Keywords:

Abstract:

For an integer $q\ge 2$, a perfect $q$-hash code $C$ is a block code over $[q]:=\{1,\ldots,q\}$ of length $n$ in which every subset $\{\mathbf{c}_1,\mathbf{c}_2,\dots,\mathbf{c}_q\}$ of $q$ elements is separated, i.e., there exists $i\in[n]$ such that $\{\mathrm{proj}_i(\mathbf{c}_1),\dots,\mathrm{proj}_i(\mathbf{c}_q)\}=[q]$, where $\mathrm{proj}_i(\mathbf{c}_j)$ denotes the $i$th position of $\mathbf{c}_j$. Finding the maximum size $M(n,q)$ of perfect $q$-hash codes of length $n$, for given $q$ and $n$, is a fundamental problem in combinatorics, information theory, and computer science. In this paper, we are interested in the asymptotic behavior of this problem. More precisely, we focus on the quantity $R_q:=\limsup_{n\rightarrow\infty}\frac{\log_2 M(n,q)}n$.
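The separation condition in this definition can be checked by brute force on small examples. The following is only an illustrative sketch, not taken from the paper: the function names `is_separated` and `is_perfect_hash_code` are hypothetical, codewords are assumed to be equal-length tuples over $\{1,\ldots,q\}$, and the running time grows like $\binom{|C|}{q}$, so it is usable for toy codes only.

```python
from itertools import combinations


def is_separated(words, q):
    """Return True if some coordinate shows all q symbols 1..q across `words`.

    `words` is assumed to be a collection of q equal-length tuples over {1..q}.
    """
    alphabet = set(range(1, q + 1))
    # zip(*words) iterates over coordinates (columns) of the q codewords.
    return any(set(column) == alphabet for column in zip(*words))


def is_perfect_hash_code(code, q):
    """Return True if every q-subset of distinct codewords in `code` is separated."""
    distinct = list(set(code))
    return all(is_separated(subset, q) for subset in combinations(distinct, q))


# A length-2 code over {1,2,3} with 4 codewords in which every triple is separated.
good_code = [(1, 1), (2, 3), (3, 3), (1, 2)]
# Here the triple (1,1), (2,2), (1,2) has no coordinate showing all of 1, 2, 3.
bad_code = [(1, 1), (2, 2), (3, 3), (1, 2)]

if __name__ == "__main__":
    print(is_perfect_hash_code(good_code, 3))
    print(is_perfect_hash_code(bad_code, 3))
```

The first toy code passes the check and the second fails it, because of the single unseparated triple noted in the comment.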
A well-known probabilistic argument gives an existence lower bound on $R_q$, namely $R_q\ge\frac1{q-1}\log_2\left(\frac1{1-q!/q^q}\right)$ \cite{FK,K86}. This is still the best known lower bound, except for the case $q=3$, for which K\"{o}rner and Marton \cite{KM} found that the concatenation technique could lead to perfect $3$-hash codes beating the probabilistic lower bound. This improvement on the lower bound on $R_3$ was discovered in 1988, and there has been no progress on lower bounds on $R_q$ for more than 30 years, despite some work on upper bounds on $R_q$.
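To get a feel for the size of the probabilistic bound, the expression $\frac1{q-1}\log_2\left(\frac1{1-q!/q^q}\right)$ can be evaluated numerically. The snippet below is a minimal sketch that only evaluates this formula; the function name `probabilistic_lower_bound` is a hypothetical helper, and the improved rates obtained in the paper are not computed here.

```python
from math import factorial, log2


def probabilistic_lower_bound(q):
    """Evaluate (1/(q-1)) * log2(1 / (1 - q!/q^q)) for an integer q >= 2."""
    if q < 2:
        raise ValueError("q must be an integer with q >= 2")
    # q!/q^q is the probability that q independent uniform symbols
    # from an alphabet of size q are pairwise distinct.
    p_distinct = factorial(q) / q**q
    return log2(1.0 / (1.0 - p_distinct)) / (q - 1)


if __name__ == "__main__":
    for q in (2, 3, 4, 8):
        print(q, probabilistic_lower_bound(q))
```

For $q=2$ the formula gives exactly $1$, and the value shrinks quickly as $q$ grows because $q!/q^q$ decays exponentially in $q$.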
In this paper we show that this probabilistic lower bound can be improved for $q=4,8$, for all odd integers between $3$ and $25$, and for \emph{all sufficiently large} $q$ with $q\not\equiv 2 \pmod 4$. Our idea is based on a modified concatenation that differs from the classical concatenation, in which both the inner and outer codes are separated. In our concatenation, by contrast, we do not require the inner code to be a perfect $q$-hash code. This gives a more flexible choice of inner codes, and hence we are able to beat the probabilistic existence lower bound on $R_q$.

Revision #1 to TR19-108 | 28th August 2019 15:07

#### Beating the probabilistic lower bound on perfect hashing

Revision #1
Authors: Chaoping Xing, Chen Yuan
Accepted on: 28th August 2019 15:07
Keywords:

Abstract:

For an integer $q\ge 2$, a perfect $q$-hash code $C$ is a block code over $[q]:=\{1,\ldots,q\}$ of length $n$ in which every subset $\{\mathbf{c}_1,\mathbf{c}_2,\dots,\mathbf{c}_q\}$ of $q$ elements is separated, i.e., there exists $i\in[n]$ such that $\{\mathrm{proj}_i(\mathbf{c}_1),\mathrm{proj}_i(\mathbf{c}_2),\dots,\mathrm{proj}_i(\mathbf{c}_q)\}=[q]$, where $\mathrm{proj}_i(\mathbf{c}_j)$ denotes the $i$th position of $\mathbf{c}_j$. Finding the maximum size $M(n,q)$ of perfect $q$-hash codes of length $n$, for given $q$ and $n$, is a fundamental problem in combinatorics, information theory, and computer science. In this paper, we are interested in the asymptotic behavior of this problem. More precisely, we focus on the quantity $R_q:=\limsup_{n\rightarrow\infty}\frac{\log_2 M(n,q)}n$.

A well-known probabilistic argument gives an existence lower bound on $R_q$, namely $R_q\ge\frac1{q-1}\log_2\left(\frac1{1-q!/q^q}\right)$ \cite{FK,K86}. This is still the best known lower bound, except for the case $q=3$, for which K\"{o}rner and Marton \cite{KM} found that the concatenation technique could lead to a perfect $3$-hash code beating the probabilistic lower bound. This improvement on the lower bound on $R_3$ was discovered in 1988, and there has been no progress on lower bounds on $R_q$ for more than 30 years, despite some work on upper bounds on $R_q$.
In this paper we show that this probabilistic lower bound can be improved for $q=4,8$, for all odd integers between $3$ and $25$, and for \emph{all sufficiently large} $q$ with $q\not\equiv 2 \pmod 4$. Our idea is based on a modified concatenation that differs from the classical concatenation, in which both the inner and outer codes are separated. In our concatenation, by contrast, we do not require the inner code to be a perfect $q$-hash code. This gives a more flexible choice of inner codes, and hence we are able to beat the probabilistic existence lower bound on $R_q$.

### Paper:

TR19-108 | 23rd August 2019 13:03

#### Beating the probabilistic lower bound on perfect hashing

TR19-108
Authors: Chaoping Xing, Chen Yuan
Publication: 23rd August 2019 16:37
For an integer $q\ge 2$, a perfect $q$-hash code $C$ is a block code over $\mathbb{Z}_q:=\mathbb{Z}/q\mathbb{Z}$ of length $n$ in which every subset $\{\mathbf{c}_1,\mathbf{c}_2,\dots,\mathbf{c}_q\}$ of $q$ elements is separated, i.e., there exists $i\in[n]$ such that $\{\mathrm{proj}_i(\mathbf{c}_1),\mathrm{proj}_i(\mathbf{c}_2),\dots,\mathrm{proj}_i(\mathbf{c}_q)\}=\mathbb{Z}_q$, where $\mathrm{proj}_i(\mathbf{c}_j)$ denotes the $i$th position of $\mathbf{c}_j$. Finding the maximum size $M(n,q)$ of perfect $q$-hash codes of length $n$, for given $q$ and $n$, is a fundamental problem in combinatorics, information theory, and computer science. In this paper, we are interested in the asymptotic behavior of this problem. More precisely, we focus on the quantity $R_q:=\limsup_{n\rightarrow\infty}\frac{\log_2 M(n,q)}n$.
A well-known probabilistic argument gives an existence lower bound on $R_q$, namely $R_q\ge\frac1{q-1}\log_2\left(\frac1{1-q!/q^q}\right)$ \cite{FK,K86}. This is still the best known lower bound, except for the case $q=3$, for which K\"{o}rner and Marton \cite{KM} found that the concatenation technique could lead to a perfect $3$-hash code beating the probabilistic lower bound. This improvement on the lower bound on $R_3$ was discovered in 1988, and there has been no progress on lower bounds on $R_q$ for more than 30 years, despite some work on upper bounds on $R_q$.
In this paper we show that this probabilistic lower bound can be improved for $q=4,8$, for all odd integers between $3$ and $25$, and for \emph{every sufficiently large odd} $q$. Our idea is based on a modified concatenation that differs from the classical concatenation, in which both the inner and outer codes are separated. In our concatenation, by contrast, we do not require the inner code to be a perfect $q$-hash code. This gives a more flexible choice of inner codes, and hence we are able to beat the probabilistic existence lower bound on $R_q$.