The negligible, the noticeable and the overwhelming
In this lesson we will meet three asymptotic notions that are used especially often in cryptography: the concepts of negligible, overwhelming and noticeable functions. This lesson assumes some basic knowledge of probability, as covered by the lesson Probability of two events in the Probability theory topic.
1. Negligible functions
Suppose we have a probabilistic algorithm $A$ which solves a problem with probability at least $p > 0$ and reports an error if it fails. The failure probability $q = 1-p$ might be too high for our requirements. However, we would be happy if the failure probability were below some practical threshold, like $2^{-90}$, which is less than the probability of our building being demolished by a meteorite during the computation. We can easily make the failure probability smaller by calling the algorithm again if it failed. Let $A^*$ be an algorithm which takes an input parameter $k$ and calls algorithm $A$ up to $k$ times. Then the probability that $A^*$ will fail is $P_{\text{fail}}^{A^*}(k) = q^k$.
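As a minimal sketch of this idea in code (here `solve_once` is a hypothetical stand-in for a single run of algorithm $A$: it is assumed to return an answer on success and `None` on failure):

```python
def amplify(solve_once, k):
    """Run a probabilistic algorithm up to k times; return the first
    successful answer, or None if all k attempts fail.

    If a single call fails with probability q (independently each
    time), then all k calls fail with probability q**k.
    """
    for _ in range(k):
        answer = solve_once()
        if answer is not None:
            return answer
    return None
```

This is exactly the algorithm $A^*$ described above; its failure probability drops exponentially in $k$.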
For example, when generating an RSA key, one needs to solve the problem of generating a random prime number of given length; we can do this by generating a random number and checking it with some primality test — if the generated number passes the test, then it is the answer we wanted; if the test fails, then we try again and again by generating new random numbers.
Is this a good idea, or is an unreasonably large number of calls needed to get a failure probability as small as desired? For concreteness, let $p = \frac12$. Then $q^k = 2^{-k}$. To achieve the very small failure probability $2^{-90}$, we have to call the algorithm $A$ only $90$ times, which is not very much. For other values of $p$, the results are similar. So $A^*$ has rather good performance; this is actually the basic idea used for generating RSA keys in practice.
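As a rough sketch of how this looks in practice (the Miller–Rabin test below is only for illustration; production RSA key generation uses carefully implemented and vetted primality tests):

```python
import random

def is_probable_prime(n, rounds=40):
    """Miller-Rabin test: primes are always accepted, and a composite n
    is wrongly accepted with probability at most 4**(-rounds)."""
    if n < 2:
        return False
    for p in (2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37):
        if n % p == 0:
            return n == p
    d, s = n - 1, 0          # write n - 1 = d * 2**s with d odd
    while d % 2 == 0:
        d //= 2
        s += 1
    for _ in range(rounds):
        a = random.randrange(2, n - 1)
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(s - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False     # a witnesses that n is composite
    return True

def random_prime(bits, max_tries=100_000):
    """Draw random odd `bits`-bit candidates until one passes the test --
    the same retry idea as in algorithm A*."""
    for _ in range(max_tries):
        candidate = random.getrandbits(bits) | (1 << (bits - 1)) | 1
        if is_probable_prime(candidate):
            return candidate
    raise RuntimeError("no prime found; increase max_tries")
```

Here the "failure" of a single trial is simply that the random candidate is not prime, and retrying drives the overall failure probability down just as in the analysis above.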
Now suppose we have another algorithm — algorithm $B$ — which accepts some input parameter $k$, runs in time proportional to $k$ (i. e. in time $ck$ for some constant $c$) and fails with probability $P_{\text{fail}}^B(k) = \frac1k$. If we cannot make algorithm $B$ succeed with a higher probability by running it multiple times, like we did for $A$ (because $B$ is not probabilistic or because it doesn't tell whether it failed), the only way to decrease its probability of failure is to increase the parameter $k$. However, to achieve failure probability $2^{-90}$, we would have to take $k = 2^{90}$ — giving a running time vastly longer than anybody could wait (even if, in the running time estimate $ck$, the constant $c$ were equal to only 1 CPU cycle).
So, decreasing the failure probability of algorithm $B$ is much harder than that of $A^*$, because the function $\frac1k$ decreases much more slowly than the function $2^{-k}$.
Figure. The function $\frac1k$ decreases much more slowly than the function $2^{-k}$.
Functions $\frac1{k^i}$, where $i$ is constant, are not much better than $\frac1k$. For instance, if the failure probability is $P_{\text{fail}}^B(k) = \frac{1}{k^2}$, then we must take $k = 2^{45}$ to achieve failure probability $2^{-90}$. Although the latter might even be practically feasible, it is still a very long time compared to $90$ calls of algorithm $A$.
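These figures are easy to recompute; a throwaway calculation in Python (not part of any algorithm, just checking the arithmetic above):

```python
import math

target = 2 ** -90   # the failure probability we are aiming for

# A* with q = 1/2: failure probability 2**(-k), so k = 90 suffices.
k_exponential = math.ceil(-math.log2(target))      # 90

# B with failure probability 1/k: we need k = 1/target.
k_linear = math.ceil(1 / target)                    # 2**90, about 1.2 * 10**27

# Failure probability 1/k**2: we need k = sqrt(1/target).
k_quadratic = math.ceil(math.sqrt(1 / target))      # 2**45, about 3.5 * 10**13

print(k_exponential, k_linear, k_quadratic)
```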
As before, we use the gap between polynomials and the exponential function as a clear cut-off point. The algorithm $A^*$ is good because its failure probability $P_{\text{fail}}^{A^*}(k)$ approaches zero faster than the reciprocal of any polynomial. This leads to the following definition of negligible probabilities.
Definition. A positive function $\mu(n)$ is called negligible if for every positive polynomial $p(n)$ there exists a constant $n_0$ such that for all $n\geq n_0$ we have $\mu(n)<1/p(n)$.
We can restate the definition in an alternative form: a positive function $\mu(n)$ is negligible if and only if $\mu(n) \in O(1/p(n))$ for every positive polynomial $p(n)$.
In other words, a positive function is negligible if and only if it grows asymptotically no faster than the reciprocal of every positive polynomial.
Yet another equivalent condition is given by the following lemma.
Lemma. A positive function $\mu(n)$ is negligible if and only if $p(n)\mu(n) \to 0$ as $n \to \infty$ for every positive polynomial $p(n)$.
In other words, even if an event with negligible probability could occur in polynomially many instances, the total probability of it occurring at all would still approach zero as $n \to \infty$.
Actually, as the following lemma says, to prove that a function is negligible, it suffices to consider only monomials $n^{i}$ instead of all polynomials $p(n)$.
Lemma. A positive function $\mu(n)$ is negligible if and only if $n^{i}\mu(n) \to 0$ as $n \to \infty$ for every $i \in \mathbb N$.
For example, the function $P_{\text{fail}}^{A^*}(k) = 2^{-k}$ considered in the introduction is negligible, since for any $i$ we have $\lim_{k \to \infty} \frac{k^i}{2^k} = 0$. On the other hand, the function $P_{\text{fail}}^{B}(k) = \frac1k$ is not negligible since, for example, $\lim_{k \to \infty} k\cdot\frac1k = 1 \neq 0$.
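To see why the first of these limits is indeed zero, one can rewrite the ratio in exponential form:
$$\frac{k^i}{2^k} = 2^{i\log_2 k - k} \longrightarrow 0 \quad (k \to \infty),$$
since $i\log_2 k - k \to -\infty$ for every fixed $i$.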
In crypto, we consider probabilities that are negligible to be too small to matter. In other words, if an attack works only with negligible probability, then we consider that attack to be unsuccessful.
Which of the following statements is true? (Justify your answer.)
- $3^{-n}$ is negligible. Solution
- $n^{-100}$ is negligible. Solution
- $2^n$ is negligible. Solution
- Consider a cryptosystem that, on security parameter $\eta$, uses a block cipher with key length $\log \eta$. The adversary can (only) break the cryptosystem by guessing the key of the block cipher. Then the probability that the adversary breaks the cryptosystem is negligible in $\eta$. Solution
- The same, but with key length $\eta$ (instead of $\log\eta$). Solution
- Optional (more tricky): The same, but with key length $(\log\eta)^2$. Solution
Of course, it is somewhat arbitrary to say that negligible functions are too small to matter, while, e.g., $n^{-80}$ (which is very, very small) is considered to matter. The reason for using this particular class of functions in crypto is that it behaves quite nicely under various operations; this makes things easier in many situations. The following lemma lists some of these nice properties of negligible functions:
Lemma (Some properties of negligible functions).
- If $f_1$ and $f_2$ are negligible, then $f_1+f_2$ is negligible (a sketch of the argument is also given after this list). Proof
- If $f$ is negligible and $c>0$ is a constant, then $c\cdot f$ is negligible. Proof
- If $f$ is negligible and $p$ is polynomially bounded (i.e., there is a positive polynomial $r$ that is bigger than $p$ for sufficiently large $n$), then $p\cdot f$ is negligible. Proof
- If $f$ is negligible, then for any $c>0$, $f^c$ is negligible. (This also includes cases like $f^{1/2}=\sqrt f$.) Proof
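As an illustration of how such properties are verified, here is a short argument for the first item above (only a sketch; the Proof links give the full details). Let $p$ be any positive polynomial. Since $2p$ is also a positive polynomial and $f_1$, $f_2$ are negligible, there are $n_1$ and $n_2$ such that $f_1(n) < \frac{1}{2p(n)}$ for all $n \ge n_1$ and $f_2(n) < \frac{1}{2p(n)}$ for all $n \ge n_2$. Then for all $n \ge \max(n_1, n_2)$ we have
$$f_1(n) + f_2(n) < \frac{1}{2p(n)} + \frac{1}{2p(n)} = \frac{1}{p(n)},$$
so $f_1 + f_2$ is negligible.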
An adversary tries to attack a protocol. He can do two things: First, he can try once to guess the master key $k$. Second, he can try to guess a key $k'$ as often as he wishes (since the adversary runs in polynomial time, this amounts to at most polynomially many guesses). You may assume that $2^{-|k|}$ and $2^{-|k'|}$ are negligible (i.e., a single try at guessing $k$ or $k'$ succeeds only with negligible probability). The adversary wins if he guesses $k$ (in the first guess) or $k'$ (in any of the guesses).
Show that the probability that the adversary wins is negligible.
(Use Lemma (Some properties of negligible functions); remember also that $\Pr[A_1\vee\dots \vee A_n]\leq \Pr[A_1]+\dots+\Pr[A_n]$.)
2. Overwhelming functions
The next important class of functions in cryptography is that of overwhelming functions. Intuitively, a probability is overwhelming if it is almost $1$. Formally:
Definition. A function $f$ is overwhelming if $1-f$ is negligible.
For example: If we wish to state that a certain message delivery algorithm is secure against denial-of-service attacks, in a formal definition we might require that the probability that the message is delivered is overwhelming. (Which is the same as requiring that the message is lost with negligible probability.)
In a good cryptosystem, which of the following is overwhelming?
- the probability that no successful attack takes place,
- the probability that a successful attack takes place.
3. Noticeable functions
Let us recall the algorithms $A$ and $A^*$ we discussed when introducing the notion of negligibility. We analyzed how the failure probability of algorithm $A^*$ is related to its running time (which is proportional to the number of repetitions). But algorithm $A$ might also have some input parameter (e. g. the length of the prime number to be generated); let's call it $n$. We didn't consider how $n$ affects the running time of $A^*$ — which is fine if we are only interested in a particular value of $n$. But let's investigate now how the performance of $A^*$ scales when $n$ is increased (e. g., if we want to generate longer and longer RSA keys).
Let's denote the worst-case running time of algorithm $A$ with input parameter $n$ by $T_A(n)$. The failure probability of algorithm $A$ might depend on $n$ as well — let's denote it by $q(n)$.
The worst-case running time of algorithm $A^*$, i. e. the $k$-times repetition of algorithm $A$ with input parameter $n$, is then obviously $T_{A^*}(k,n) = kT_A(n)$, and the failure probability of $A^*$ is $P_{\text{fail}}^{A^*}(k,n) = q(n)^k$. To choose $k$, let's fix a threshold $\epsilon > 0$ (the same for all $n$) such that we are satisfied if $A^*$ has failure probability at most $\epsilon$. When running $A^*$ with input parameter $n$, we of course choose $k$ to be the smallest number such that $q(n)^k < \epsilon$. Denote this number by $k_0(n, \epsilon)$. So the running time of $A^*$ is $T_{A^*}(n) = k_0(n, \epsilon)T_A(n)$.
As we saw before, thanks to the fact that $P_{\text{fail}}^{A^*}(k,n)$ is negligible in $k$, i. e. decreases very fast when $k$ increases, the running time $T_{A^*}(n)$ grows slowly if we decrease $\epsilon$. However, that analysis didn't cover how $T_{A^*}(n)$ behaves with respect to $n$; this behavior may not be as good. If the success probability of algorithm $A$, $p(n) = 1-q(n)$, decreases very quickly with $n$, then the number of repetitions must also grow very quickly, so that algorithm $A^*$ may get very slow for large $n$, even if the time complexity of algorithm $A$ is small.
To be sure that the number of repetitions needed doesn't grow too fast with $n$, we need a lower bound on the probability $p(n)$ that would guarantee that $p(n)$ does not decrease too fast. This is in contrast with the concept of negligibility, which gave an upper bound on the failure probability $P_{\text{fail}}^{A^*}(k)$, guaranteeing that it converges to $0$ sufficiently fast. So we need a new asymptotic notion: if the function $p(n)$ doesn't get near $0$ too fast, it is called noticeable. The precise definition of noticeable is given below.
Definition. A positive function $f$ is noticeable if there exist a positive polynomial $p$ and a number $n_0$ such that $f(n)\geq 1/p(n)$ for all $n \geq n_0$.
Remark. Using the definition of noticeable given above, our heuristic claim that algorithm $A^*$ gets very slow for large input parameters $n$ unless algorithm $A$ has a noticeable success probability can actually be made precise. Namely, it can be proven that if algorithm $A$ has polynomial time complexity, then algorithm $A^*$ has polynomial time complexity too if and only if the success probability of $A$, $p(n)$, is noticeable.
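One direction of this remark follows from a short calculation (only a sketch). Since $\ln(1/q(n)) = -\ln(1 - p(n)) \ge p(n)$, the smallest admissible number of repetitions satisfies
$$k_0(n, \epsilon) \le \left\lceil \frac{\ln(1/\epsilon)}{p(n)} \right\rceil.$$
If $p$ is noticeable, then $p(n) \ge 1/r(n)$ for some positive polynomial $r$ and all sufficiently large $n$, so $k_0(n, \epsilon) \le \lceil \ln(1/\epsilon)\, r(n) \rceil$, which is polynomial in $n$ (remember that $\epsilon$ is a fixed constant). Hence $T_{A^*}(n) = k_0(n, \epsilon)\, T_A(n)$ is polynomial whenever $T_A(n)$ is.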
Again, there are multiple alternative equivalent definitions:
- $f$ is noticeable;
- $f(n) \in \Omega(1/p(n))$ for some positive polynomial $p$;
- $p(n)f(n) \to \infty$ for some positive polynomial $p$;
- $n^i f(n) \to \infty$ for some $i \in \mathbb N$.
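As an example, recall the prime generation problem from the beginning of this lesson. By the prime number theorem, a random odd $n$-bit candidate is prime with probability roughly $\frac{2}{n \ln 2} \approx \frac{2.9}{n}$, which is at least $1/n$ for all sufficiently large $n$; so the success probability of a single trial is noticeable in the bit length $n$, and by the remark above the whole generate-and-test loop runs in polynomial time (assuming a polynomial-time primality test).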
Consider the function $f(n):=1$ for even $n$ and $f(n):=2^{-n}$ for odd $n$. Is $f$ negligible? overwhelming? noticeable?
A warning: the word non-negligible is sometimes used as a synonym for noticeable, and sometimes to mean a function that is not negligible. Be aware of this when you see the word non-negligible!