*11 September 2017*

The curse of dimensionality usually refers to the fact that as the dimensionality of the data increases, it becomes exponentially more difficult to search the space, sample from the posterior, optimize, and so on.

First, some examples to illustrate the weirdness (or unintuitive nature) of high-dimensional spaces. Then, some examples of difficulties that arise when working in them.

The volume of an \(n\)-cube with sides of length \(s\) is \(V_{n, \text{cube}} = s^n\).
The volume of an \(n\)-ball with a radius \(r\) is \(V_{n, \text{ball}} = \frac{r^n \pi^{n / 2}}{\Gamma(n / 2 + 1)}\).
If an \(n\)-ball is inscribed inside an \(n\)-cube, then \(s = 2r\), and the ratio of their volumes is
\begin{align}
\frac{V_{n, \text{ball}}}{V_{n, \text{cube}}} &= \frac{\frac{r^n \pi^{n / 2}}{\Gamma(n / 2 + 1)}}{(2r)^n} \\
&= \frac{\pi^{n / 2}}{2^n \Gamma(n / 2 + 1)},
\end{align}
where \(\Gamma\) is the Gamma function.
This ratio decreases rapidly to zero: it is \(1\) for \(n = 1\), \(\pi / 4 \approx 0.785\) for \(n = 2\), and already below \(0.25\%\) for \(n = 10\). Almost all of the volume of a high-dimensional cube lies outside the inscribed ball, in the corners.
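As a quick numerical sanity check, the ratio can be evaluated directly from the formula above (a minimal sketch using only the standard library):

```python
import math

def ball_to_cube_ratio(n):
    """Volume of an n-ball inscribed in an n-cube (s = 2r), as a fraction of the cube's volume."""
    return math.pi ** (n / 2) / (2 ** n * math.gamma(n / 2 + 1))

for n in (1, 2, 3, 10, 20):
    print(n, ball_to_cube_ratio(n))
```

The printed values shrink from \(1\) at \(n = 1\) toward zero faster than any exponential in \(n\).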

Consider an \(n\)-cube with sides of length \(s = 4\) centered at \((0, \dotsc, 0)\). It contains \(2^n\) \(n\)-cubes with sides of length \(2\) and centers at \((\pm 1, \dotsc, \pm 1)\). Hence, we can inscribe \(2^n\) \(n\)-balls inside these small cubes, each with radius \(1\). The distance from \((0, \dotsc, 0)\) to any of the ball centers is \(\sqrt{n}\). Hence we can inscribe a ball centered at \((0, \dotsc, 0)\) with radius \(\sqrt{n} - 1\) without intersecting any of the other balls.

For \(n = 9\), this means that the center ball has radius \(\sqrt{9} - 1 = 2\) and touches the sides of the original cube. For \(n > 9\), the center ball bulges out of the original cube without going in any of the other balls.
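The threshold at \(n = 9\) is easy to verify numerically; a small sketch comparing the central ball's radius to the half-side of the outer cube:

```python
import math

def central_ball_radius(n):
    """Radius of the origin-centered ball tangent to the 2**n corner balls of radius 1."""
    # The corner-ball centers sit at distance sqrt(n) from the origin.
    return math.sqrt(n) - 1

HALF_SIDE = 2  # the outer cube has side 4, so its faces are at distance 2 from the origin

for n in (4, 9, 10, 16):
    r = central_ball_radius(n)
    print(n, r, "bulges out of the cube" if r > HALF_SIDE else "inside the cube")
```

For \(n = 9\) the radius is exactly \(2\) (tangent to the faces); for \(n = 10\) it is \(\sqrt{10} - 1 \approx 2.16\), already outside.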

Consider an \(n\)-ball (an orange) with radius \(r\) whose peel has thickness \(\epsilon\).
The volume of the peel is
\begin{align}
V_{\text{peel}} &= \frac{\pi^{n / 2}}{\Gamma(n / 2 + 1)} \left(r^n - (r - \epsilon)^n\right).
\end{align}
Hence the fraction of the orange's volume contained in the peel is
\begin{align}
\frac{V_{\text{peel}}}{V_{\text{orange}}} &= \frac{\frac{\pi^{n / 2}}{\Gamma(n / 2 + 1)} \left(r^n - (r - \epsilon)^n\right)}{\frac{\pi^{n / 2}}{\Gamma(n / 2 + 1)} r^n} \\
&= \frac{r^n - (r - \epsilon)^n}{r^n} \\
&= 1 - \left(\frac{r - \epsilon}{r}\right)^n.
\end{align}
This quantity tends to one very quickly even for a very thin peel: with \(r = 1\) and \(\epsilon = 0.01\), the peel contains more than \(99.99\%\) of the volume once \(n = 1000\).
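To get a feel for how fast this happens, a quick sketch of the peel fraction for a peel that is \(1\%\) of the radius:

```python
def peel_fraction(r, eps, n):
    """Fraction of an n-ball of radius r lying within distance eps of its surface."""
    return 1 - ((r - eps) / r) ** n

# Even a peel that is only 1% of the radius dominates in high dimensions.
for n in (1, 10, 100, 1000):
    print(n, peel_fraction(1.0, 0.01, n))
```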

According to this, for a zero-mean, \(n\)-dimensional normal distribution with unit variance in each coordinate, the probability mass in the thin shell \(\left\{x: \sqrt{n - 1} - c \leq \|x\| \leq \sqrt{n - 1} + c\right\}\), for \(c > 0\), is \(1 - \frac{\exp{(-c^2 / 4)}}{c^2 / 4 + \exp{(-c^2 / 4)}}\). This goes to one very quickly as \(c\) grows.

We can also see this by noting that the norm of an \(n\)-dimensional standard normal random variable, \(\sqrt{x_1^2 + \cdots + x_n^2}\), follows the chi distribution with \(n\) degrees of freedom. The mode of this random variable is \(\sqrt{n - 1}\). Its mean is \(\mu = \sqrt{2} \frac{\Gamma((n + 1) / 2)}{\Gamma(n / 2)}\). Its variance is \(n - \mu^2\). Since \(\mu \approx \sqrt{n}\) for large \(n\) while the variance \(n - \mu^2\) tends to \(1 / 2\), the norm concentrates in a shell of essentially constant width around \(\sqrt{n}\). A slightly less precise way to see this is to apply the Central Limit Theorem to \(x_1^2 + \cdots + x_n^2\), a sum of i.i.d. variables with mean \(1\) and variance \(2\): the squared norm is \(n\) up to fluctuations of order \(\sqrt{n}\), so the norm itself is \(\sqrt{n}\) up to fluctuations of order one.
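This concentration is easy to observe empirically; a minimal simulation using only the standard library (the dimension and sample count below are arbitrary choices):

```python
import math
import random
import statistics

random.seed(0)

n = 1000        # dimension (arbitrary choice)
samples = 200   # number of draws (arbitrary choice)

# Draw standard normal vectors and record their Euclidean norms.
norms = [
    math.sqrt(sum(random.gauss(0, 1) ** 2 for _ in range(n)))
    for _ in range(samples)
]

print("sqrt(n)       =", math.sqrt(n))
print("mean norm     =", statistics.mean(norms))
print("stdev of norm =", statistics.stdev(norms))  # stays O(1) regardless of n
```

The mean norm comes out very close to \(\sqrt{1000} \approx 31.6\), while the standard deviation of the norm is roughly \(\sqrt{1 / 2} \approx 0.7\), independent of the dimension.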

[back]