The geometry of statistics

It is exciting to witness the surge of geometric tools permeating modern statistical and machine learning methodologies, from sampling and statistical inference to understanding the structure of data. My own work, on deriving minimum discrepancy estimators with theoretical guarantees and on developing numerical integration and sampling algorithms, has relied almost entirely on geometric tools.

Despite this, there is a profound skepticism among statisticians regarding geometry. This skepticism seems to stem primarily from two reasons:

The unity of statistics and the bracket-measure formalism

Although geometric tools are increasingly leveraged across statistical methodologies, geometry remains notably absent from the curriculum of most statistics departments, underscoring the perception that it is not directly pertinent to the training of mathematical statisticians. It turns out that as soon as we use appropriate formalisations of probability distributions, the gap between statistics and geometry disappears. Find out why here with the bracket-measure formalism!

The point is that the way mathematicians think of distributions has been continuously evolving. Continuous probability densities, p(x)dx, became absolutely continuous measures, sigma-normal weights, tensor 1-densities, twisted/pseudo differential forms, smooth de Rham currents, classes of Hochschild cycles, Berezinian volumes, arrows in the Markov category, and so on. Each of these mathematical formalisations incorporates a new understanding of p(x)dx. For instance, tensor 1-densities formalise the probability rate of change, which then allows us to correctly talk about the differential information of p(x)dx (which is not its log-density derivative). Without differential geometry we are forced to split p(x) and dx so that we can differentiate p(x), which is noncanonical and thus isolates statistics from the rest of mathematics. On the other hand, von Neumann algebras shed light on the spaces on which probability measures are defined (which are neither measurable nor measure spaces, but something in between: measure class spaces), and their canonical description via C*-algebras provides a first acquaintance with the duality between geometric spaces and algebras of "coordinates" (i.e., functions).
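As a quick illustration of this non-canonicity (a sketch in standard change-of-variables notation, not a new claim): if $y = \phi(x)$ is a diffeomorphism and $\tilde p(y)\,dy$ denotes the same 1-density written in the new coordinates, then

$$\tilde p(y) = p\bigl(\phi^{-1}(y)\bigr)\,\bigl|\det D\phi^{-1}(y)\bigr|,$$

and differentiating the logarithm gives

$$\nabla_y \log \tilde p(y) = \bigl(D\phi^{-1}(y)\bigr)^{\top}\,\nabla_x \log p(x)\Big|_{x=\phi^{-1}(y)} + \nabla_y \log \bigl|\det D\phi^{-1}(y)\bigr|.$$

The extra Jacobian term shows that the log-density derivative depends on the arbitrary choice of how to split p(x)dx into p(x) and dx (equivalently, on the chosen reference measure), whereas the 1-density p(x)dx itself transforms consistently under any change of coordinates.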

To fully leverage the structure of probability distributions in statistical models, and to facilitate the transfer of specialised geometric techniques across statistical applications, we need to stand on the shoulders of the giants who revolutionised mathematics and physics. This, in my opinion, requires acquiring a deeper understanding of statistical objects that goes (very far) beyond measure/probability theory, as well as incorporating the unity of mathematics within statistical education and methodologies, by constructing a geometric backbone for statistics via the theory of smooth distributions.

" [...] one of the most essential features of the mathematical world, [...] it is virtually impossible to isolate any of the above parts from the others without depriving them from their essence. In that way the corpus of mathematics does resemble a biological entity which can only survive as a whole and would perish if separated into disjoint pieces." Alain Connes