Sometimes people mistake distributions for functions. This happens because the notation sucks and people often use the same symbols for both. In this post I want to clarify the intuition behind distributions.
Take as an example the expected value of a random variable $X$ in $\mathbb{R}$ with probability density function $f$. Then the expected value is $E[X] = \int_{\mathbb{R}} x f(x)\,dx$. But does every random variable admit a probability density function $f$? Of course not. Take as a counterexample the random variable with absolutely no randomness, which is always $0$. Can we still write its expected value as $E[X] = \int_{\mathbb{R}} x f(x)\,dx$ for some function $f$ which is not zero everywhere?
It turns out we can kind of do this. There is the so-called Dirac distribution $\delta$ (often wrongly called the Dirac function), a gadget which is zero everywhere except at $0$, where its value is infinity. Further, integrating yields $\int_{\mathbb{R}} \delta(x)\,dx = 1$. This would have the properties we require of $f$ from above.
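One common way to make this picture plausible is to look at ordinary densities that get narrower and narrower. The following is only a sketch under my own assumptions: the Gaussian family, the helper names `gaussian_density` and `integrate`, and the integration window $[-10, 10]$ are illustrative choices, not part of the argument above. It prints the total mass, the mean, and the peak height for shrinking widths.

```python
import math

def gaussian_density(x, eps):
    """Density of a normal random variable with mean 0 and standard deviation eps."""
    return math.exp(-0.5 * (x / eps) ** 2) / (eps * math.sqrt(2.0 * math.pi))

def integrate(g, a, b, n=100_000):
    """Midpoint rule for the integral of g over [a, b] (hypothetical helper)."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

for eps in (1.0, 0.1, 0.01):
    # Total mass of the density; should stay close to 1 for every eps.
    mass = integrate(lambda x: gaussian_density(x, eps), -10.0, 10.0)
    # Expected value; should stay close to 0 for every eps.
    mean = integrate(lambda x: x * gaussian_density(x, eps), -10.0, 10.0)
    # Height at zero; grows like 1 / eps as the density narrows.
    peak = gaussian_density(0.0, eps)
    print(f"eps={eps}: mass={mass:.6f}, mean={mean:.6f}, peak={peak:.2f}")
```

The mass stays at $1$ while the peak at $0$ grows without bound, which is the heuristic behind "zero everywhere, infinite at $0$, integral $1$". No member of this family is the Dirac distribution itself; the family only hints at it.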
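But why should $\delta$ exist at all? As an ordinary function it cannot: a function that vanishes everywhere except at a single point has integral $0$, not $1$. The trick is that $\delta$ is not a function but a special type of generalized function. These generalized functions are called distributions.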
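Let me introduce some more notation. Let $\mathcal{D}$ be the set of infinitely differentiable (smooth) functions $\phi \colon \mathbb{R} \to \mathbb{R}$ which are zero outside some bounded interval (have compact support). We call those functions test functions. We now define a distribution $T$ to be a linear mapping from the space of test functions $\mathcal{D}$ to $\mathbb{R}$ (technically it also has to be continuous in a suitable sense, but that does not matter for the intuition). Usually one writes $\langle T, \phi \rangle$ instead of $T(\phi)$.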
As you may have guessed, every (locally integrable) function $f$ induces a distribution $T_f$. This should be intuitive: I have written above that a distribution is a generalized function, so a usual function should of course also be a distribution. In particular we define $T_f$ by its action on the test functions through $\langle T_f, \phi \rangle = \int_{\mathbb{R}} f(x) \phi(x)\,dx$.
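To make "a function induces a linear mapping on test functions" concrete, here is a minimal numerical sketch. Everything named here is my own assumption for illustration: the bump function `bump` as a test function, the helper `induced_distribution`, the choice of `math.cos` as the inducing function, and the midpoint rule on $[-1, 1]$.

```python
import math

def bump(x):
    """A standard test function: smooth, and zero outside (-1, 1)."""
    if abs(x) >= 1.0:
        return 0.0
    return math.exp(-1.0 / (1.0 - x * x))

def induced_distribution(f, a=-1.0, b=1.0, n=20_000):
    """Return T_f as a Python function: phi -> integral of f(x) * phi(x) dx.

    Assumption: every phi passed in vanishes outside [a, b], so integrating
    over [a, b] with the midpoint rule is enough.
    """
    h = (b - a) / n

    def T_f(phi):
        return h * sum(
            f(a + (i + 0.5) * h) * phi(a + (i + 0.5) * h) for i in range(n)
        )

    return T_f

T_cos = induced_distribution(math.cos)

def psi(x):
    """A second test function, built from the bump."""
    return x * bump(x)

# Linearity: T(2*phi + 3*psi) should agree with 2*T(phi) + 3*T(psi).
lhs = T_cos(lambda x: 2.0 * bump(x) + 3.0 * psi(x))
rhs = 2.0 * T_cos(bump) + 3.0 * T_cos(psi)
print(lhs, rhs)
```

Note that `T_cos` never returns a value of the cosine at a point. It only answers the question "what number do you get when paired with this test function?", which is all a distribution is ever asked.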
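The Dirac distribution, on the other hand, is not induced by a function. It is defined by $\langle \delta, \phi \rangle = \phi(0)$, i.e., it maps a function to its value at zero. We wrote above $E[X] = \int_{\mathbb{R}} x f(x)\,dx$, even though we did not yet know that in this case $f$ is the Dirac distribution $\delta$ and not a function. In the literature, stuff like this is often written as $\int_{\mathbb{R}} \delta(x) \phi(x)\,dx$ to mean $\langle \delta, \phi \rangle$. There is similar weird notation floating around, like $\int_{\mathbb{R}} \delta(x)\,dx = 1$, which is very misleading. This is not an integral of a function, since $\delta$ is not a function. Don't get confused. The last equation can be translated as $\langle \delta, 1 \rangle = 1$ (strictly speaking the constant function $1$ is not a test function, since it does not have compact support, but evaluating it at zero is unproblematic). And then it makes sense again, at least from a notational point of view.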
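Seen as code, the Dirac distribution is almost embarrassingly simple: it evaluates its argument at zero. The sketch below also pairs the same test function with narrow Gaussian densities to suggest why the integral notation is tempting. The names `bump`, `dirac`, `gaussian_density`, and `pair_with_gaussian`, as well as the Gaussian family itself, are my own illustrative assumptions.

```python
import math

def bump(x):
    """A standard test function: smooth, and zero outside (-1, 1)."""
    if abs(x) >= 1.0:
        return 0.0
    return math.exp(-1.0 / (1.0 - x * x))

def dirac(phi):
    """The Dirac distribution: map a test function to its value at zero."""
    return phi(0.0)

def gaussian_density(x, eps):
    """Density of a normal random variable with mean 0 and standard deviation eps."""
    return math.exp(-0.5 * (x / eps) ** 2) / (eps * math.sqrt(2.0 * math.pi))

def pair_with_gaussian(phi, eps, a=-1.0, b=1.0, n=20_000):
    """Midpoint-rule approximation of the integral of g_eps(x) * phi(x) over [a, b].

    Assumption: phi vanishes outside [a, b].
    """
    h = (b - a) / n
    return h * sum(
        gaussian_density(a + (i + 0.5) * h, eps) * phi(a + (i + 0.5) * h)
        for i in range(n)
    )

print("dirac(bump) =", dirac(bump))
for eps in (0.3, 0.1, 0.01):
    # As eps shrinks, the honest integral approaches the value of bump at zero.
    print(f"eps={eps}:", pair_with_gaussian(bump, eps))
```

The honest integrals against ever narrower densities approach `dirac(bump)`, but `dirac` itself involves no integration at all. That is the sense in which $\int_{\mathbb{R}} \delta(x) \phi(x)\,dx$ is notation rather than an integral.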
And that is already everything you need to know to understand distributions. One can take a more sophisticated space of test functions to arrive at more sophisticated distributions. One can define derivatives of distributions such that it feels natural. One can add distributions and multiply them by smooth functions (the product of two arbitrary distributions, however, is not defined in general). But the underlying intuition stays the same as above, namely to generalize functions as inducing a linear mapping from a space of test functions to $\mathbb{R}$.
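As a small taste of derivatives: the usual definition is $\langle T', \phi \rangle = -\langle T, \phi' \rangle$, which is what integration by parts gives for a differentiable function when $\phi$ vanishes outside a bounded interval. The sketch below applies this to the Heaviside step function $H$ (which is $1$ for $x > 0$ and $0$ otherwise), an example of my own choosing that does not appear above. The helper names `bump`, `bump_prime`, and `heaviside_derivative` are hypothetical as well.

```python
import math

def bump(x):
    """A standard test function: smooth, and zero outside (-1, 1)."""
    if abs(x) >= 1.0:
        return 0.0
    return math.exp(-1.0 / (1.0 - x * x))

def bump_prime(x):
    """Derivative of bump, worked out by hand with the chain rule."""
    if abs(x) >= 1.0:
        return 0.0
    return bump(x) * (-2.0 * x) / (1.0 - x * x) ** 2

def heaviside_derivative(phi_prime, b=1.0, n=100_000):
    """<H', phi> := -<H, phi'> = -(integral of phi'(x) over [0, b]).

    Assumption: phi vanishes outside [-b, b]; the midpoint rule is used.
    """
    h = b / n
    return -h * sum(phi_prime((i + 0.5) * h) for i in range(n))

# Pairing H' with the bump should reproduce the bump's value at zero.
print(heaviside_derivative(bump_prime), bump(0.0))
```

The two printed numbers agree up to numerical error, so the derivative of the step function acts on this test function exactly like the Dirac distribution, even though $H$ has no derivative at $0$ in the classical sense.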