The set , where is the set of natural numbers, consists of infinity strings of zeros and ones. It is equipped with mod(2) addition structure and it is called Cantor group, or dyadic group. Another notation is . One studies functions .
One question one can ask is why historically we started with and not with ? Is not from the prospective of computers functions on more convenient to use? I understand that there is a binary correspondence (not necessarily unique) between and but one may forget what kind of structures we had on after we moved to ? In egeneral we can encode everything via zeros and ones but we may lose symmetries and structures that we had on one side when we moved to the other side. So why nowadays we start teaching Fourier analysis on and not on equipped with a uniform measure (I will explain below what is meant by uniform measure on )?
I do not have complete answers to these questions, perhaps Zygmund’s beautiful book on trigonometric series had a nontrivial influence on historical development of Fourier analysis on .
In this post I will try briefly talk about Fourier analysis on . I saw a question on mathoverflow asking what harmonic analysis we have on . I wrote the answer there, and I thought I would share it here as well.
Harmonic analysis on can be different depending on how one enumerates orthonormal basis, so called Walsh functions.
One usually considers Haar measure on which is a probability measure in the following sense: if you have a function which depends only on the first variables, say then
Orthonormal system on is called Walsh system, and perhaps it was first time introduced by Walsh in 1923: this is a certain family of functions on taking values or .
I have seen 4 different enumerations of this family of functions used in the literature (perhaps there are more). The first 3 of them (original Walsh, Paley, and Walsh–Kaczmarz) is well described in the book F. Schipp, WR. Wade, P. Simon, “Walsh series: an introduction to dyadic harmonic analysis”, see Section 1.4. The 4th one which I find kind of exceptional is not there.
1). Walsh’s original enumeration: Walsh defined his system recursively. Later, it turned out that they can be written as follows:
where with , and . In other words is the binary representation of the positive integer . The system is the Walsh’s original system.
Partial sums , where , converge to in , . The latter is equivalent to the statement that
.
Moreover converges to for any provided that . This is a consequence of Carleson’s maximal inequality
Luzin asked in the trigonometric case on the unit circle . Carleson proved for ; Hunt extended to . Billard proved for the original Walsh system for ; Sjölin extended for .
Fejer means are bounded, i.e.,
In the book F. Schipp, WR. Wade, P. Simon, there is a more “martingale proofs” of these results. Dyadic martingales naturally appear here. For example, is a martingale. In fact,
, where is -algebra, which is generated by atoms, where atoms are . In probabilistic language you condition on the first variables, and let the tail be the everything. Then, for example, the statement that
is exactly Doob’s maximal inequality. The book F. Schipp, WR. Wade, P. Simon, “Walsh series: an introduction to dyadic harmonic analysis” provides more martingale techniques. The paper A guide to Calreson’s theorem by Ciprian Demeter has more time-frequency analysis introduction.
- Paley’s enumeration (Walsh–Paley functions). In 1932, Paley proposed a different enumeration of Walsh functions
Paley’s enumeration perhaps is more close to the classical trigonometric system and . We can “identify” with as follows:
for some . This identification is not unique, for example, can be written as and also as . Among these two one choose the first one, i.e., when number of 1’s terminate starting from some place. Such bad points are not too many, they have zero measure so we do not bother with them. In particular is defined for . Then
In general, is close to or depending is even or odd.
Frequencies are independent pretty much in the same way as are independent: Khinchin’s inequality holds for both of them , and .
One has the notion of “dyadic derivative”
provided that the sum converges. Perhaps one should call it “dyadic Laplacian” instead of the derivative. One can verify that Walsh–Paley functions are eigenfunctions: . One can study polynomials of degree , say finite sums , and ask questions similar to Markov–Brother’s and Bernstein estimates for such polynomials. One can ask regularity questions: if is regular on how regular it is on with respect to dyadic derivative and vice versa. What happens with partial sums if we have some regularity in terms of dyadic derivatives etc. The results about convergence of partial sums, Carleson–Hunt theorem also hold for this enumeration. - Sneider’s enumeration (Walsh-Kaczmarz system) was introduced by Sneider:
I won’t say much about this system except that the same convergence theorems hold in this case. Perhaps an interesting observation in all these 3 enumerations is that Dirichlet Kernels
where is either Walsh, Walsh–Paley, Walsh–Kaczmarz, for are the same. In fact .
Frequencies: when working with these 3 enumerations one usually thinks that is “high-frequency” element if is large, and “low-frequency” element if is small. For example for large are high frequency elements similar to . All these 3 enumerations and concepts of frequencies tend to mimic the harmonic analysis on , where is the uniform measure on the unit circle, i.e., trigonometric polynomials.
- Probabilistic enumeration (from independent towards dependent). In this enumeration the concept of “high/low frequencies” is changed in a very opposite way. In fact, the elements will be the lowest frequencies, say frequencies of degree . Elements where has two bits in its binary representation will be frequencies of order . These are precisely Walsh functions , . In a certain sense enumeration starts from “independent frequencies to dependent ones“, and probabilistic ideas, measure concentration phenomena, dimension independent phenomena, and the concept of “independence” happen to be useful. The goal is to represent functions as follows:
Of course, if one takes , writes into Walsh–Paley system , and wants to rearrange the terms to write in the form (1) then rearranged infinity sums may give different results unless which is very unlikely for a “typical” . However, if one considers functions which depend only on the first variables the rearranged sums will be the same. This is not a big restriction on the class of functions because when one wants to prove a statement for functions on one does not start with which depends on all its variables, one starts from a “finite dimensional problem” when depends only on the first variables, and one proves the corresponding statements for such functions, requires the constants involved in the statement to be independent of , and then passes to a limit, i.e., to an infinite dimensional problem.
Thus if we consider functions which depend only on the first variables then
The latter notation seems to be compact and convenient. Here are Fourier coefficients, and are Walsh functions (enumerated in a strange way) indexed by the sets . Since this enumeration works well provided that depends only on finite number of variables, the questions about convergence, Carleson–Hunt theorem etc do not make any sense. So the direction of harmonic analysis in which Luzin was interested ends here. Well, one can still consider partial sums
where denotes the cardinality of the set , and ask similar questions
with uniform bound for all , , for all , and all depending on the first variables. Such estimate holds if and only if .
Interesting connection arises with Gauss space. For example
where is degree probabilists’ Hermite polynomial, is the standard normal Gaussian random variable, and the convergence is in the sense of distributions. By Taking tensor products of the above example one recovers Hermite polynomials on with arbitrary . So, because of this limit, it is reasonable to think about as low frequency element if is small, and high frequency element if is large (which is very opposite compared to the original Walsh enumeration). In fact, the analysis with such enumeration (1) for functions depending on variables is similar to the one on Dscrete Laplacian is defined in a different way
where . One also considers discrete gradient
One has integration by parts formula
for all functions depending on the first variables. Heat semigroup
satisfies heat equation . Discrete gradient
when applied to for some smooth bounded , converges to the classical derivative
for any smooth functions . The case is a typical application. Denoting , then degree polynomials with respect to this enumeration can be written as
and the latter expression can be extended in a natural way for all as a multilinear classical polynomial in variable of degree . Due to the identity tools from approximation theory enter in an unexpected way applied to actual degree multilinear polynomials. Discrete derivatives coincide with the actual derivatives for such multilinear polynomials. Boolean functions have an extra property and this gives more cancelations when proving statements for all real valued functions.
To summarize: Harmonic analysis on depends on how one encodes Walsh functions, i.e., frequencies in the Hilbert space , who are the low and who are the high frequency functions.