Dean’s problem and the Sherringont–Kirkpatrick model
Let be real numbers where . And let be spins, i.e., these are real numbers which take only two values or .
Question (the Dean’s problem): Given real numbers choose the signs so that to maximize the sum
Notice that if for all and then the maximum is attained if all spins , i.e., they have the same sign. In Ising model this is called ferromagnetic case. Clearly in this case the double sum is of order . Here the symbol means that they are comparable up to some multiplicative constant as . We are going to use such a notation very often.
However, if then . Then the maximum is achieved if half of the spins take value and the other half . This is called antiferromagnetic case, and clearly in this case the maximum is of order .
The problem of maximizing the double sum is also known as Dean’s problem: imagine faculty of people, and let be measuring interactions between ‘th and ‘th person. For example, if it means that ‘th person likes ‘th person; if then they dislike each other. By maximizing the sum the Dean is splitting the faculty into two groups (those ones who got assigned , and those ones who got assigned ), so that to maximize the positive interaction in a department.
When are arbitrary real numbers the problem becomes nontrivial. Let us try to see what is the typical order of the maximum of the sum when are i.i.d. symmetric +1 or -1. So in mean field theory (as it also usually happens in mathematics) we look at the sum
where , and the expectation is taken with respect to random variables . This randomized problem, also known as Sherrington–Kirkpatrick model in spin glasses, nowadays is well understood:
In fact a little bit more is known due to the generality of the proof of this fact. For , and , let , and let (this object in physics literature is called Hamiltonian, and it measures negative energy of a system). Next, take any (this is called inverse temperature ), and consider the sum
, here takes expectation with respect to uniform counting measure on , and takes expectation with respect to random variables . Clearly
therefore it would be sufficient to study the object (also called free energy. I should mention that is called partition function).
Theorem(Parisi formula, Talagrand 2006) For any we have
where infimum is taken with respect to all probability measures , and , , is the unique solution of the parabolic PDE
Remark 1(Talagrand, Panchenko): We have the following extension of Sherrington–Kirkpatrick model to a mixed d-spin case.
where , and , , is the unique solution of the parabolic PDE
Remark 2(Auffinger–Chen): In the critical case one can simplify the answer
where , and infimum is taken with respect to all nonnegative finite measures on , i.e,. , and , , is the unique solution of the parabolic PDE
To study existence/uniqueness of the minimizer it would be helpful if the expression we are taking infimum of is convex in . For example, clearly strict convexity in would imply uniqueness of the minimizer.
Remark 3: Let be convex. Then the functional is convex. Indeed, we can recognize Hamilton–Jacobi–Bellman–Isaac PDE in , namely we can “linearize” it as follows:
with boundary condition , then it follows that:
, where supremum is taken with respect to all progressively measurable (with respect to Brownian filtrations) bounded controls , and here is the standard Brownian motion. Clearly the expression under the expectation is convex in , therefore supremum of its averages is convex. So the desired functional is convex .
The problem about convexity of the functional in was open for a quite a while (see for example Panchenko’s conjecture ). Even though the solution essentially fits in the last remark this should not be very surprising, because Feynman–Kac formula together with Hamilton–Jacobi–Bellman–Isaac PDEs are not well popularized in certain communities. Another such example that comes to my mind is this mathoverflow question. Can you prove it without using any probability?
The reader notices that I also include name Isaac in the above remarks. The reason is because it is not always possible to “linearize” the PDE (i.e., write it as supremum of linear objects), unless there is some convexity involved in variable. Sometimes the points can be saddle points, and one may need to invoke representation. See, for example, the recent proof of Ehrhard inequality by Ramon van Handel where representation was used for the quantity where is a test function is the standard Gaussian measure, and .
Remark 4: With a little bit of work one can prove strict convexity of the functional in the case (by appropriate concatenation of optimal controls). This gives existence of unique minimizer .
My final remark is that if we take to be independent random variables (not necessarily identically distributed) such that , , and with some universal constant then due to theorem by Carmona–Hu all results (the limits) will be exactly the same.
Lower bound in the case of arbitrary real coefficients, and Bohnenblust–Hille inequality
Let us go back to the original question when are real numbers. Let
where are real numbers, and .
Question: How can we estimate (from above or below) in terms of the real numbers ?
By certain reasons I will prefer to work with instead of , however, notice that in most of the cases these two quantities are comparable with each other.
By playing vs game, one trivial lower bound would be
. We can even get rid off the number by replacing the measure with uniform counting measure on extreme points of the set where we integrate convex function, namely uniform counting measure on (as we always do in this blog). So in fact we have
where in the right hand side the sum is over all subsets with cardinality . In fact this seems to be a bad lower bound, for example if then in average the left hand side is of order whereas the right hand side is of order . The average case suggests that one may have the following inequality
with some universal positive constant . It turns out that this is true, and in fact a little bit more is true.
Theorem(Defant–Mastylo–Perez): For any on we have
with some universal constant .
There is an interesting development of such type of inequalities, especially about improving the constant in the right hand side. In the beginning the constant was of the form , then it was replaced by , and finally due to recent breakthroughs by Bayart–Pellegrino–Seoane-Supelveda (see also Nunez-Alarcon–Pellegrino–Seoane-Supelveda–Serrano-Rodriguez) in pushing the constant to subexponential order for a very similar problem (on the torus instead of the Hamming cube ), it became clear that one can get the same bound on the Hamming cube as well.
Remark 5: If we consider in the Sherrington–Kirkpatrick model then one can extract that , here poly means something like of order with some universal constant . On the other hand, . Therefore a natural question arises
Conjecture: Show that there exists a universal constant such that for any on with any we have
Exercise: use theorem of Defant–Mastylo–Perez to show that in Aaronson–Ambainis conjecture one can get a lower bound for polynomials whose coefficients are constant in absolute value, say for all .
The proof of Theorem by Defant–Mastylo–Perez estimate is quite nice. Arguments and main ideas to prove such estimate originated in series of works, where the earliest one goes to Littlewood’s 4/3 inequality.
Following Defant–Mastylo–Perez I will illustrate the proof in the case of -homogeneous polynomials, namely for .
Proof of .
Let us sketch the main idea and then give the full proof of it. Let be the best possible constant in the inequality independent of (the reason that such constant exists follows from Theorem 34 due to Ron Blei). It is not hard to see that is nondecreasing. Indeed, if is almost optimizer which achieves the constant up to some small additive error on the Hamming cube , then we have
where we have used that is -homogeneous polynomial with the same norm, and whenever . Next instead of directly bounding the constant one can show
Lemma 1: For all we have
with some universal constant .
Finally iterating the lemma, i.e.,
we obtain that , and the result follows due to the fact that .
The proof of Lemma 1 contains 3 ingredients.
Lemma 2 (Ivanisvili-Tkocz) For any we have for all .
With slightly worse constants the Lemma 2 can be obtained from real hypercontractivity. The bound comes from complex hypercontractivity (and this is currently the best upper bound that I know).
Next, let such that is strictly increasing. If , then we will just use the notation . Instead of , for some with it will be convenient for us to use the notation with . If (where and are disjoint), and , we will define to be a map such that and . Notice that we can also write where
Lemma 3 (Blei’s inequality) For all we have
, where ‘s are real numbers, and denotes complement of in .
Notice that the right hand side contains terms and the map may not be strictly increasing, so we have many extra numbers on the right hand side even though they do not participate in the left hand side. In what follows if is not strictly increasing then we will just put , where is increasing rearrangement (for example, ). In case is only non-decreasing (but not strictly increasing) then we just put . Finally, we need the last polarization inequality invented by Harris.
Lemma 4: For any with we have
, where Markov numbers with if , and if .
The proof is quite standard in approximation theory. are ‘th or ‘th coefficient of Chebyshev polynomials of degree d.
Remark 6: It is not hard to see that for and we have for some universal constant .
Now we combine the lemmas 2,3 and 4 in order to prove the lemma 1.
Take an arbitrary set with . Then
Therefore Lemma 3 implies
. Therefore (again by definiton of ) we obtain
. Taking , using Stirling’s formula, Remark 6, and monotonicity of , we obtain . The lemma 1 (and hence the theorem) is completely proved. .