
  Why isn't the path integral rigorous?

+ 8 like - 0 dislike
12087 views

I've recently been reading Path Integrals and Quantum Processes by Mark Swanson; it's an excellent and pedagogical introduction to the path integral formulation. He derives the path integral and shows it to be: $$\int_{q_a}^{q_b} \mathcal{D}p\,\mathcal{D}q\,\exp\left\{\frac{i}{\hbar}\int_{t_a}^{t_b} \mathcal{L}(p, q)\,dt\right\}$$

This is clear to me. He then likens it to a discrete sum $$\sum\limits_{\text{paths}}\exp\left(\frac{iS}{\hbar}\right)$$ where $S$ is the action functional of a particular path.

Now, this is where I get confused. He claims that, because some of these paths are discontinuous or non-differentiable, and because these "un-mathematical"1 paths cannot be disregarded, the sum is not mathematically rigorous, and thus that the transition amplitude described by the path integral is not rigorous either. Please correct me if I am mistaken here.

Furthermore, he claims that this can be alleviated through the development of a suitable measure. There are two things that I don't understand about this. First, why isn't the integral rigorous? Though some of the paths might be difficult to handle mathematically, they aren't explicitly mentioned at all in the integral. Why isn't the answer that it spits out rigorous? And, second, why would a measure fix this problem?


1 Note: this is not the term he uses

This post imported from StackExchange Physics at 2015-05-13 18:55 (UTC), posted by SE-user jimmy
asked Mar 11, 2015 in Theoretical Physics by jimmy (40 points) [ no revision ]
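
For concreteness, the "sum over paths" is usually read as the $N \to \infty$ limit of a time-sliced expression; here is a schematic sketch in the notation of the question, with the momentum integrations carried out and the (divergent) normalization constant $C_N$ left implicit:

$$\int_{q_a}^{q_b}\mathcal{D}q\,e^{iS/\hbar} \;=\; \lim_{N\to\infty} C_N \int dq_1\cdots dq_{N-1}\, \exp\left\{\frac{i}{\hbar}\sum_{k=0}^{N-1}\epsilon\, \mathcal{L}\!\left(\frac{q_{k+1}-q_k}{\epsilon},\,q_k\right)\right\},\qquad \epsilon=\frac{t_b-t_a}{N},\quad q_0=q_a,\; q_N=q_b.$$

Whether this limit exists, and whether it is independent of how the slicing is refined, is exactly what the discussion of rigor below is about.
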
Short answer: To define an integral rigorously, it's not enough to just say "and now take the limit $N \to \infty$". You need to prove that your discrete sum converges to something, and that it doesn't matter how you take the limit.

This post imported from StackExchange Physics at 2015-05-13 18:55 (UTC), posted by SE-user Javier
@Javier Badia does this have to do with non-differentiable paths or is it a separate issue?

This post imported from StackExchange Physics at 2015-05-13 18:55 (UTC), posted by SE-user jimmy
Can't one make everything work with proper regularization, and doesn't this regularization allow all the cases actually relevant to physics (as opposed to all possible mathematical corner cases)?

This post imported from StackExchange Physics at 2015-05-13 18:55 (UTC), posted by SE-user DanielSank
It's 'excellent and pedagogical' compared to what other presentation?

This post imported from StackExchange Physics at 2015-05-13 18:55 (UTC), posted by SE-user NikolajK
It is indeed folklore that the path integral is not mathematically rigorous, or more precisely, that the rigorous maths has not yet been developed. This is typical in physics. But the real problem is that many people do not know they are hand waving when they are doing it.

This post imported from StackExchange Physics at 2015-05-13 18:55 (UTC), posted by SE-user Jiang-min Zhang
@NikolajK Just in general. It is my first introduction to path integrals, but I am finding that the book is not too difficult to follow.

This post imported from StackExchange Physics at 2015-05-13 18:55 (UTC), posted by SE-user jimmy

See also this discussion.

3 Answers

+ 6 like - 0 dislike

There are several points:

  • The first is that for usual self-adjoint Hamiltonians of the form $H=-\Delta +V(x)$, with a common densely defined domain (and I am being very pedantic here mathematically, you may just ignore that remark) the limit process is well defined and it gives a meaning to the formal expression

    $\int_{q_a}^{q_b} \mathcal{D}p\,\mathcal{D}q\,\exp\left\{\frac{i}{\hbar}\int_{t_a}^{t_b} \mathcal{L}(p, q)\,dt\right\}$

    by means of the Trotter product formula and the corresponding limit of discrete sums. So the object has a meaning most of the time, as long as we see it as a limit. Nevertheless, it would be desirable to give a more direct mathematical interpretation as a true integral over paths. This would allow for generalizations and flexibility in its utilization.

  • It turns out that a suitable notion of measure on the space of paths can be given, using stochastic processes such as Brownian motion (there is a whole branch of probability theory that deals with such stochastic integration, centred on the Itô integral). To relate this notion to the situation at hand there is however a necessary modification to make: the factor $-it$ in the quantum evolution has to be replaced by $-\tau$ (i.e. it is necessary to pass to "imaginary time"). This makes it possible to single out the correct Gaussian factors that now come from the free part of the Hamiltonian, and to recognize the correct Wiener measure on the space of paths. From a mathematical standpoint, the rotation back to real time is possible only in a few special situations; nevertheless this procedure gives a satisfying way to mathematically define Euclidean-time path integrals of quantum mechanics and field theory (at least the free ones, and also in some interacting cases). There are recent works of very renowned mathematicians in this context, most notably the work of Fields medalist Martin Hairer (see e.g. this paper and this one, or the recent work by A. Jaffe that gives an interesting overview; a more physical approach is given by Lorinczi, Gubinelli and Hiroshima among others).

  • The precise mathematical formulation of the path integral in QM is the Feynman-Kac formula, and the precise statement is the following:

    Let $V$ be a real-valued function in $L^2(\mathbb{R}^3)+L^\infty(\mathbb{R}^3)$, $H=H_0+V$ where $H_0=-\Delta$ (the Laplacian). Then for any $f\in L^2(\mathbb{R}^3)$, for any $t\geq 0$: $$(e^{-tH}f)(x)=\int_\Omega f(\omega(t))e^{-\int_0^t V(\omega(s))ds}d\mu_x(\omega)\; ;$$ where $\Omega$ is the set of paths (with suitable endpoints, I don't want to give a rigorous definition), and $\mu_x$ is the corresponding Wiener measure w.r.t. $x\in\mathbb{R}^3$.
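
As a purely illustrative Monte-Carlo sketch of this formula in one dimension: the convention $H_0=-\frac{1}{2}\frac{d^2}{dx^2}$ (so Brownian increments have variance $dt$), the potential $V(x)=x^2/2$, and the test function $f(x)=e^{-x^2/2}$ below are assumptions made here for the example, chosen because $f$ is then the ground state with energy $1/2$, so the exact value at $x=0$, $t=1$ is $e^{-1/2}\approx 0.607$.

```python
# Monte-Carlo check of the Feynman-Kac formula in 1d:
#   (e^{-tH} f)(x) ~ E[ f(W_t) * exp(-integral_0^t V(W_s) ds) ],  W_0 = x,
# with H = -(1/2) d^2/dx^2 + V and W a standard Brownian motion.
import numpy as np

def feynman_kac(f, V, x0, t, n_steps=2000, n_paths=50000, seed=1):
    rng = np.random.default_rng(seed)
    dt = t / n_steps
    x = np.full(n_paths, x0, dtype=float)
    int_V = np.zeros(n_paths)
    for _ in range(n_steps):
        int_V += V(x) * dt                            # accumulate the integral of V along the path
        x += rng.normal(0.0, np.sqrt(dt), n_paths)    # Brownian increment
    return np.mean(f(x) * np.exp(-int_V))

V = lambda x: 0.5 * x**2           # harmonic potential (illustrative choice)
f = lambda x: np.exp(-0.5 * x**2)  # ground state of H, energy 1/2
print(feynman_kac(f, V, x0=0.0, t=1.0))  # expect roughly exp(-1/2) = 0.607
```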

This post imported from StackExchange Physics at 2015-05-13 18:55 (UTC), posted by SE-user yuggib
answered Mar 11, 2015 by yuggib (360 points) [ no revision ]
thanks, great answer

This post imported from StackExchange Physics at 2015-05-13 18:56 (UTC), posted by SE-user jimmy
+ 1 like - 0 dislike

The Lagrangian involves derivatives but the differentiable functions have measure zero in any useful definition of a measure on a function space (for example the Wiener measure), so they would integrate to zero. This makes the introduction of the approximate path integral mathematically dubious.

The second dubious point is that the sum is highly oscillating and not absolutely convergent. Hence unlike for a Riemann integral, the sum doesn't have a sensible limiting value as you remove the discretization. The sum depends on how one orders the pieces and (as is well-known for many divergent alternating series) can take any desired value depending how you arrange the terms. Note that there is no mathematically natural ordering on the set of paths. Thus the recipe doesn't give a fixed result.
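
To make the rearrangement point concrete, here is a small Python illustration (the choice of series and of target values is arbitrary and only for illustration): the alternating harmonic series $1-\frac12+\frac13-\cdots$ sums to $\ln 2$ in its natural order, but a greedy reordering of the same terms can be steered towards any prescribed value.

```python
# Riemann rearrangement demo: the same conditionally convergent terms,
# summed in a different order, converge to a different value.
import math

def rearranged_partial_sum(target, n_terms=200000):
    """Greedily reorder 1, -1/2, 1/3, -1/4, ... to approach `target`."""
    pos, neg, s = 1, 2, 0.0
    for _ in range(n_terms):
        if s <= target:
            s += 1.0 / pos   # next unused positive term 1/pos
            pos += 2
        else:
            s -= 1.0 / neg   # next unused negative term -1/neg
            neg += 2
    return s

print("natural order      ->", math.log(2))                    # ~0.6931
print("rearranged to 3.0  ->", rearranged_partial_sum(3.0))
print("rearranged to -1.0 ->", rearranged_partial_sum(-1.0))
```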

Therefore the informal introduction is only an illustration - an informal extrapolation of what can be made well-defined in finite dimensions (through treating $i$ as a variable and analytic continuation from real $i$ to $i=\sqrt{-1}$). In quantum field theory it is just used as a (very useful) formal tool to be used with caution since it may give wrong results, and only practice teaches what caution means.

Making the path integral well-defined in the setting discussed requires far more mathematical background and has been rigorously achieved only for quadratic Lagrangians in any dimension, and in dimension $<4$ for some special classes of nonquadratic ones. The mathematically simplest case is in $d=2$, where the construction of the measure (for real $i$), and the analytic continuation to $i=\sqrt{-1}$ are rigorously derived - it takes a whole book (Glimm and Jaffe) to prepare for it.

answered May 17, 2015 by Arnold Neumaier (15,787 points) [ revision history ]
edited May 17, 2015 by Arnold Neumaier

The derivative term $\dot{x}$ in the path integral does not mean that the function $x(t)$ is differentiable. It means something else, namely $\frac{x(t+\epsilon)-x(t)}{\epsilon}$ for the $\epsilon$ of your regularization. This quantity does not commute with $x(t)$ in the path integral, because $x(t+\epsilon) \dot{x} - x(t)\dot{x}$ is $\frac{(x(t+\epsilon)-x(t))^2}{\epsilon}$, which is a fluctuating quantity that averages to 1 as a distribution in the standard quadratic path integrals.

This is not a problem in defining the path integral rigorously, because Feynman never expected that the typical $x(t)$ would be differentiable when writing down $\dot{x}$ in the path integral! He explicitly said it would be nondifferentiable, and he identified the non-differentiability as the path-integral origin of the Heisenberg commutation relation, whose imaginary time form I gave above.
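
A minimal numerical check of this scaling, in imaginary time and with the illustrative choice $\hbar=m=1$ (so increments of the free-particle paths are Gaussian with variance $\epsilon$): the squared increment divided by $\epsilon$ stays of order one as $\epsilon\to 0$, while the would-be velocity $|\Delta x|/\epsilon$ blows up, i.e. typical paths are continuous but nowhere differentiable.

```python
# Sample increments x(t+eps) - x(t) ~ Normal(0, eps) of a discretized
# Euclidean free-particle path (hbar = m = 1) and compare the scaling of
# <dx^2>/eps (finite) with <|dx|>/eps (divergent as eps -> 0).
import numpy as np

rng = np.random.default_rng(0)
for eps in [1e-1, 1e-2, 1e-3, 1e-4]:
    dx = rng.normal(0.0, np.sqrt(eps), size=200_000)
    print(f"eps={eps:7.0e}   <dx^2>/eps = {np.mean(dx**2) / eps:5.3f}"
          f"   <|dx|>/eps = {np.mean(np.abs(dx)) / eps:10.1f}")
```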

There is never an expectation, when you write a path integral, that the differentiation operations are being applied to differentiable functions. When you are done with the continuum limit, the differentiation operators are applied in the distributional sense to distributions; the typical paths in the path integral are well defined as distributions in nearly all cases of interest. Further, the products of distributions in the path integral which have coinciding singularities are not interpreted as pure products either, as these would not make sense, but rather as products in a regularization $\epsilon$, with subtractions which make them well defined in the limit of small $\epsilon$. The two ideas together, of operator products and distributional derivatives, give meaning to every term in the Lagrangian, without any a priori assumptions on the character of the path, other than that it can be sampled by Monte-Carlo sampling (which is true numerically, and defines a probabilistic algorithm for making complete sense of the procedure).

The proper interpretation of the path integral is by a limiting procedure which introduces an $\epsilon$ and takes a limit at the end, i.e. a regularization and renormalization. This is also true in 0+1d quantum mechanics, as I tried to make clear using the noncommutativity of $\dot{x}$ and $x$. In 1d, there is no remaining problem with products in the path integral, other than this finite noncommutativity, and mathematicians have already defined a rigorous calculus, Ito calculus, for the imaginary time formulation of 0+1d. In field theory, there are also regularization issues with coinciding products, and the commutation relations expand to OPE relations, and you can't use the same calculus. This is also true for Levy field theory, where you replace the $\dot{x}$ term with the log of a Levy distribution between consecutive steps, to describe a particle making a Levy flight. This has a continuum limit also, but it is not described by Ito calculus, rather by a different calculus which has not been described by mathematicians.

The only issue with the path integral is how to take limits of statistical sampling for small lattices, or relaxing any other regulator. The limit is universal for small $\epsilon$, it doesn't depend on the discretization, just as the limits of discrete difference equations are universal in the limit of small $\epsilon$ and give you the differential equations of calculus. Even in calculus, it is difficult to prove convergence of arbitrary discretizations, the standard proofs of existence/uniqueness of differential equations iterate the integral form of the equation and don't bound the convergence of Runge-Kutta schemes rigorously, as the iterated integral equation already takes a limit implicitly inside the integral, and the convergence proof is made easier. You can't use such tricks in quantum field theory, except when you formulate the theory as an SDE (which is what Hairer does, and he iterates an integral equation to find his solutions).

The formal mathematics for describing the limits of measures to continuum measures is complicated by the difficulties mathematicians have in making a natural measure-friendly set theory, as current set theory makes any discussion of measure a minor nightmare, because you have to constantly keep in mind which sets are measurable and which are not when using all the classical theorems. This is not acceptable, and the better solution is to work in a universe where every subset of [0,1] is automatically measurable, and keep in mind that there are exceptions to the Hahn-Banach theorem, to the basis-existence theorem, to the prime ideal theorem, in cases where there is an uncountable choice required in intermediate steps. Since all the actual mathematical applications require only countably many consecutive choices, the exceptions are isolated from mathematical practice by a brick wall, and measure paradoxes hardly ever bother probabilists. When they have a construction which they prove is true for all real numbers, they will lift it to "the values" of random variables without worrying.

@RonMaimon: I know the conventional argument to make both points work heuristically. But the purpose of my answer was to explain the comments of Mark Swanson mentioned in the original question, not to justify the derivation.

The real problem is to show that the final limit exists and has the correct properties. If you work in a universe where all sets are measurable you need to redo most of functional analysis from scratch before you have a rigorous version, as much of our argumentation in functional analysis is based on the axiom of choice, which you have to renounce.

But since we all were trained in ZFC, it is very hard for any of us (including you) to make sure that nowhere in the arguments use is made of the axiom of choice. Thus you'd first have to write papers on redoing standard analysis in the new framework. And after having constructed the Euclidean measure you need to prove that the limit is covariant and satisfies the Osterwalder-Schrader axioms. Then you would have to reprove that the Osterwalder-Schrader reconstruction remains valid without the axiom of choice. All this is highly nontrivial (though in large parts pedestrian) work - just saying ''it is easy to see that everything except ... remains true'' won't convince anyone except yourself. Indeed, it is very easy to fool oneself. So you'd probably have to provide a computer-assisted proof, like Tom Hales did for the proof of the Kepler conjecture, which was too difficult for other mathematicians to check by conventional means.

Note that even Hairer accepts the conventional rigorous probabilistic setting, and he got the Fields Medal since his results were proved in this framework. (By the way, didn't you promise a review of his main work?)

@ArnoldNeumaier: The results of functional analysis which you use in physics do not ever rely on the axiom of choice applied to collections of size continuum or larger. They are always using the axiom of dependent choice in the cases where they are applied. The uniform generalizations of these theorems to ridiculously infinite cases that never show up in applications are what require uncountable choice.

For example, when you are using the Hahn-Banach theorem in a separable Hilbert space, you have a countable basis for the Hilbert space, then you are only using dependent choice. When you extend this theorem to apply to arbitrary nonseparable spaces, you can produce situations where the extended theorem makes a contradiction with probability.

It is much easier to redo all of real analysis in a model where you have all sets measurable and simultaneously dependent choice, as was clear already in the 1970s. The only time you use choice in actually useful arguments, the number of sets in the product is countable, and the choices are simply consecutively specifying a list of objects. If you have to specify an inductive list that is as long in length as the real numbers are wide in girth, then you are in trouble, but then you are doing something silly, as can be seen immediately by asking how you inductively go through all the Gaussian random real numbers one could generate (this question does not make sense in usual mathematics).

Your argument is the same FUD that mathematicians have been spreading for 100 years already against rejecting continuum choice. Continuum choice, unlike dependent choice, is ridiculous, it has no benefits for mathematics, and it should have been scrapped in 1972. This is not made clear, because countable dependent choice is actually necessary for clean arguments in mathematics, even for proving the countable union of countable sets is countable, or that you can find an infinite sequence in a compact set, or anything else intuitive and simple like this. There are no intuitive cases where you need to specify a list the size of the continuum.

I am not "trained in ZFC", I never was. I rejected the well-ordering of the continuum when I was 18 years old, and from this point on, I had to carefully kept track of which theorems use continuum choice, and which required only countable dependent choice, and I only believed the ones that use dependent choice. When you admit random real numbers into your universe, uncountable choice is simply false, and I didn't want to believe false theorems. This position made it difficult to read mathematics for approximately 10 years, because I couldn't translate theorems other people proved and use them without worrying. But then I learned about V=L, and I could immediately see what to do--- you simply interpret all the theorems crazy people (i.e. non-probabilist mathematicians) prove as theorems in L, and you keep in mind that the L real numbers are measure zero.

When you do this, you can use all the results in the literature without worrying--- standard arguments are in L, arguments in probability are in the full universe, and any contradiction between the two disappears, and you are comfortable again. The theorems which are true are stated as follows: "For any Hilbert space H whose basis is in L, and a linear operator on H" etc etc. Then you just have to know that L is to be viewed as a measure-zero dust in the real (i.e. measurable) universe.

Hairer claims to accept the conventional probabilistic setting, and he claims that his results are proved within this setting, but you have never tried to actually unpack his theorems. The arguments require you to already accept the properties of Wiener chaos and the existence of solutions of SDE's on infinite volume spaces, and these results are not proved by Hairer, they are just probability folklore.

But he further makes arguments in his paper which are inconsistent with the standard view: for example, he blithely talks about using two identical "realizations" of specific random objects; if you read the paper on coupling, he defines a coupling between two solutions with the same realization of the noise. To define this intuitively is immediate, but to define it in the standard framework requires contortions.

I can review Hairer's work, but you won't convince me that it is in the standard framework of probability, no matter what he says. I read his work, and I know how he proves the results, and it is just extremely easy to slip up and start to talk about solutions of SPDE's, where the functions are random variables, as actual functions. Such reasoning obviously will never lead to paradox, because they are functions in "measurable reality", but they don't lead to a paradox only because you are assuming that none of the arguments you are using produce non-measurable sets. They won't, because nobody would define such stupid things in the literature related to PDEs or SPDE's.

To see how you can make a paradox, consider the space of all solutions to Hairer's translationally invariant SPDE. Define an equivalence relation between functions which says f and g are identical when there is a translation T which takes f to g. Then choose one representative for each equivalence class, and call this the "translation primitive" function, all the rest are translates of the primitive (uncountable axiom of choice). Then ask "what is the probability density on translations which move the solution to a primitive solution?" This doesn't make sense, as it would be a uniform measure on R^3. This is an obvious paradox, but you can make arbitrarily subtle ones.

Hairer is simply sidestepping these questions by using results from SPDE's, which is a field where measure paradoxes are routinely ignored. There is no patience for this nonsense among probabilists, who have better things to worry about than measure paradoxes. Hairer doesn't make political trouble, so he gets prizes. But I don't mind making political trouble where it is necessary. I actually thought there would be more controversy about his stuff, because of this, but it seems mathematicians have grown up regarding probability in the last decades.

"It is much easier to redo all of real analysis in a model where you have all sets measurable and simultaneously dependent choice, as was clear already in the 1970s."

Who did it to the satisfaction of the mathematical community?

"you are comfortable again"

Maybe you are comfortable, but you don't live alone in the world, and to be believed you need to make the experts in the field comfortable.

Wiener measure is rigorously proved to exist in infinite volume spaces of any finite dimension, and so is the existence of solutions of ordinary SDE's using Itô's integral. I haven't followed the literature on stochastic PDEs, but it is precisely there that Hairer seems to do new things, and the experts in the field have found his work convincing enough to bestow on him a Fields Medal. In particular, realizations of a random process are well-defined trajectories if the process is well-defined.

But where is the Fields Medal for functional analysis without unmeasurable sets that proves otherwise intractable things?

"it seems mathematicians have grown up regarding probability in the last decades."

Their growing up consisted in being well trained in rigorous stochastic processes along the traditional lines, and in understanding things in these terms.

@ArnoldNeumaier: How naive you are! Do you really think that the "proofs" of the existence of Wiener measure in infinite volume are immune to paradox? The mathematicians use their intuition left and right when proving these theorems! They prove bounds for real numbers, then apply these bounds to random variables, and never do they check that the sets involved are measurable, because the people who prove the probability theorems are not the same people who prove the set theoretic nonsense, and they don't even appreciate that there could be a problem in applying a theorem proved on reals to random variables (at least not past their first probability course).

That doesn't mean that the literature is full of wrong results, it isn't, because it is consistent to assume all sets are measurable, so you never hit a contradiction if you just do the normal things day to day without worrying about measurability. What it does mean is that the literature is not at all rigorous; hardly any of the results in probability, especially about Wiener measure in higher dimension, are really proved in a way providing a sketch of an embedding into ZFC (Wiener's original construction does do this for the 1d Wiener measure, but even this work is simplified by going to a measurable universe). There is an implicit assumption in all the work about the measurable character of all the operations that are applied to random variables, and were you to construct a non-measurable example, which is easy to do, only then would they whip out the machinery, and say "oh yes, this is false, because this set is not measurable". But in standard mathematics, you never know when a set is nonmeasurable, because for all you know, somewhere in the lemmas proving a particular real number result, you have an application of the Hahn-Banach theorem, and then, presto, some set ends up non-measurable. This doesn't happen only because nobody really ever uses continuum choice in practice, and all the theorems are basically immune to this worry. But this is not at all a theorem; the opposite is a theorem in ZFC, you can end up with nonmeasurable sets all the time.

Every probabilist already automatically works in the measurable universe implicitly, whenever they think about their random functions as taking actual values, and the acceptance of their theorems by the wider mathematical community in the 2000s (like the rejection of their theorems in the 1940s) is entirely a social function of what arguments are admitted as allowable by convention and what arguments are rejected as disallowable. I don't operate by convention; when I prove something is true, I expect it to be true in a formal system, and that I personally see the sketch of the proof in the formal system.

This social thing is a disease. It is pure barrier to entry, and there is no way to overcome it by playing along. It is extremely important to ignore all other people regarding this stuff, because they are contemptibly stupid regarding this, due to the weight of history that they must pay lip service to. I genuinely don't know whether other people will accept what I am saying here, but I don't care, so long as it is true, and it is. It doesn't matter what other people think. They'll eventually catch up.

@RonMaimon We had previously a discussion about the topic, and I just want to make a small comment.

First of all, a bibliographical remark: many many mathematicians are well aware of all the pathologies that occur with or without the axiom of choice, and of whether it is used for a result or not. And there are results that do not hold without full choice even in reasonable weaker versions, so one has to be particularly careful. A good account of the axiom of choice may be found e.g. in the book "Axiom of Choice" by Herrlich. In addition, the constructivist branch of mathematics (not very popular, but nevertheless interesting) does not use the full axiom of choice but only constructive weaker versions. Many results of real analysis can be derived in this context, as discussed in an interesting book by Bishop (Foundations of Constructive Analysis).

Then I would also make a point of principle, that is more or less the same one that I made in our previous discussion (so I know you will not be convinced, but I make it anyway). ZF, ZF+CC, ZF+DC ... are logical theories that are weaker than ZFC (I use the Bourbaki definition: a logical theory $T$ is stronger than $S$ if $T$ and $S$ have the same signs, every axiom of $S$ is a theorem of $T$, and every schema of $S$ is a schema of $T$; it follows that every theorem of $S$ is a theorem of $T$). So if you prove something in ZF, ZF+CC, ..., you immediately have a theorem of ZFC. On the other hand, the Solovay model is "disjoint from/in contradiction with" ZFC, since there are theorems of ZFC that are false in the Solovay model and vice versa (and measurability of sets is not the only difference; for example, in the Solovay model the dual of $L^\infty$ is $L^1$). And when one has to deal with objects that are a little bit more abstract than "simple" real analysis, e.g. distributions (that are usually defined as duals of a Fréchet space) or Banach algebras (e.g. the algebra of CCR is always non-separable), it may be very difficult to distinguish which ZFC results may be true, false or undecidable in the Solovay model. Of course all the results that hold in ZF+DC hold also in ZFC, but I am afraid that when you deal with duality, compactness, non-separability, it is not difficult to run into objects that are only definable with full choice. So what you may gain in dealing with reals in the Solovay model, you may then lose in defining other structures that are often very important and interesting also from a physical standpoint.

In my opinion the constructivist program (and in general doing math in ZF + something else that is not choice) is a little bit like string theory (but in a different context): it is promising, because the unintuitive results due to choice can be avoided; but on the other hand it is also a nightmare, or in some cases impossible, to prove many established results that are theorems of ZFC and are, undoubtedly, very important in math itself and in applications: e.g. some results of functional analysis, algebras of operators, geometry.

@yuggib: could you please link to the previous discussion you refer to? 

@yuggib: The content of your comment is that "mathematicians are used to ZFC by now, and don't want to label all theorems which use uncountable choice as second-class theorems". But unfortunately, no matter what, you need to label every single theorem which uses uncountable choice, because these theorems can't be applied to random variables; they are false for random variables. You are never going to stop people from imagining that random variables take on values, they still talk this way after a century of warnings against it, and this intuition is only reinforced by modern probability. Can an Ising model theorist really believe that there is no such thing as an equilibrated Ising model configuration on the infinite 2d plane? Currently they must, because this concept doesn't make sense in ZFC. See this answer: http://math.stackexchange.com/questions/171924/advantage-of-accepting-non-measurable-sets/1287938#1287938 .

A theorem that you know for all reals which cannot be applied to Gaussian random variables is less than useless: it requires you to doubt the concept of "a Gaussian random real", and it requires you to reprove every single theorem twice, once for the set of real numbers admitting choice, then again more carefully for random variables not admitting choice. The only things you throw away for random variables are exactly those theorems that used uncountable choice somewhere. So you need to keep track of when you are using choice already anyway, to make sure your theorem applies to random variables.

A theorem that is true for all Ising model configurations but which can't be applied to any equilibrated Ising model configuration is less than useless in the same way, except more obviously. In the linked answer, I gave an example of an Ising model paradox: define two configurations of the 2d Ising model on $\mathbb{Z}^2$ as equivalent if they can be translated into each other. Choose one configuration from each equivalence class, and call it the primitive configuration. Given any equilibrated configuration of the Ising model, you can translate it into a primitive configuration, right? So what is the probability that this translation moves k steps to the left and l steps up? This, if it existed, would be a uniform probability measure on $\mathbb{Z}^2$, which doesn't exist. This is a contradiction. The contradiction is either that I pretended that I can equilibrate an Ising model on the infinite plane, or else that I pretended that I can choose one configuration from each equivalence class of translationally equivalent configurations. But I can actually equilibrate an Ising model using only a potentially infinite limit, while selecting a primitive configuration from each class requires a continuum number of independent choices. So which pretense is worse?

The theorems which are proved using uncountable choice will forever be distinguished in an essential way from the theorems proved without it, no matter what your axioms, because regardless of axioms, these theorems cannot be used for random variables. When you don't clearly put a spotlight on them and distinguish them as different, you have a headache when doing probability, in that you can't apply all the results you know about real numbers to random variables straightforwardly, so that only the study of random variables suffers. This is basically only the mathematics needed for path integrals and statistical physics, so the people who are affected are those furthest away from logic and set theory, and so in the position least able to do anything about it.

The Solovay model is different from ZFC in ways that are surprising to someone trained in ZFC, but which are 100% correct for the probabilistic intuition. An example is this mathoverflow question: http://mathoverflow.net/questions/49351/does-the-fact-that-this-vector-space-is-not-isomorphic-to-its-double-dual-requir . In this case, you can see explicitly from the argument I gave that the reason the double dual doesn't contain more maps is that it is probabilistically inconsistent to apply these maps to random objects, so when you are choosing things at random, you can't objectively admit more maps in the double dual than the maps corresponding to the original, surprisingly small, space. That's not really because you are artificially restricting the objects in the double dual, it is because you are expanding the dual! You are consistently including random objects as well as determined objects. In the same way, the Solovayan result that the dual of $L_\infty$ is $L_1$ holds because, when you admit random objects in $L_\infty$, you can't define extra maps in the presence of consistent randomness. These results are not surprising, as when you are doing probability, you need to consider that $L_\infty$ and $L_1$ are the correct dual pair, and this is actually convenient for proofs, because it is the natural limit of the $L_p$-$L_q$ duality.

The real issue here is something that is only appreciated by set theorists. The real issue is that powersets which contain an infinite binary tree of choices can't be considered anymore as objective once-and-for-all collections with definite properties, because they always admit forcing extensions. When you force elements into such a set, you are precisely saying that you introduce elements whose logical properties correspond to those of randomly chosen elements from that set, and since an infinite list of random choices can always be made logically consistent, nobody can stop you from extending models in this way, by shoehorning new random elements into powersets or any sets of the same size as powersets. The freedom of forcing allows set-theorists to toggle between models where completely different properties are true for powersets (and only powersets), and this causes a headache to someone whose intuition is that powersets are definite objects defined once-and-for-all, because, well, are these properties true or not?

The ZFC trained intuition suffers from forcing constructions worse than it does from probability paradoxes. This is why I think that while Solovay's model is mathematically enough to make day-to-day probability work convenient, it is not politically enough to get mathematicians to switch en-masse to a new viewpoint regarding sets. The reason is that Solovay's model by itself does not provide a good consistent ontology for sets. So that there is no convenient framework which allows a mathematician to say "this statement is absolutely true of this powerset", "this is conditionally true depending on extra knowledge", and "this is false".

While ontology is silly for a positivist (it can be toggled at will so long as the foundation of the observable results of computation is preserved), it is important to have a consistent ontology for a human being to get along with a mathematical argument; otherwise it is difficult to internalize arguments.

The current ontology of sets is broken in two ways: it doesn't allow you to admit random sets or random reals in your universe, which is the problem for a statistical physicist or quantum field theorist, and it doesn't allow you to imagine different universes related by forcing as coexisting side by side, except where one is the "real" universe, and the other is some bastardized countable model that has been created only to trick you into believing it is the real universe. There is no asymmetry between the different forcing models except for prejudice. It is perfectly ok to have the Solovay model as your ontology, and consider the standard model as the bastardized one, when new choice functions have been forced in. But it is also legitimate to have a choice model as your ontology, and consider the Solovay model as the bastardization. Solovay himself said in his paper that he considered the Solovay model as "not the real set universe" (this is pure speculation, but I believe that had he omitted this single non-mathematical statement from his introduction, he would have been awarded the second Fields Medal for logic--- I speculate that Cohen was unhappy that the forcing model was characterized by Solovay as less "real" than the standard model of sets. Solovay's paper introduced a now standard procedural formal mechanical presentation of forcing which has absolutely no intuitive content, and this replaced Cohen's computational and intuitive presentation, which made it completely obvious what forcing means--- shoehorning in random elements--- and that any forcing model is philosophically as good ontologically as any other).

But I really believe that the proper ontology, while objectively arbitrary to a positivist, needs to be chosen to be maximally convenient, in that it needs to allow a mathematician to toggle between the different forcing models without any philosophical difficulties. It needs to allow a probabilist to imagine that there are random variables with random values, and simultaneously allow a set theorist to well-order all the deterministic constructions that the probabilist has made, without any contradiction between the two views. This should not be a problem after a proper reexamination of the notion of powerset.

I think the proper way is to philosophically get rid of the notion of the power-set as producing a fixed definite set, and replace it with a different notion of the power-set as producing any one of an infinite proper-class collection of all sets which can be interpreted as a power-set in some forcing extension. There is no single "power set" in this ontology, no one definite set object corresponding to the real numbers; there is only a forever extending family of powersets related as a poset (or as an infinite boolean algebra) by containment. Each one of these sets can be taken countable, and you can step up powersets as you like, this corresponds to a forcing extension, so long as you stop at a place consistent with the operations of ZF set theory, so that wherever you stop for a set realization, the maps revealing the countability of the powerset are not contained in the powerset of the powerset. The axiom of choice is distinguished by the fact that it makes a choice for each element of a set, and when you extend the set by stepping up, you don't automatically extend the choice function (as you automatically extend unions, or pairs), so it is not invariant under the process of extension. The only two concepts in ZF which are not invariant under extension in this way are powersets and choice, and I don't think choice is the main issue, rather the powerset.

Then the "powerset" becomes a proper class worth of different objects, each of which become the powerset in a different model of set theory. The powerset of the powerset is the collection of powersets of the different set-instantiations of the powerset. You can think of this as a meta-theory of different forcing universes, allowing them to coexist, where the powersets are never fixed once and for all, and different properties are true at different snapshots of the universe, by choosing different sets to represent powersets. This is pretty much how people intuitively view switching between different forcing models in set theory anyway. You just step up different countable models by adjoining new random elements and a map from a set which identifies the labels for how you introduced these elements.

The advantage of this is that you can try to define measure theory independent of the sets, on the "true" power-object, rather than on any of the little sets that pretend to be this object. Then you can define probability on the "true" reals, and you can interpret the choice theorems as only applying to sets modelling the reals. It's just a stupid ontology, but this makes a difference for human beings.

@RonMaimon The content of my previous comment was rather different from what you say. You are thinking about very elaborate constructions, that may be interesting nonetheless; I am talking about a very simple observation.

The Solovay model and the ZFC one are, concerning some assertions, in contradiction; in addition, some theorems of ZFC are not provable (or maybe false) in the Solovay model. This is a fact, and cannot be disputed. Another fact is that physical descriptions of the world do not use only probability theory and path integrals, but also very different mathematical constructions (and there is an equivalent, from a physicist's point of view, formulation of QM and QFT that does not use path integrals, so one may argue whether that point of view is an unavoidable physical requirement, or just a convenient tool). I strongly believe that it would be very desirable that one could describe the physical world by means of a single coherent mathematical model, and not with two (or many, or infinitely many) that are in contradiction, where one has to choose the appropriate one each time to prove a result (in a setting where in the meantime other results are false).

Now, if you believe that the correct model should have all sets of reals measurable and so on, it is your choice, and of course you have to be able to develop your theory in a consistent fashion, and provide the necessary results to describe the "world". Mathematicians do that using ZFC, and so are in contradiction with you, but not with themselves. And I am sure that many results of the probability theory that is done by mathematicians around the world using ZFC are correct, and take into account the difficulties implied by the axiom of choice.

And if you prove a theorem in a model, that theorem is true, and that is by the very definition of proving a theorem. Given ZFC, and the Solovay model (or any other model you want), it is up to physicists to choose the one that has the theorems that are better suited for their purposes, but they have to choose one nonetheless, for having multiple theories to describe one world is logically contradictory (and for human ontology, as you call it, unacceptable). So my suggestion is: arguing about philosophy is very interesting as a hobby (and I enjoy discussing with you), but if you would like your point of view to become a valid mathematical alternative you should make axioms and prove new theorems (within your framework, and that means you have to redo most of the proofs). If the proofs are mathematically acceptable, those theorems by themselves would be of interest in mathematics. To make physicists (mathematical physicists at least, who use mathematical rigor) shift from the model of ZFC to your model, you have to prove a huge amount of theorems, namely all those that they use in physical modelling. Otherwise, they will stay with the well-established and very powerful ZFC, for many many results are already available there.

@yuggib: I did not misinterpret your comment--- your new comment is saying the same thing "mathematicians are used to ZFC and don't want to change". This is an argument from tradition, and it fails, because tradition is failing terribly with measure theory.

In current mathematical practice, there are two completely incompatible conventions, both deeply embedded. It is conventional to assume the axiom of choice is universally true when doing set theory, and it is also conventional to simultaneously, inconsistently, assume, only when doing probability, that random variables can be spoken of as taking on values. This latter manner of speaking is so intuitive and convenient for describing arguments that it is impossible to get probabilists to stop doing it; that's what they are doing in their heads to produce and internalize theorems, and that's how they speak about these theorems when they prove them. But the disconnect with set theory means that any such proof in probability cannot be made rigorous as it stands, but needs a relatively heavy slog to turn it into a statement of a different kind about sets, which does not at any point speak about random variables having actual values, because they can only be spoken of as members of a subuniverse of measurable sets. This procedure is extremely time consuming, it makes a disconnect between probability discussions and rigorous proofs, and because what you are doing in your head is completely different from what you are doing in the formal procedure, it is error prone, and can produce false proofs. The results are usually true anyway, because there is nothing really wrong with imagining random variables have values in a Solovay model.

In practice, it is very easy to get people to shift underlying frameworks, after you do a certain amount of work, because people don't use frameworks, they use theorems. The framework is like the operating system, you can invisibly insert a new one in when you can emulate the previous system, that is, when you include the usual system inside.

The political problem with Solovay's model, as you said, is that it is incompatible with some of the theorems mathematicians are used to, it doesn't immediately include the old system inside. This makes people wary, because they don't know what is true and what is false regarding the model anymore. But the old theorems are still valid in a more restricted sense, using dependent choice, and you need to keep track of this anyway, for probability.

But mathematicians would still like to know what the Hahn-Banach theorem means in an expanded sense, or the prime ideal theorem. I am saying that you can interpret them in an expanded sense also, even in an imagined universe where all subcollections of R are measurable. But you need to be careful to talk about power-sets as incomplete collections. This is useful for forcing also, because right now, set theorists don't have a consistent single picture of the universe that they can work in; they use different pictures for different models of ZFC.

When you switch foundational stances, you don't have to reprove all theorems, you just have to show how the old system embeds in the new, and there should be no difficulty in embedding the old in the new when you do things right. The Solovay model should have been close enough, but it hasn't been. I don't believe it is feasible to convert the arguments regarding path integration into theorems if it is not immediately apparent how to deal with infinite collections of random variables without contradiction. Saying "just pretend they don't have values" doesn't work for path integrals, because the arguments are too elaborate and require using properties of the distributions you produce in the limit. You can always limp along, but mathematics progresses not so much by finding better formal methods as by making intuitive shorthand for methods that can be made formal, so that more complicated formal methods become easier to describe. I don't see any hope for the current probability formalism, as it will never allow you to speak about randomness naturally.

@RonMaimon What I am saying is that mathematicians are used to ZFC but they may change without much effort, for the mathematical results are theorems proved within a given logical system (that most of the time is ZFC but not always, the important thing is that it is clear what it is). The point is what physicists would like to do, which model they consider more fit to their needs to model reality.

And what you say about "embedding" theorems from one model to another, this is simply not true. Given a theorem of ZFC, it is true that a weaker version may hold in ZF+(some weaker choice), but it is not assured (in the book about the axiom of choice by Herrlich you can see explicit counterexamples). There are theorems that, no matter how weakened, cannot be proved in ZF+(some weaker choice). And the Solovay model is not just ZF+DC+(large cardinal), but a model where some true statements of ZFC are false. There is obviously no hope of embedding them, for they are false! And it is very difficult to keep track of what is false in these "forcing models" that are in (partial) contradiction with ZFC: in fact, even though they were introduced more than 50 years ago, and people are well aware of them, no "embedding rule" has been produced (while for example there is one in quantum logic and quantum set theory w.r.t. ZFC, and that theory is far more recent). Finally, take into account that when dealing with advanced mathematics it is already very difficult to understand if a theorem of ZFC is true in simple ZF+DC (the equivalence between a theorem and the axiom of choice or some of its weaker versions has been proved only in a few instances; you can see many of them in the book I mentioned before); and I repeat that a given theorem may be not only unprovable but false in the Solovay model.

To finish the discussion (at least by my side): I agree that these models and possibilities are intriguing and interesting; simply for now they are nothing more than a mere curiosity for mathematicians, and not so useful for physicists because so many important results are lacking in this context. And I am not the advocate of ZFC, first of all because it does not need me as an advocate, and also because the logical models are more a language to do proofs than entities by themselves in my opinion, so I don't think one is better than the others, but just different.

@yuggib: What you are saying is what mainstream non-logician mathematicians believe. I disagree completely, and I believe even a small amount of experience with forcing will change your mind as well. I recommend the recent book "Forcing for Mathematicians" by Nik Weaver, it is the best treatment of the subject I have read; it follows Cohen closely, except incorporating the insights of the next 40 years. It is the only really nice treatment since Cohen, and this might be sacrilege, but I think it might even be better than Cohen.

You don't throw away any theorems when you change your model using forcing, you keep all the theorems, but you change the interpretation, so they apply to a different domain of discourse. I agree that mathematicians don't know how to do this, but it is because there has been no unified presentation of a formalism which makes it easy to do. Nik Weaver's ZFC+ is a tiny step in this direction, but this is what I think one should do: make a unified formalism that makes it easy to switch set models at will, because this is what is needed now. It is in practice impossible to ask a human being to do probability in a set theory that insists that 100 mathematicians can predict the unknown real number hidden in a box they haven't opened with a success rate of 99/100 (see here: http://math.stackexchange.com/questions/371184/predicting-real-numbers ). This is just false in every practically useful meaning of the word "false", and there's no ifs, ands or buts about it. It is not only an intuitive fact, but a mathematical fact, that you can put unpredictable numbers in boxes, and any idealization of infinity that doesn't let you do this is broken beyond repair.

As an example of interpretation shuffling: if you have a theorem, say "Every real can be expanded in a basis when R is considered as a vector space over Q", this theorem obviously becomes false in the Solovay model, when it is stated for all the reals. But it is still true for a subcollection of the reals, namely the reals in the old model you started with (assuming that old model obeyed choice), which is still sitting in there! You can interpret the theorem as applying to those old-model reals, and not applying to the new Solovay reals, and the reals produced from those. The toggling in forcing is just adding certain generic elements to power-sets, it's a process of adjoining, so it doesn't make the theorems false, they just hold for the subcollections from before you did any adjoining. This is not natural to say in normal set theory, because you don't think of different sets as capable of modelling the reals, you don't think the reals can change around by someone forcing in new ones. But logicians got used to this already a long time ago.

Because it's all interpretation switching, the problem of switching foundations is mostly an issue of pointless philosophy, rather than mathematics. The only reason to go through the trouble is if you get something out of it, and at the moment, not many people see that you get out more than you put in (some logicians do).

In the case of mathematical physics, reliant on probability as it is, you gain an enormous amount, because there are natural arguments that are difficult to make rigorously in the current formalism, precisely because of these measure theoretic headaches. I will give some simple examples:

1. Free fields. To define a free field, you just pick a Gaussian random real for each $k$ with appropriate variance $\frac{1}{k^2+m^2}$, and Fourier transform (a toy lattice version of this recipe is sketched after this list). That's it, at least in the Solovay model. The map takes the complete Lebesgue measure to a measure on distributions which is the free-field measure. I'll leave you to check the mathematics literature for the construction of free fields, and I assure you that that's not it in the mathematics literature, because you have a huge headache to show that you can construct a sigma algebra on the space of distributions, and that all the maps are measurable, even though you used no uncountable axiom of choice, nothing, so it should be automatic. See references in Sheffield's work for the construction.

2. There is a very useful and very fundamental notion in probability of "coupling". You can define a certain coupling as a process of taking two random walks, say two Ising models under certain dynamics, and making them coalesce by sticking together when they meet. To do a coupling in probability, you usually take two initial configurations, and show that they collide after finite time. If you prove this for all initial configurations, you are done, you have shown that the walks couple. But in standard mathematics, when the domain is infinite (as for example in Hairer's considerations regarding coupling PDE's), you can't do this at all (even though the results are still true). Proving that something holds for all given "instances" of a random walk doesn't tell you anything at all, precisely because the notion of "instances" is not compatible with the axiom of choice.

For example, use the axiom of choice to produce a list of configurations which are primitive for translation (meaning one configuration for each translation equivalence class of Ising models). Now for any one configuration of the Ising model, I can define a translation dynamics which takes it to the primitive configuration in a finite number of steps of translation. Does this mean I can define a dynamical process which translates any randomized Ising configuration into a primitive configuration? No! Absolutely not. In standard mathematics, this property of being a finite distance away from a primitive configuration is true only of _instances_; it can't hold for random Ising model configurations, it doesn't make sense to ask it about random variables.

3. In the Jaffe or Balaban constructions of gauge theory, there is a notion of cluster expansion which is entirely uniform on infinite volume lattices, meaning the cluster expansion doesn't care whether the volume of space is large or infinite. These cluster expansions, however, are only defined in finite volume in Balaban's work, because there are "technical issues" with infinite volume lattices. These technical issues are issues of measure, and they disappear when you don't have to worry about measure paradoxes.
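To make example 1 concrete, here is a minimal numerical sketch of my own (not from this thread, and not the construction in Sheffield's papers): a massive Gaussian free field sampled on a periodic lattice by drawing white noise and filtering it by $1/\sqrt{k^2+m^2}$ in momentum space. The lattice size, mass and dimension are arbitrary illustration values.

```python
import numpy as np

def sample_free_field(N=64, m=1.0, d=2, rng=None):
    """Draw one sample of a massive Gaussian free field on a periodic N^d lattice.

    Sketch only: white noise is filtered by 1/sqrt(k_hat^2 + m^2) in momentum
    space, giving a real Gaussian field with covariance (-Delta_lattice + m^2)^{-1}.
    """
    rng = rng or np.random.default_rng()
    noise = rng.standard_normal((N,) * d)               # iid N(0,1) at each site
    k = 2 * np.pi * np.fft.fftfreq(N)                   # lattice momenta
    grids = np.meshgrid(*([k] * d), indexing="ij")
    k2 = sum(2.0 - 2.0 * np.cos(ki) for ki in grids)    # lattice |k|^2 eigenvalues
    phi_k = np.fft.fftn(noise) / np.sqrt(k2 + m**2)     # variance ~ 1/(k^2 + m^2)
    return np.fft.ifftn(phi_k).real                     # real up to roundoff

phi = sample_free_field()
print(phi.shape, phi.std())
```

And a toy version of the coupling in example 2, for two walkers on Z rather than for Ising dynamics (again my own illustration): before the walkers meet, exactly one of them moves at each step, so their difference performs a simple random walk, which is recurrent on Z, and the walkers coalesce with probability 1.

```python
import random

def coalescence_time(x0, y0, t_max=10**7):
    """Couple two walkers on Z: at each step one of them (chosen at random)
    moves by +/-1; once they meet they are glued and move together.  The
    difference x - y is a simple random walk, recurrent on Z, so the meeting
    time is finite with probability 1 (although its mean is infinite)."""
    x, y = x0, y0
    for t in range(t_max):
        if x == y:
            return t
        if random.random() < 0.5:
            x += random.choice((-1, 1))
        else:
            y += random.choice((-1, 1))
    return None  # did not coalesce within t_max steps (rare for small |x0 - y0|)

print(coalescence_time(0, 7))
```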

I could go on all day, but I can assure you that the loss from rejecting those theorems that require uncountable choice is not only negligible, it is negative: you actually gain precision by being conscious of which theorems use it. The gain is entirely on one side, and the loss entirely on the current side. This is why I will never change my mind.

@RonMaimon: The problem is that once you change the interpretation as you want to, the old ZFC theorems no longer apply to your random variables, which are outside the protected subdomain in which the ZFC proofs remain valid. Thus you have to reprove every result you use on your random variables, since the ZFC proofs don't apply to them. This is a lot of work, and is (I guess) what @yuggib meant. One uses without thinking a huge number of facts that all become uncertain to any ZFC-based mathematician when using them on your random variables! Unless you can tell us where they are proved on your new basis.

So far all your arguments amount only to handwaving that there should be no problems in reproving the part needed outside. And as you can't refer to a source where all this is actually reproved for the instances you need to define and use your constructions in a real physical context, it's just conjecture, not fact.

@ArnoldNeumaier: The notion of "random variable" is not something I made up; it is defined standardly in mathematics, usually as a measurable function from the interval to the set X where the random variable lives. Mathematicians use random variables all the time, and they imagine these are "random picks" from the set X, so they often view the random variable as having an actual value x in X, whatever that x happens to be.

So mathematicians already have to keep track of which theorems use choice and which don't, because when they pretend the random variables have precise values on an infinite set, do things to these precise values, and then work backwards to find the new random variable describing the result of the operation, they get burned if they used a bad theorem: the operation produced a non-measurable map, and the result is no longer a random variable. So probabilists already need to know exactly which operations on X produce non-measurable sets, because these operations are forbidden on random variables.

So the difficulty is already there, just everyone basically ignores it! People use whatever operation they want on the value of a random variable, and only if they hit a contradiction do they go back and say "oh, yes, this produced a non-measurable set". It is not so easy to avoid the contradictions, this is why probability is a pain in the ass.

The point of a measurable universe is that every function is measurable, so all theorems that are true of all x in X are then also automatically true of all random variables x taking values in X. So the theorems you prove in a Solovay universe are universally applicable to everything, deterministic and random objects both.

But the problem is indeed that it is uncomfortable to speak about a universe where you aren't sure what theorems apply. And also, I agree that the translation procedure is time-consuming and a good formalism for translating theorems to different universes has not been constructed. But how the heck is anyone going to define a path-integral rigorously when you can't even speak about a randomized Ising model configuration on Z_2 without worrying that you will hit a paradox? That's what people are struggling with right now. Maybe they'll do it, but it will be a 1,000 page nightmare instead of a two page proof.

@RonMaimon: As you emphasized, the ZFC theorems are only valid for a subset of the reals in your model, while the remainder are random variables, to which you therefore cannot apply any of the ZFC theorems available in the literature. You need a new set of theorems about what is valid for these non-ZFC numbers. If you simply use ZFC theorems for these, you commit a logical fault. And to reprove everything in your setting will be a 1,000 page nightmare instead of a two page proof.

By the way, mathematicians have no difficulty talking with full rigor about random processes on an Ising model, and proving results about these. It is quite easy to keep everything needed measurable.

@ArnoldNeumaier: First, I personally don't have a model. Solovay made a model a long time ago. I would like to make another model, because the first one is not easy to understand. The comment about reproving theorems is nonsense. You don't have to reprove anything; you reinterpret the content depending on the argument. Since everything practical is proved from dependent choice already, and you hardly ever make a continuum of choices in a normal proof, you get the stuff physicists normally use for free. The few things which aren't proved from dependent choice are true in a restricted domain, and you need to keep track of the domain of applicability in a Solovay universe. But the point I am making is that you need to keep track of these theorems anyway, because as a probabilist, these are the theorems you can't apply to random variables. The combination of the fact that nobody really uses continuum choice, and that probabilists already stay away from choice by upbringing, is why you don't see measure paradoxes all that often. But they're still there, making the probability arguments difficult. The problem didn't go away. And it makes it extremely difficult to construct a path integral, because you can't speak naturally about the simplest constructions; you need someone to certify you to speak about a free field, which is just a random lattice of Gaussian random variables.

When you introduce a good foundation, you get to keep all the old theorems that are intuitively true, and you only reject the ones that are intuitively false. Probabilists don't go around proving intuitively false theorems (at least not in a way that contradicts probability intuition). A good foundation allows you to see automatically when a theorem can be made rigorous. The gain from a different foundation is that you can easily say things you couldn't before, and constructions become intuitive, automatic and simple.

You think it is easy to speak rigorously about Ising models because there is a large literature on Ising models involving lots of people. The problem is that there are operations on Ising model configurations which don't make sense in traditional set theory when applied to random configurations, and it is very easy to construct them; I gave an example above, with translationally primitive configurations. The people who work with Ising models simply don't consider such constructions, and they never will. But they could run into one of them if they used theorems people proved in other contexts about Ising model configurations (though there aren't any), and they would get burned every once in a while.

The reason measure paradoxes don't come up is by convention and by ignoring the problem. They are a barrier to rigorous physics, and have been since they were first constructed.

@RonMaimon: You may think it is easy or trivial, but it isn't, unless one checks for every result used the whole chain of proofs going into it, to find out whether it can be proved without full choice. This is easily a thousand pages of work, and needs special concentration from people who usually use choice without thought.

I know of no source of real analysis and functional analysis that would tell me which statements are provable using countable choice in place of full choice. Without such a source (one that has existed long enough that enough people have checked it for accuracy), even the simplest statements become uncertain. Even the existence and uniqueness of the field of complex numbers is no longer clear to me, for at the time I studied the various proofs I didn't mark choice as special, hence have no memory of how often and how essentially choice is used.

Your claim that one hardly ever uses the full power of the axiom of choice is just a claim; without a reference where it is actually verified it carries no force, especially since there are exceptions and nowhere a full list of exceptions.

No one apart from you is willing to proceed on such an uncertain basis.

@ArnoldNeumaier: Verifying which theorems depend on full choice and which are true using dependent choice is very simple when you are used to it, and it has already been done by lots of people, including me, even if you don't bother to do it. It is just not true that people don't bother to keep track; lots of people do. People who do probability absolutely need to keep track of the theorems which use uncountable choice, because these theorems produce non-measurable functions, and if you don't know this, you either have to reprove every theorem for random variables using a separate argument, as is traditional, or you will produce probabilistic contradictions. The probabilists don't produce contradictions because they "feel" which theorems use continuum choice, because these theorems produce unreal dusty collections, and if you just avoid any collection you can't visualize you are not using choice. But this is not a good method, as it doesn't formalize the way to avoid the wrong theorems.

There are very few results which use anything more than dependent choice, and I am not at all the first person to say this. Solovay and others noted this in the 1970s, and this is exactly why Solovay's model is so exciting, it allows the development of measure theory and the appropriately useful part of functional analysis within a set theory which is compatible with probability. The existence and uniqueness of the real and complex numbers doesn't even use dependent choice, as real numbers are defined as Dedekind cuts of rationals. Choosing a sequence from an arbitrary sequence of nonempty sets requires dependent choice. Choosing an ordinal sequence of uncountable length from an uncountable sequence of nonempty sets requires the full axiom of choice. Whether you personally kept track of uses of continuum choice is not important, as others do so, because they know the uncountable choice theorems are not true in situations involving random variables. This is why choice is singled out as an axiom and people grumble about it. 

I can't convince you until you see a good new system. I will post it in a self-answering question. You will see how it works, and how different it is regarding probability constructions. I am pretty sure I have it now. I don't think it is easy or trivial, I think it is relatively difficult, otherwise it would already have been done. Whether other people are willing to follow is not important to me, I want a system for my own use in proving theorems, not for other people.

You can't convince me until you have shown me a book which explicitly does it. Otherwise each researcher has to do it again from scratch to make sure which ones are the exceptions. 

When I am using random variables I can use their conventional definitions as functions and can apply theorems about their realizations, which are ordinary numbers. This is enough to do everything I ever needed to do working in stochastic processes, and I am sure most probabilists have the same attitude. This is fully rigorous. 

@ArnoldNeumaier: The problem with the "realization" language, as formulated in standard ZFC measure-theoretic probability, is that there are operations you can apply to "realizations" in ZFC set theory which cannot be applied to the random variables themselves, because they no longer are random variables once you apply the operation (the operation takes measurable functions to nonmeasurable functions). Since you can't apply the same operations to random variables as to realizations, it is incorrect and non-rigorous to view the realization as corresponding to the intuitive concept of "a representative pick from the random event" when you had an infinite sequence of random events: if it really corresponded to "what happened in a particular event", you would be able to apply an arbitrary operation to this final result.

The example I gave is very simple: you have two Gaussian random variables $x$, $y$, and $z = (3x+4y)/5$. In any "realization" (in the ordinary meaning), the value of $x$ can be decomposed into a basis when R is considered as a vector space over Q, and $y$ can be decomposed into the same basis. So there are numbers $m(x)$ and $n(y)$ of basis elements appearing in the decompositions of $x$ and $y$. But the functions $m(x)$ and $n(y)$ don't make sense as operations on the random variables; they only make sense as operations on the "realizations". So the realization can never really be realized: the operations on the random variables are restricted.

The restriction of the operations defined on random variables to a strict subset of the operations defined on the realizations leads to mistakes, and it is a very awkward formalism. The reason is that the intuition says you complete the random process and produce a realization, and this intuition leads you to think that you can then apply any operation, and that the result "must be" a new random variable which is the result of this operation. But in set-theoretic probability, this only applies to measurable operations. This means you need to single out those processes which produce non-measurable operations and keep them in mind, as these you can't apply to realizations, at least not while thinking you can apply them back to the random variable.

People tend to ignore this annoyance all the time, because it doesn't accord with the intuitive picture, and because the nonmeasurable operations hardly ever show up in practice, because nobody does probability and uses uncountable choice at the same time. So probabilists pretend that any operation they can do on realizations is ok to consider being done on random variables. This is not a rigorous viewpoint. The operations which are ok on random variables only is the same as the operations which are ok on real numbers in a model where every subset of R is measurable.

This sounds like an abstract concern, but it isn't. I gave several examples of operations you apply to realizations which don't make sense to apply to random variables. This is a monstrous thing, and it can be avoided, although nobody gave a reasonable formalism to do so in all cases, not even Solovay.

One can (and does) rigorously apply to random variables everything that one is allowed to apply to measurable functions, and gets in this way a fully consistent ZFC-based theory of stochastic processes. With any Borel measure, continuous functions of measurable functions are measurable, so everything of this kind can be applied rigorously to most things of interest without special care. Care is needed only when applying operations with discontinuities, and then care is indeed taken by those who want to be fully rigorous.

That you find this way of proceeding awkward is your personal preference; others like me prefer the rigor it offers without having to rethink (and recheck) all classical mathematics in terms of nonstandard foundations, and the fact that the results can be easily communicated to fellow mathematicians.

@ArnoldNeumaier: That's the ZFC way. It is very cumbersome, and you only realize it is suboptimal when you have something else to compare it to, where you never have to worry about "non-measurable transformations". One can apply any operation to the phony ZFC "realizations", but they don't lift back to operations on random variables. The only operations that make new random variables from old random variables are the measurable ones. This makes an impossible barrier between intuitive arguments, where you are allowed to pick as many random variables as you like and do whatever you like with them, and rigorous probability, where you are only allowed to apply measurable transformations to the random variables; and the two concepts "any transformation" and "measurable transformation" only coincide for random variables defined on a finite or countable set, i.e. when you are choosing integers at random, not random real numbers or random Ising model configurations.

In reality, anything which is not constructed using uncountable choice is perfectly fine to apply to random variables, and the result is a new random variable. Anything at all, except for counterintuitive monstrosities which are already repugnant to the intuition and best considered nonexistent. The reason they are repugnant to intuition is precisely because our intuition for real numbers includes random choices from the interval [0,1]; this is the only conflict between the axiom of choice and intuition, and every other intuition failure can be traced to it.

But notice that the only transformations you say you are comfortable with are continuous ones! This is such a tiny subset of "all transformations" that it's like saying in Greek times that you are comfortable with a result on real values only for rational values, and not for irrational ones. Measurable transformations are much closer conceptually to "every transformation" than to "continuous transformations" or "continuous transformations with controlled singularities", despite the ZFC intuition.

This restriction comes from a clash between the philosophy of infinity in ZFC and the intuition people use for probability, and this makes for enormous complications when you are dealing with a path integral. In defining a path integral, you are expected to be able to (at least) generate countably many random real numbers and do whatever you like with their values, anything you can define, and the result of whatever you did then automatically defines a path integral on the resulting space you map the values to. For example, I'll give the proper intuitive definition of the probabilist's Wiener chaos, the physicist's random noise field $\eta$, which is defined as a distribution on $R^d$, i.e. a map from any test function to a (Gaussian) random variable, which can be viewed as the integral of the test function against a random Gaussian real of size $\epsilon^{-d/2}$ on an $\epsilon$ regularization of $R^d$.

An intuitive physicist definition: for each point of a lattice in $R^d$, generate a Gaussian random real number; then, on a lattice twice as fine, generate $2^d$ Gaussian variables per coarse cell whose sum agrees with the obvious projection from the finer lattice to the coarser one, and continue countably many times, through lattices which are each twice as fine as the last. This single countably infinite collection of Gaussian random variables defines variables whose correlation between any two distinct lattice points vanishes, and whose sum over any ball is convergent. It therefore defines a map from any test function to a Gaussian random number, defined as the limit of the sum of the lattice values times the function values as the lattice becomes fine. Since this is convergent with probability 1, it defines a map from test functions to reals, and therefore a random distribution, and this is what $\eta$ means.
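A minimal numerical rendering of this (my own sketch, in one dimension and on a single fixed lattice rather than the full refining sequence; the test function and the scale $\epsilon$ are arbitrary): the regularized noise is an independent Gaussian of standard deviation $\epsilon^{-1/2}$ at each site, and pairing it with a test function gives a Gaussian whose variance is essentially $\int f^2$, independent of $\epsilon$, which is the sense in which the limit defines a random distribution.

```python
import numpy as np

def pair_with_noise(f, L=1.0, eps=1e-3, rng=None):
    """One sample of <eta, f> for lattice-regularized white noise on [0, L].

    Each lattice site carries an independent Gaussian of standard deviation
    eps^{-1/2} (the d = 1 case of eps^{-d/2}); the pairing is the Riemann sum
    of f times the noise.  Its variance is sum f(x)^2 * eps, i.e. roughly the
    integral of f^2, independent of the regularization scale eps.
    """
    rng = rng or np.random.default_rng()
    n = int(round(L / eps))
    x = (np.arange(n) + 0.5) * eps                 # lattice sites
    eta = rng.standard_normal(n) / np.sqrt(eps)    # std eps^{-1/2}
    return np.sum(f(x) * eta) * eps

f = lambda x: np.exp(-(x - 0.5)**2 / 0.02)
samples = [pair_with_noise(f, eps=1e-3) for _ in range(5000)]
print(np.var(samples))   # ~ integral of f^2 over [0,1], independent of eps
```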

To make this argument rigorous should be trivial: you should be able to make the argument exactly as above, or with minor language modification, and it should be considered rigorous as is, because there should be a main theorem somewhere which says "any function which takes countably many random reals and produces a distribution from their values with probability 1 defines a measure on the space of all distributions". That is, "any function from the realizations of random variables to distributions which, with probability 1, defines a random distribution, and so lands in the space, defines a measure on the space of distributions". With this fundamental theorem, that's it. You're done with the construction of $\eta$, and now you can focus on what you are interested in, namely the solutions of stochastic differential equations with random noise, using the realization of $\eta$, or you can use Levy flights to generalize the construction trivially, and so on. There is no difficulty at all; this intuitive construction is rigorously correct.

But there's no way to do this in ZFC, because the fundamental theorem is false in ZFC. It is not true that any function from realizations of random variables to distributions defines a measure on the target space of distributions, because among these functions there exist nonmeasurable ones! The desired proof considers the realizations, the values of the random variables, then does arbitrary stuff with them, using all the properties of the realizations which are true with probability 1, and then lifts the results to define a new random variable. This is disallowed in ZFC, because you need to prove that the resulting function from the realizations to the distributions is measurable. How the heck do you prove that? It is a headache! You don't have the space of distributions a priori, with its measure already given and its open sets already known; you construct these distributions from the probabilistic construction I gave, and it is annoying as all heck to have to work backwards from the space of distributions (which you don't know a priori, you figure it out from which test functions produce consistent results in the limit) and show that the map from realizations to distributions is measurable.

So the intuitive argument is currently disallowed in mathematics, unless it is supplemented with a complicated song and dance that exposes the guts of the set-theoretic machinery underneath. If you present this construction, it will be called "nonrigorous" or "intuitive", and this is the most elementary random construction there is. This construction is exactly what Einstein and Smoluchowski had in mind for Brownian motion, or Parisi had in mind when doing stochastic quantization. You can easily prove the properties of Brownian motion from this construction, if you use the values of the realizations and the explicit limit, but the set theory forces you to focus on the most irrelevant things, like the open sets on the space of distributions and their countable unions and intersections.

I simply can't accept any proof of the existence of Wiener noise which is substantially different than the construction I gave above. The intuition for Wiener noise is the construction above, and there is nothing logically wrong with it, nor is there anything logically wrong with any other construction which does countably many random picks from the real number line. The Solovay model allows you to say it this way, because the Solovay model allows arbitrary countable constructions and makes every map measurable. But that's not an optimal solution, I think, because Solovay's model is also intuitively wrong in other contexts.

The point is that you need to be able to talk about constructing maps from realizations of random variables and then lifting the results to constructing a measure whenever you can prove that with probability 1, the construction lands in some set S. This theorem is alien to ZFC, because in ZFC, even maps from R to Z can't be lifted from realizations to random variables, there exist nonmeasurable maps from R to Z in ZFC, like the number of basis vectors you use in a decomposition of R over Q.

@RonMaimon: The Solovay model allows you to say it this way.

But to call it rigorous requires that someone has first reliably (and with the agreement of the mathematical community) checked that thousands of other little arguments needed to build a complete theory of stochastic processes are indeed (as you claim without pointing to a proof) valid in the Solovay model. Until someone writes a book on stochastic processes where this is carefully done none of your above arguments is mathematically rigorous. 

Moreover, your arguments do not apply to constructions where the operations are truly singular - then you need to analyze the singularity even in the Solovay model. But quantum field theory is full of singularities, and it is these that cause the trouble, not the lack of measurability.

By the way, you consistently write Weiner in place of Wiener - please edit this in your posts to this thread.

@ArnoldNeumaier: Without the nonmeasurable sets, there just aren't thousands of little arguments to do for ordinary probability stuff; after the set-theoretic headaches are resolved, there's nothing difficult left. The construction of white noise, or of free fields, or of models defined by a precise Nicolai map, or of a stochastic equation requiring no renormalization, or of continuous limits of Levy variables (this is not done yet AFAIK), or of Ising model measures, or whatever (always excluding renormalization), is as straightforward as in elementary calculus. This simplification of ideas in intuitive probability is precisely why they put nonmeasurable sets on page one of every rigorous probability book: it's there to scare students away from intuitive probability, as mathematicians were scared in the 1920s, so that they think intuitive probability is logically inconsistent (false), just because it is inconsistent within ZFC (true).

Using the intuitive arguments, elementary probability is a piece of cake, which is why physicists were able to relatively easily and correctly develop sophisticated path-integral constructions of interest to pure mathematicians, usually without ever taking a single course in measure theory. It's not because the physicists are using "heuristics" or "imprecise constructions"; it's because the path integral belongs to intuitive probability, which is perfectly logically consistent, and mathematicians can't handle intuitive probability. The axiom of choice was just a gift from pure mathematics to the physics department; it means the mathematicians have to play catch-up for nearly 100 years.

The intuitive arguments in probability go like this. For example, to "prove" that all subsets of [0,1] are measurable: consider a set S in [0,1] and sample [0,1] countably many times. For each sample, determine whether it is in your set or not, and define the measure to be the limit of the number of random picks that land in S over the total number of picks. This argument is obvious, it works intuitively, and one feels that it "should" work as a rigorous argument in some way, but it can't be made rigorous in ZFC. It is circular in ZFC, because the definition of random variables is through measure theory, and the end result of translating the argument just becomes a justification for using the word "measure" as "probability". This non-argument then only works to show that measurable sets have the property that random variables land in them in proportion to their measure, or rather, more precisely (since only measurable sets can be "landed in", so the concept of "landing in" is not precise in ZFC), that the probability of the event that a random variable is an element of a measurable set S is equal to the measure of S. But this random-picking argument is not circular (or at least not obviously circular) when the definition of a random pick is by random forcing, and you have some sensible external framework in which to speak about adjoining new elements to R as you randomly generate them, and a good characterization of how to assign them membership in sets predefined by their generative properties (so that you understand what "the same interval" means in different models; this is an essential part of Solovay's construction).

Continuing in this manner, if you have any way of taking countably many random variables (in a sensible intuitive probability universe) and forming a convergent sequence of distributions, you define a measure on any set of distributions which includes the image of the random variables with probability 1. The measure of any set X is just the probability that the construction ends up in X, which can be defined by doing it again and again and asking what fraction of the throws land in X. This type of construction obviously doesn't qualify as a rigorous argument when there are nonmeasurable sets around, but it is completely precise when there aren't.
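For sets that are measurable, this sampling picture is exactly what a Monte Carlo estimate computes; here is a quick sketch of my own, with an arbitrarily chosen subset of [0,1].

```python
import numpy as np

def empirical_measure(indicator, n_samples=10**6, rng=None):
    """Sample [0,1] repeatedly and return the fraction of samples landing in
    the set.  For a measurable set this converges to its Lebesgue measure by
    the law of large numbers; the argument in the text is that the same
    sampling picture is intuitively meaningful for *any* subset of [0,1]."""
    rng = rng or np.random.default_rng()
    u = rng.random(n_samples)
    return float(np.mean(indicator(u)))

# arbitrary example set: points whose first decimal digit is even
in_S = lambda x: (np.floor(10 * x).astype(int) % 2) == 0
print(empirical_measure(in_S))   # -> about 0.5
```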

Each separate construction is one simple argument which requires no effort to remember or reproduce. It's like when you develop calculus, you don't need a separate argument with estimates for Riemann sums for the integral of 1/(x^2+1) as opposed to 1/x. You use a unified formalism.

For me, I don't care whether a construction is certified by a community as rigorous. I call a construction rigorous when there is a reasonable formal system which corresponds to some model of a mathematical universe, an equiconsistency proof of that system to a well-accepted one or a reasonable reflection of it, and a sketch of a proof that the formalization would go through in this system. The system I have in mind is not really Solovay's, because Solovay is complicated. But let's take Solovay's to start, because it works for this type of stuff, ZF+(dependent choice)+(Lebesgue measurability of R). Also it exists for sure, and is known to be equiconsistent with a well-established relatively intuitive large-cardinal hypothesis that is in no way controversial (possibly only because Grothendieck Universes are not controversial now, maybe because people get large cardinal hypotheses better, I don't know). I don't say this is the optimal model because Solovay's model is complicated to construct, and complicated to prove stuff about, because of a silly issue in the construction regarding collapsing the inaccessible, which is only required to be consistent with the axiom of power set.

I prefer a completely different system, which I just made up, which replaces the axiom of powerset with a different, to me more intuitive, thing. The basic idea is to take all countable models of ZFC (normal ZFC, including powerset), and speak about the universe as a container set theory for all these countable models, where the powersets are all proper classes containing all the powersets of all the different models. Then the notion of measure is for the class, not for the sets (although there is a notion of measure on the sets also, as they are consistent with ZFC). The ambient theory has the axiom of choice, but it doesn't have an axiom of powerset, so it doesn't lead to measure paradoxes. It is just a theory of countable sets, so there is no issue with choice leading to nonmeasurable sets. The uncountability of R is the uncountability of R as a class, not as a set. I'll explain it in an answer to a self-answered question, but since I know it's not already out there, I want to run it past a logician that hates me first, because they might tear it apart correctly.

@RonMaimon: I'll wait until you have written a thoroughly refereed book showing how to develop stochastic processes your way and with full rigor. The traditional way is not as bad as you think: thousands of mathematicians learnt the ropes, and it is a very thriving discipline.

@ArnoldNeumaier: Okay, fair enough, that's how you feel, but you must remember that thousands of physicists didn't learn the ropes, and they end up producing results of more value than the mathematicians who did. Those thousands of mathematicians are proving extremely primitive theorems compared to the state of the art in physics, and their methods are unreadable and arcane to anyone who doesn't learn their lore. I don't accept a situation where a good intuitive argument is used by thousands of people and is logically consistent within its own universe, but is not considered rigorous because of a conventional choice of universe. I insist that the convention must change to allow the intuitive argument to go through, and I will write whatever I need to make that happen, if I can find it within my limited strength.

That's exactly why there are two distinct theoretical communities in physics - the theoretical physicists and the mathematical physicists. 

The physicists you mention produce valuable results at the level of theoretical physics, but only conjectures at the level of mathematical physics. I have nothing against doing theoretical physics in a formal, nonrigorous manner, but it shouldn't be called rigorous before it is.

Conventions are a matter of social agreement, hence involve many people, not just one who is discontent with the existing conventions. If you want conventions to change you must write enough papers and books that demonstrate to the satisfaction of others in your target community that what you do is fully rigorous on the basis of the Solovay model. As mentioned already, it will probably mean 1000 pages rather than 20 paragraphs of handwaving. It is only this sort of insistence that changes conventions (and even then only in the long run) - continually writing papers that drive home the point, in a convincing way.

@ArnoldNeumaier: I do not accept that mathematical physics can operate within the current framework of probability. You simply have no idea how simple probability becomes in a universe where every subset of R is measurable, and once you see it, even a little, you can't go back. It's like unlearning calculus and going back to Riemann sums. The constructions in a measurable universe bear more of a relation to the arguments of theoretical physicists than to the arguments of mathematical physicists. It's the mathematical physicists who are doing things all wrong in this case, not the theoretical physicists.

Many mathematicians, perhaps a large minority, perhaps a majority, are already dissatisfied with the inability to speak intuitively about probability in ZFC. This is reflected in the set-theoretic paradoxes I linked, which you haven't read; these are all conflicts with intuitive probability (for example, here: https://cornellmath.wordpress.com/2007/09/13/the-axiom-of-choice-is-wrong/ ; related articles are "Axiom of Choice and predicting the future", and a linked MathOverflow question regarding an analogous, even more counterintuitive issue with mathematicians unjustifiably guessing one of countably many real numbers in boxes put in front of them). These probability paradoxes are the main intuition conflict that led to issues with choice. You can read grumbling about it all over the literature. Conway complains about non-measurable sets in his book on real analysis. Connes complains about the fiction of non-measurable sets in his book on noncommutative geometry. Lebesgue complained about them in the 1930s, until he died, and Cohen's forcing is accompanied by clear intuition straight from intuitive probability. Solovay's paper by itself was nearly enough to completely change the consensus in 1972; there was a (small) movement to do it in the 1970s, but it didn't gather momentum. And Solovay's paper did not run to 1000 pages; it was a dozen or two relatively difficult pages (which have been internalized and digested by now, so that they're easy).

Why didn't the "revolution" complete? Why haven't all the measure theory books been rewritten so that every set is measurable? I believe for two reasons--- first, Solovay politically sold out a little, and explained in his introduction that "of course, the axiom of choice is true", and then gave some pieties to support conventional wisdom. He was not seeking a clean break with set-theoretic measure theory, like Lebesgue, Connes, Conway, and everyone else who knows intuitive probability. He was motivated simply by the desire to find a nifty model of ZF+DC, not by the desire to give a better model of "mathematical reality".

The other reason is that Solovay's model is not exactly a model of naive probability either. Sure, every subset of R is measurable, and the foundations of real analysis and functional analysis go through, so you can pick countably many numbers in [0,1], pick countable sequences of random variables and lift results from realizations, so that you can do the ordinary stuff physicists do for a path integral, but there are also intuitions that fail. Uncountable ordinals don't embed in R, the theory is still ZF underneath, so the intuition from intuitive probability that R is enormously big, bigger than any ordinal, is not preserved in Solovay's model, it can't be in any model of ZF.

For the same reason, preserving the powerset axiom, his construction is extremely complicated to follow: it requires collapsing a whole universe to consist entirely of countable sets (this is a Levy collapse of an inaccessible to omega-1; it was explained to me recently what this collapse does), and this requires a large-cardinal hypothesis. The large cardinal is overkill for intuitive probability; you don't need it. You need it to make the powerset axiom true, because now the powersets and the uncountable ordinals are totally separate sequences (and you need powersets for the ordinals, and so on). This is the real headache: staying consistent with powerset.

The results of mathematical physicists are unacceptably weak, and unacceptably obfuscated. I will repeat: there is nothing wrong with intuitive probability, it is all just fear-mongering. There is nothing really wrong with the axiom of choice either; it's the axiom of powerset that is causing the difficulty, and this axiom is used to define R as a set, to define functions, and so on. The whole point of redoing the foundations is to make R a proper class, as advocated recently by Nik Weaver in his forcing book.

+ 0 like - 1 dislike

...He derives the path integral and shows it to be: $$\int_{q_a}^{q_b}\mathcal{D}p\mathcal{D}q\exp\{\frac{i}{\hbar}\int_{t_a}^{t_b} \mathcal{L}(p, q)\}$$

This is clear to me. He then likens it to a discrete sum $$\sum_\limits{\text{paths}}\exp\left(\frac{iS}{\hbar}\right)$$ where $S$ is the action functional of a particular path.

Now, this is where I get confused.

At this point I think it will be helpful to make an analogy with an ordinary Riemann integral (which gives the area under a curve).

The area A under a curve f(x) from x="a" to x="b" is approximately proportional to the sum $$ A\sim\sum_i f(x_i)\;, $$ where the $x_i$ are chosen to be spaced out from a to b, say in intervals of "h". The greater the number of $x_i$ we choose, the better an approximation we get. However, we have to introduce a "measure" to make the sum converge sensibly. In the case of the Riemann integral that measure is just "h" itself: $$ A=\lim_{h\to 0}h\sum_i f(x_i)\;. $$
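A tiny numerical illustration of the role of the measure h (my own sketch; the function and interval are arbitrary): the bare sum of samples grows without bound as more points are added, while the h-weighted sum converges to the integral.

```python
import numpy as np

def riemann_sum(f, a, b, n):
    """Midpoint Riemann sum: each sample f(x_i) is weighted by the 'measure'
    h = (b - a) / n, which is what makes the limit n -> infinity exist."""
    h = (b - a) / n
    x = a + h * (np.arange(n) + 0.5)
    return h * np.sum(f(x))

for n in (10, 100, 1000):
    print(n, riemann_sum(np.sin, 0.0, np.pi, n))   # converges to 2.0
```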

In analogy, in the path integral theory of quantum mechanics, we have the kernel "K" to go from "a" to "b" being proportional to the sum of paths $$ K\sim\sum_\limits{\text{paths}}\exp\left(\frac{iS_{\tt path}}{\hbar}\right) $$

In this case too, it makes no sense to just consider the sum alone, since it does not have a sensible limit as more and more paths are added. We need to introduce some measure to make the sum approach a sensible limit. We did this for the Riemann integral simply by multiplying by "h". But there is no such simple procedure in general for the path integral, which has a rather higher order of infinity of paths to contend with...

To quote Feynman and Hibbs: "Unfortunately, to define such a normalizing factor seems to be a very difficult problem and we do not know how to do it in general terms." --Quantum Mechanics and Path Integrals, p. 33

In the case of a free particle in one dimension, Feynman and Hibbs show that the normalization factor is $$ \left(\frac{m}{2\pi i\hbar\epsilon}\right)^{N/2}, $$ where there are N steps of size $\epsilon$ from $t_a$ to $t_b$, and N-1 integrations over the intermediate points between $x_a$ and $x_b$.
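Here is a hedged numerical check of how such a normalization works (my own sketch, not a calculation from Feynman and Hibbs): it uses imaginary (Euclidean) time so that the sliced Gaussian integrals are real and convergent, approximates each intermediate integration by a matrix product on a spatial grid, and verifies that the N-step sliced kernel, with one factor of $\sqrt{m/2\pi\hbar\epsilon}$ per step, reproduces the exact kernel for total time $T=N\epsilon$. All numerical values are arbitrary.

```python
import numpy as np

m, hbar = 1.0, 1.0
eps, N = 0.05, 20                        # N imaginary-time slices, total T = N * eps
x = np.linspace(-8.0, 8.0, 801)          # spatial grid for the intermediate integrations
dx = x[1] - x[0]

# one-step kernel: one factor of the normalization sqrt(m / (2*pi*hbar*eps))
K1 = np.sqrt(m / (2 * np.pi * hbar * eps)) * \
     np.exp(-m * (x[:, None] - x[None, :])**2 / (2 * hbar * eps))

K = K1.copy()
for _ in range(N - 1):                   # N - 1 intermediate integrations
    K = K @ K1 * dx

T = N * eps
exact = np.sqrt(m / (2 * np.pi * hbar * T)) * np.exp(-m * (1.0 - 0.0)**2 / (2 * hbar * T))
i0, i1 = 400, 450                        # grid indices of x = 0.0 and x = 1.0
print(K[i0, i1], exact)                  # the sliced kernel matches the exact one
```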

Again, quoting from Feynman and Hibbs regarding these normalization measures: "...we do know how to give the definition for all situations which so far seem to have practical value."

So, that should make you feel better...

This post imported from StackExchange Physics at 2015-05-13 18:55 (UTC), posted by SE-user hft
answered Mar 11, 2015 by hft (-10 points) [ no revision ]
