What's the probability distribution of a deterministic signal or how to marginalize dynamical systems?

In many signal processing calculations, the (prior) probability distribution of the theoretical signal (not the signal + noise) is required.

In random signal theory, this distribution is typically a stochastic process, e.g. a Gaussian or a uniform process.

The question is: what do such distributions become in deterministic signal theory?

To make it simple, consider a discrete-time real deterministic signal

$ s\left( {1} \right),s\left( {2} \right),...,s\left( {M} \right) $

For instance, they may be samples from a continuous-time real deterministic signal.

By the standard definition of a discrete-time deterministic dynamical system, there exists:

- a phase space $\Gamma$, e.g. $\Gamma \subset \mathbb{R} {^d}$
- an initial condition $ z\left( 1 \right)\in \Gamma $
- a state-space equation $f:\Gamma \to \Gamma $ having $ z\left( 1 \right)$ in its domain of definition such that $z\left( {m + 1} \right) = f\left[ {z\left( m \right)} \right]$
- an output or observation equation $g:\Gamma \to \mathbb{R}$ such that $s\left( m \right) = g\left[ {z\left( m \right)} \right]$

Hence, by definition we have

$\left[ {s\left( 1 \right),s\left( 2 \right),...,s\left( M \right)} \right] = \left\{ {g\left[ {z\left( 1 \right)} \right],g\left[ {f\left( {z\left( 1 \right)} \right)} \right],...,g\left[ {{f^{M - 1}}\left( {z\left( 1 \right)} \right)} \right]} \right\}$

or, in probabilistic notation

$p\left[ {\left. {s\left( 1 \right),s\left( 2 \right),...,s\left( M \right)} \right|z\left( 1 \right),f,g,\Gamma ,d} \right] = \prod\limits_{m = 1}^M {\delta \left\{ {g\left[ {{f^{m - 1}}\left( {z\left( 1 \right)} \right)} \right] - s\left( m \right)} \right\}} $
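The definition above can be sketched in a few lines of Python. This is only an illustration: the Hénon map used for $f$, the first-coordinate read-out used for $g$, and the names `simulate_signal`, `henon` and `first_coordinate` are assumptions chosen for the example, not part of the question.

```python
import numpy as np

def simulate_signal(f, g, z1, M):
    """Return [g(z(1)), g(f(z(1))), ..., g(f^{M-1}(z(1)))]."""
    z = np.asarray(z1, dtype=float)
    s = np.empty(M)
    for m in range(M):
        s[m] = g(z)   # output equation s(m) = g[z(m)]
        z = f(z)      # state-space equation z(m+1) = f[z(m)]
    return s

# Hypothetical example system (an assumption): Henon map on Gamma = R^2, d = 2.
def henon(z, a=1.4, b=0.3):
    x, y = z
    return np.array([1.0 - a * x**2 + y, b * x])

# Hypothetical output equation: observe the first state coordinate only.
def first_coordinate(z):
    return z[0]

signal = simulate_signal(henon, first_coordinate, z1=[0.1, 0.1], M=10)
```

Given $z(1)$, $f$ and $g$, the signal is fully determined: running the function twice with the same arguments returns the same samples, which is what the product of Dirac $\delta$'s expresses.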

Therefore, by total probability and the product rule, the marginal joint prior probability distribution for a discrete-time deterministic signal, conditional on phase space $\Gamma$ and its dimension $d$, can formally (symbolically) be written as

$p\left[ {\left. {s\left( 1 \right),s\left( 2 \right),...,s\left( M \right)} \right|\Gamma ,d} \right] = \int\limits_{{\mathbb{R}^\Gamma }} {{\text{D}}g\int\limits_{{\Gamma ^\Gamma }} {{\text{D}}f\int\limits_\Gamma {{{\text{d}}^d}z\left( 1 \right)\prod\limits_{m = 1}^M {\delta \left\{ {g\left[ {{f^{m - 1}}\left( {z\left( 1 \right)} \right)} \right] - s\left( m \right)} \right\}p\left( {z\left( 1 \right),f,g} \right)} } } } $

Should phase space $\Gamma$ and its dimension $d$ also be unknown *a priori*, they should be marginalized as well, so that the most general marginal prior probability distribution for a deterministic signal, the one I'm interested in, can formally (symbolically) be written as

$p\left[ {s\left( 1 \right),s\left( 2 \right),...,s\left( M \right)} \right] = \sum\limits_{d = 2}^{ + \infty } {\int\limits_{\wp \left( {{\mathbb{R}^d}} \right)} {{\text{D}}\Gamma \int\limits_{{\mathbb{R}^\Gamma }} {{\text{D}}g\int\limits_{{\Gamma ^\Gamma }} {{\text{D}}f\int\limits_\Gamma {{{\text{d}}^d}z\left( 1 \right)\prod\limits_{m = 1}^M {\delta \left\{ {g\left[ {{f^{m - 1}}\left( {z\left( 1 \right)} \right)} \right] - s\left( m \right)} \right\}p\left( {z\left( 1 \right),f,g,\Gamma ,d} \right)} } } } } } $

where ${\wp \left( {{\mathbb{R}^d}} \right)}$ stands for the powerset of ${{\mathbb{R}^d}}$.

Dirac's $\delta$ distributions are certainly welcome to "digest" those very high dimensional integrals. However, we may also be interested in probability distributions like

$p\left[ {s\left( 1 \right),s\left( 2 \right),...,s\left( M \right)} \right] \propto \sum\limits_{d = 2}^{ + \infty } {\int\limits_{\wp \left( {{\mathbb{R}^d}} \right)} {{\text{D}}\Gamma \int\limits_{{\mathbb{R}^\Gamma }} {{\text{D}}g\int\limits_{{\Gamma ^\Gamma }} {{\text{D}}f\int\limits_\Gamma {{{\text{d}}^d}z\left( 1 \right)\int\limits_{{\mathbb{R}^ + }} {{\text{d}}\sigma {\sigma ^{ - M}}{e^{ - \sum\limits_{m = 1}^M {\frac{{{{\left\{ {g\left[ {{f^{m - 1}}\left( {z\left( 1 \right)} \right)} \right] - s\left( m \right)} \right\}}^2}}}{{2{\sigma ^2}}}} }}p\left( {\sigma ,z\left( 1 \right),f,g,\Gamma ,d} \right)} } } } } } $

What can you say about those important probability distributions, beyond the fact that they should not be invariant under permutation of the time points, i.e. not De Finetti-exchangeable?
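Permutation invariance can at least be checked numerically in a toy case where the functional integrals collapse to ordinary ones. The sketch below is built entirely on assumptions that are not in the question: $d = 1$, $g$ is the identity, $f_\theta(z) = \theta z$ with a uniform prior on $\theta \in [0,1]$, a uniform prior on $z(1) \in [0,2]$, and a fixed $\sigma$ in the Gaussian-smoothed likelihood. The marginal is approximated by a plain average over a grid; `log_marginal` is a hypothetical helper name.

```python
import numpy as np

def log_marginal(s, sigma=0.05, n_grid=201):
    """Log of the grid-averaged Gaussian-smoothed likelihood of the signal s.

    Toy assumptions: z(m) = theta**(m-1) * z(1), g = identity,
    uniform priors on theta in [0, 1] and z(1) in [0, 2].
    """
    s = np.asarray(s, dtype=float)
    theta = np.linspace(0.0, 1.0, n_grid)[:, None, None]
    z1 = np.linspace(0.0, 2.0, n_grid)[None, :, None]
    m = np.arange(s.size)[None, None, :]
    model = z1 * theta ** m                     # shape (n_grid, n_grid, M)
    sse = ((model - s) ** 2).sum(axis=-1)       # sum of squared misfits
    log_w = -sse / (2.0 * sigma**2)
    peak = log_w.max()                          # log-sum-exp for stability
    return peak + np.log(np.mean(np.exp(log_w - peak)))

signal = np.array([1.0, 0.5, 0.25, 0.125])      # an orbit of z -> z / 2
reversed_signal = signal[::-1]                  # same samples, permuted in time

gap = log_marginal(signal) - log_marginal(reversed_signal)
```

Under this informative prior every admissible orbit is non-increasing, so the time-reversed signal is far less probable than the original one and `gap` is large and positive: the toy marginal is not exchangeable. This says nothing about the noninformative case asked about here, where $f$, $g$ and $\Gamma$ are left free.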

What can you say about such strange-looking functional integrals (for the state-space and output equations $f$ and $g$) and even set-theoretic integrals (for phase space $\Gamma$) over sets having cardinality at least ${\beth_2}$? Are they already well known in some branch of mathematics I do not know yet, or are they only abstract nonsense?

More generally, I'd like to learn more about functional integrals in probability theory. Any pointer would be highly appreciated. Thanks.

asked Apr 27, 2016 in Mathematics by Fabrice Pautot (30 points) [ revision history ]
edited Apr 29, 2016 by Fabrice Pautot

I don't understand the goal of your question. The only difference between the deterministic and the stochastic case is that in the dynamics the coefficient of the noise term is zero. Thus one can use all tools for stochastic time series analysis also in the deterministic case, where only the initial condition is random. (That one cannot easily evaluate certain integrals is a problem one has everywhere.)

commented Apr 28, 2016 by Arnold Neumaier (15,787 points) [ no revision ]

Are you interested in the discrete or the continuous time case?

commented Apr 28, 2016 by Arnold Neumaier (15,787 points) [ no revision ]

Thanks for your comments Arnold.

Regarding comment 2: I'm interested in both the discrete- and continuous-time cases but the discrete-time one is already sufficiently nasty I believe!

Regarding comment 1: suppose the experimental noise is additive. It is common practice to model the sum of the theoretical signal + noise as a stochastic process and to use the tools from stochastic time series analysis/signal processing.

But there are in fact two radically different cases: either the theoretical signal is itself stochastic or it is deterministic. It appears that most of time we are actually assuming, more or less explicitly, the theoretical signal to be itself stochastic.

From this, it also appears that common tools in stochastic time series/signal processing such as Wiener's classical cross-correlation function may not be suitable for deterministic signals. Please see this question on MO, which is the motivation underlying this question:

http://mathoverflow.net/questions/236527/is-there-a-bayesian-theory-of-deterministic-signal-prequel-and-motivation-for-m?rq=1

I'm gonna ask it on PO as well.

So, my goal was precisely to fix classical cross-correlation functions for deterministic signals.

For this purpose, in theory I just need to assign a suitable joint probability distribution for the samples of my discrete-time deterministic signals in order to determine more suitable time series/signal processing tools for deterministic signals.

But when you write down such probability distributions, by marginalizing 1) the initial condition, 2) the state-space equation, 3) the output/observation equation, and 4) the phase space and its dimension, you end up with seemingly monstrous functional integrals that are still unidentified at this time.

Should those probability distributions for deterministic signals also be usual stochastic processes, in particular should they be invariant under permutation of the time points, then classical time series analysis/signal processing tools would work for both stochastic/random and deterministic theoretical signals.

But should they be different from usual stochastic processes because time still plays an essential role in them, while time plays essentially no role in (i.i.d. or De Finetti-exchangeable) stochastic processes, then there would exist two different theories of time series analysis/signal processing: one for stochastic theoretical signals, which we know well, and another one for deterministic signals, which to the best of my knowledge is still waiting to be developed, if we can ever define and compute those monstrous functional integrals.

commented Apr 28, 2016 by Fabrice Pautot [ no revision ]

3 Answers

A discrete stochastic process for $x_t$ with a deterministic dynamics $x_{t+1}=f(x_t)$ is specified by the distribution of the initial condition.

Thus one models $x_0$ as a random vector $x_0(\omega)$ with a measure $d\mu$ on the space $\Omega$ over which $\omega$ varies, and defines $x_{t+1}(\omega):=f(x_t)(\omega)$. This specifies all expectations $$\langle f(x_0,\ldots,x_t)\rangle=\int d\mu(\omega)f(x_0(\omega),\ldots,x_t(\omega))$$ and hence the (highly singular) joint probability distribution. Working with the functional integral is in my opinion overkill in this case.
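The construction above can be sketched by Monte Carlo: sample the initial condition, push every sample through the known map, and average. The logistic map $f(x)=4x(1-x)$, the uniform law for $x_0$ and the name `sample_paths` are assumptions made only for this illustration.

```python
import numpy as np

def f(x):
    """Assumed known dynamics for the example: logistic map with r = 4."""
    return 4.0 * x * (1.0 - x)

def sample_paths(n_samples, T, seed=0):
    """Draw x_0 ~ Uniform(0, 1) and propagate each sample deterministically."""
    rng = np.random.default_rng(seed)
    x = np.empty((n_samples, T + 1))
    x[:, 0] = rng.uniform(0.0, 1.0, n_samples)
    for t in range(T):
        x[:, t + 1] = f(x[:, t])   # no noise term: x_{t+1} = f(x_t)
    return x

paths = sample_paths(200_000, T=3)
# Monte Carlo estimate of <x_1>; the exact value is 2/3, the integral of 4x(1-x) over [0, 1].
mean_x1 = paths[:, 1].mean()
```

All the randomness sits in the first column. Every later column is an exact function of it, which is why the joint distribution is singular.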

If the deterministic model equation is not known, one generally assumes a parametric form $f(x)=F(\theta,x)$ for it. Then all expectations above depend on $\theta$ as well, and one can use experimental data to estimate $\theta$ in the traditional way from a number of empirical expectations.
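A minimal sketch of such a parametric estimate, assuming the one-parameter family $F(\theta,x)=\theta x(1-x)$ (an assumption made for the example, as are the names `F` and `estimate_theta`). Here $\theta$ is recovered from two empirical averages, $\langle x_{t+1}h(x_t)\rangle$ and $\langle h(x_t)^2\rangle$ with $h(x)=x(1-x)$, which coincides with the least-squares estimate.

```python
import numpy as np

def F(theta, x):
    """Assumed parametric family: logistic map with unknown parameter theta."""
    return theta * x * (1.0 - x)

def estimate_theta(x):
    """Moment estimate <x_{t+1} h(x_t)> / <h(x_t)^2> with h(x) = x (1 - x)."""
    h = x[:-1] * (1.0 - x[:-1])
    return np.mean(x[1:] * h) / np.mean(h**2)

theta_true = 3.8
x = np.empty(200)
x[0] = 0.3
for t in range(x.size - 1):
    x[t + 1] = F(theta_true, x[t])

theta_hat = estimate_theta(x)
```

On noise-free data generated from the family itself, `theta_hat` matches `theta_true` up to rounding. With a misspecified family the estimate would still exist but the residuals would not vanish.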

On the other hand, in practical estimation, one always assumes the presence of process noise and estimates it together with the noise in the initial conditions, the noise in the observations, and the parameters of the process. The process can be taken to be deterministic if the standard deviation of the process noise is negligible compared with the signal according to some test for negligible covariance parameters. Indeed, this is the way to numerically distinguish deterministic chaotic time series from stochastic ones. In particular, one can use all standard statistical tools for time series.
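As a rough illustration of this kind of check, one can fit the model and compare the standard deviation of the one-step residuals with that of the signal. The sketch assumes the logistic family $\theta x(1-x)$ with additive Gaussian process noise, and the ratio below is only a crude stand-in for a proper test on the covariance parameters. The names `simulate` and `residual_ratio` are hypothetical.

```python
import numpy as np

def simulate(theta, noise_std, T=500, x0=0.3, seed=0):
    """Logistic dynamics with additive Gaussian process noise, kept in [0, 1]."""
    rng = np.random.default_rng(seed)
    x = np.empty(T)
    x[0] = x0
    for t in range(T - 1):
        step = theta * x[t] * (1.0 - x[t]) + noise_std * rng.standard_normal()
        x[t + 1] = np.clip(step, 0.0, 1.0)
    return x

def residual_ratio(x):
    """Std of one-step residuals of the fitted model, relative to the signal std."""
    h = x[:-1] * (1.0 - x[:-1])
    theta_hat = np.sum(x[1:] * h) / np.sum(h**2)   # least-squares fit of theta
    residuals = x[1:] - theta_hat * h
    return residuals.std() / x.std()

ratio_deterministic = residual_ratio(simulate(3.8, noise_std=0.0))
ratio_noisy = residual_ratio(simulate(3.8, noise_std=0.01))
```

For the noise-free series the ratio sits at rounding level, while for the noisy one it is of the order of the noise-to-signal ratio. Where to put the threshold between the two is a modelling decision that this sketch does not settle.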

answered Apr 28, 2016 by Arnold Neumaier (15,787 points) [ revision history ]
edited May 2, 2016 by Arnold Neumaier

Again, thanks for your inputs Arnold.

(@ArnoldNeumaier is not recognized by PO system so that I hope you'll get my comments).

Just like me, you are considering the case where initial condition $x_0$ has some probability distribution either because it is a random variable in the frequentist world or because it is unknown a priori in the Bayesian world.

But you are considering the case where the state-space equation $f$ is known/given a priori, i.e. has a Dirac's $\delta(f-f_0)$ probability distribution.

In many practical experiments $f$ is known/given a priori. For instance if you experiment with Moon's pendulum then $f$ is Duffing's differential equation (in continuous time).

But in many other experiments the state-space equation is not given a priori. This is often the case in nonlinear deterministic physiological signal analysis for instance.

Many times we don't even know the phase space and its dimension.

In this case, the first step of the analysis is precisely to estimate the phase space's dimension in order to reconstruct the dynamics and the attractor using tools such as Takens' theorem and other embedding methods.

I consider the general case: our knowledge and ignorance about the underlying dynamical system is encoded in a joint prior probability distribution

${p\left( {z\left( 1 \right),f,g,\Gamma ,d} \right)}$

You consider only the special case

$p\left( {z\left( 1 \right),f,g,\Gamma ,d} \right) = p\left( {z\left( 1 \right)} \right)\delta \left( {f - {f_0}} \right)\delta \left( {g - {g_0}} \right)\delta \left( {\Gamma - {\Gamma _0}} \right)\delta \left( {d - {d_0}} \right)$

to be compared to the good old default noninformative improper uniform distribution (principle of insufficient reason)

$p\left( {z\left( 1 \right),f,g,\Gamma ,d} \right) \propto 1$

In other words, you consider the conditional joint prior probability distribution

$p\left[ {\left. {s\left( 1 \right),s\left( 2 \right),...,s\left( M \right)} \right|f,g,\Gamma ,d} \right]$

I consider the unconditional, marginal, average joint prior probability distribution

$p\left[ {s\left( 1 \right),s\left( 2 \right),...,s\left( M \right)} \right]$

Because they match with our experiments, it is important to be able to define and compute the unconditional joint prior probability distributions of deterministic signals.

In particular, it is important to determine whether they are De Finetti-exchangeable, i.e. invariant by permutation of the time points.

If they are De Finetti-exchangeable, then the order does not matter.

In particular, the chronological order, that is the time, does not matter. In this case, the classical theory of stochastic processes applies, because time plays essentially no role in stochastic processes, which are just collections of random variables ${{\text{X}}_t}$ indexed by the time (or more sophisticated stuff).

Here we consider "random variables" that are function of the time instead: ${\text{X}}\left( t \right)$.

If $p\left[ {{\text{X}}\left( t \right)} \right]$ is ever not De Finetti-exchangeable then time could or would still play an essential role despite our drastic state of ignorance about our dynamical system. In this case the theory of stochastic processes might not apply to deterministic signals, independently of the SNR.

(Of course we could still see the joint marginal prior probability distributions as stochastic processes. We would just have two different kinds of stochastic processes, one for random processes and another one for deterministic processes. But "stochastic" fits better with randomness than with determinism.)

The problem is that, when we marginalize the state-space equation, the output equation and the phase space, we fall on integrals that do not look friendly at all...

commented Apr 28, 2016 by Fabrice Pautot [ no revision ]

I added a paragraph to my answer to address this. The parameter $\theta$ there may include information about an unknown dimension of the state space. I see no necessity for introducing all the complexity you talk about. It is irrelevant for the estimation problem.

Moreover, if you don't know the dynamics how can you know that it is a deterministic dynamics? This is why it is wise to include process noise into the model. It usually gives a more parsimonious description still consistent with all data.

(Note that your ping worked, contrary to what you said.)

commented Apr 29, 2016 by Arnold Neumaier (15,787 points) [ no revision ]

@ArnoldNeumaier

Thanks again Arnold, I really appreciate the discussion.

Precisely, if we can ever compute the joint prior probability distributions of deterministic signals $p\left[ {{\text{X}}\left( t \right)} \right]$, we could, at least in theory, test the hypothesis

${{\text{H}}_0}:{\text{X}}\left( t \right){\text{ is deterministic}}$

versus the alternative

${{\text{H}}_1}:{{\text{X}}_t}{\text{ is stochastic/random}}$

with our experimental data.

Indeed, suppose the experimental noise ${\xi _t}$ is additive.

Then under ${{\text{H}}_0}$ our model for our experimental signal ${{\text{S}}_t}$ is

${{\text{H}}_0}:{{\text{S}}_t}{\text{ = X}}\left( t \right) + {\xi _t}$

while under ${{\text{H}}_1}$ our model is

${{\text{H}}_1}:{{\text{S}}_t}{\text{ = }}{{\text{X}}_t} + {\xi _t}$

It is easy to get the joint probability distribution of ${{\text{S}}_t}$ from the joint probability distributions of ${{\text{X}}\left( t \right)}$ or ${{\text{X}}_t}$ and ${\xi _t}$ (because the pdf of the sum of two independent r.v.s is the convolution product of both pdfs).
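The convolution step can be checked numerically in one dimension. In the sketch below both densities are Gaussian stand-ins (standard deviations 1 and 0.5, chosen only for the example), so the result can be compared with the known closed form, a Gaussian with variance $1^2 + 0.5^2$.

```python
import numpy as np

def gaussian_pdf(x, std):
    return np.exp(-x**2 / (2.0 * std**2)) / (np.sqrt(2.0 * np.pi) * std)

# Symmetric grid with an odd number of points, so that mode="same" stays centred.
x = np.linspace(-10.0, 10.0, 2001)
dx = x[1] - x[0]

p_signal = gaussian_pdf(x, 1.0)   # stand-in for the density of the theoretical signal
p_noise = gaussian_pdf(x, 0.5)    # stand-in for the density of the additive noise

# Discrete approximation of the convolution integral.
p_sum = np.convolve(p_signal, p_noise, mode="same") * dx
p_expected = gaussian_pdf(x, np.sqrt(1.0**2 + 0.5**2))
```

The same recipe applies sample by sample only if signal and noise are independent. For a joint density over $M$ time points the convolution is $M$-dimensional.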

So, if $p\left( {{{\text{X}}_t}} \right)$ is ever different from $p\left[ {{\text{X}}\left( t \right)} \right]$ because, for instance, the former is De Finetti-exchangeable but the latter is not, then we could test ${{\text{H}}_0}$ versus ${{\text{H}}_1}$ as usual by computing the corresponding Bayes factors.

Of course, in some cases, the situation could be more complex: the underlying system could be a stochastic dynamical system:

$z\left( {m + 1} \right) = f\left[ {z\left( m \right)} \right] + {\zeta _m}$

where ${\zeta _m}$ is some discrete-time stochastic/random process.

My starting point was precisely to acknowledge that some well-known statistics such as Wiener's cross-correlation function do not look suitable for deterministic signals and processes.

Indeed, if the theoretical signal is stochastic then it is easy to prove that the cross-correlation function is a sufficient statistic for estimating, for instance, the time delay/lag between two such signals corrupted by additive noises (please see the reference in my extended question Is there a Bayesian theory of deterministic signal?).

But the cross-correlation function is not expected to be a sufficient statistic for estimating the lag between two deterministic signals corrupted by additive noises because, at a given time lag, it is invariant under permutation of the time points, so that time is lost.
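Both properties mentioned here, lag estimation by maximizing the cross-correlation and invariance of the zero-lag value under a common permutation of the time points, can be illustrated with a short sketch. The test signal (a logistic-map orbit with $r=4$), the circular shift used as delay, the noise level and the helper names are all assumptions made for the example.

```python
import numpy as np

def logistic_orbit(n, x0=0.3):
    """Deterministic test signal (an assumption): logistic map with r = 4."""
    x = np.empty(n)
    x[0] = x0
    for t in range(n - 1):
        x[t + 1] = 4.0 * x[t] * (1.0 - x[t])
    return x

def circular_cross_correlation(x, y):
    """c[k] = sum_n x[n] * y[(n + k) mod N] for k = 0, ..., N - 1."""
    return np.array([np.dot(x, np.roll(y, -k)) for k in range(len(x))])

rng = np.random.default_rng(0)
x = logistic_orbit(256)
x = x - x.mean()
true_lag = 7
# Delayed copy of x (circular shift) corrupted by additive Gaussian noise.
y = np.roll(x, true_lag) + 0.05 * rng.standard_normal(x.size)

estimated_lag = int(np.argmax(circular_cross_correlation(x, y)))

# Applying the same permutation to both signals leaves the zero-lag value unchanged.
perm = rng.permutation(x.size)
same_value = np.isclose(np.dot(x, y), np.dot(x[perm], y[perm]))
```

The sketch only shows that the statistic recovers the lag in this example and that its value at a fixed lag ignores chronological order. Whether it is sufficient for deterministic signals is exactly the open point of the discussion.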

If we could ever compute the prior probability distributions of deterministic signals $p\left[ {{\text{X}}\left( t \right)} \right]$ then we could, at least in principle, derive the sufficient statistic for the very same problem, which is expected to be quite different from Wiener's cross-correlation function. In particular, it is not expected to be De Finetti-exchangeable at a given time lag.

commented Apr 29, 2016 by Fabrice Pautot [ no revision ]
moved May 2, 2016 by Dilaton

Deciding what you mentioned at the top of your answer (I don't know why it is an answer in any sense) is usually done as I described at the end of my answer, by testing whether the process variance is small enough. No fancy stuff like what you introduce is needed for that.

commented Apr 29, 2016 by Arnold Neumaier (15,787 points) [ no revision ]

Or should this have been a comment to my answer and not an answer?

commented May 1, 2016 by Arnold Neumaier (15,787 points) [ no revision ]

@ArnoldNeumaier

Yes Arnold, that was just a comment following your answer, not an answer to my own question.

I don't understand how my comment became an answer, I've to be more careful!

I'm preparing a comment following your answer's update and I will post it ASAP.

See you... Fabrice

commented May 2, 2016 by Fabrice Pautot [ no revision ]

@ArnoldNeumaier

Hello Arnold,

You are now considering a slightly more general but still highly informative case: you don't know what your state-space equation exactly is, only that it belongs to some parametric family of them.

In this case, the strange integral over all state-space equations is replaced by a usual (multiple) integral over the parameters $\theta$ in the marginal joint prior probability distribution of the signal.

Again, I'm considering the general case, especially the non-informative one: I'd like to know what the mere fact of being deterministic - and nothing else - does imply, if it ever implies anything.

This is a fundamental theoretical question. In particular, I'd like to know whether, despite our ignorance, time/chronological order still plays an essential role or not in the marginal joint prior probability distribution of a deterministic signal.

It is of fundamental interest per se but, as stated above, answering this question could have many practical applications such as testing determinism vs randomness or deriving sufficient statistics for many signal processing problems.

I am surprised and even troubled not to get any sharp answer to such a fundamental question.

Fabrice

commented May 8, 2016 by Fabrice Pautot [ no revision ]

Given only a finite amount of information, the mere fact that the dynamics is deterministic, without specifying a parametric law, says nothing at all. Any finite set of data can be exactly explained by a smooth deterministic model. (This is an ill-posed interpolation problem, with infinitely many solutions.) Typically, the estimated solutions just won't generalize to additional data. This is the reason why in practice one assumes noisy dynamics.
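The interpolation argument can be made concrete with a small sketch. It assumes $d=1$, $g$ equal to the identity, and pairwise distinct values among the first $M-1$ samples. The data values are arbitrary and `interpolating_map` is a hypothetical helper name. A polynomial $f$ of degree $M-2$ is fitted through the pairs $(s(m), s(m+1))$, and iterating it from $s(1)$ reproduces the data.

```python
import numpy as np

def interpolating_map(s):
    """Polynomial f with f(s[m]) = s[m+1]; needs distinct values in s[:-1]."""
    s = np.asarray(s, dtype=float)
    coeffs = np.polyfit(s[:-1], s[1:], deg=len(s) - 2)
    return np.poly1d(coeffs)

# Arbitrary data with no particular structure (an assumption for the example).
data = np.array([0.9, 0.1, 0.6, 0.3, 0.8, 0.45])
f = interpolating_map(data)

# Iterate the fitted map from the first sample.
orbit = [data[0]]
for _ in range(len(data) - 1):
    orbit.append(f(orbit[-1]))
orbit = np.array(orbit)
```

Many other smooth maps pass through the same points, and nothing constrains what any of them predicts for the next sample, which is the ill-posedness referred to above.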

commented May 8, 2016 by Arnold Neumaier (15,787 points) [ no revision ]

@ArnoldNeumaier

Hello Arnold,

Of course, I perfectly agree that any finite set of data can be exactly explained by a smooth deterministic model (and infinitely many of them). This is still true for a countably infinite set of data.

But does this really imply that the mere fact that the dynamics is deterministic without specifying a parametric law says nothing at all?

This is exactly what I'd like to know and the only way I see to answer this question is to compute the marginal, unconditional joint prior probability distribution of a deterministic signal.

Do you see another way? (in my understanding, it might be possible to answer this question without computing the unconditional probability distribution of a deterministic signal. For instance, it might be possible to prove that it is (NOT) De Finetti-exchangeable without computing it explicitly but I don't see how this could be done).

But your point is highly relevant: there are so many state-space equations that it is not easy to see at first glance how to marginalize them. But only the images of a finite (or countably infinite) number $M$ of points under those functions are relevant for our problem in the discrete-time case. For this reason, we may need to consider the sets of all functions having the same given images in order to compute the required functional integral and to marginalize all possible functions/state-space equations $f$...

commented May 9, 2016 by Fabrice Pautot [ no revision ]

@ArnoldNeumaier

By definition, proving or disproving the statement

The mere fact that the dynamics is deterministic without specifying a parametric law says nothing at all.

requires marginalizing at least the state-space equation $f$.

So, I'm desperately looking for some references where state-space equation marginalization is addressed.

Up to now, I've been unable to find anything like this on the Web, hence the question.

Kindest regards,

Fabrice

commented May 9, 2016 by Fabrice Pautot [ no revision ]

Your problem is far too general to be meaningful, and (as I showed) cannot have implications for data arising in practice; hence nobody addresses it. So why do you expect to find references?

Already the parameterized case with many more parameters than data (which is an approximation to your problem) has the same defects but is tractable by my setting. Using functional integrals is heavy overkill.

commented May 10, 2016 by Arnold Neumaier (15,787 points) [ no revision ]

Dear Arnold,

You unfortunately still don't see my point. Let me try again: I'm just asking how to PROVE your OWN statement/claim:

The mere fact that the dynamics is deterministic without specifying a parametric law says nothing at all.

As I explained, this statement/claim would/should translate probabilistically as something like

The marginal, unconditional probability distribution of a real discrete-time deterministic signal of length $M$ is the (improper) uniform distribution over ${\mathbb{R}^M}$. (or another fairly noninformative probability distribution).

Do you agree with this or not?

If you ever have a mathematical proof of this statement, please share it. If you don't, it is just a claim, not mathematics nor physics yet. As explained before, a first step would be to PROVE or DISPROVE that such an unconditional distribution is De Finetti-exchangeable. Again do we have a PROOF? That is the question.

As long as we are not provided with such proofs, we won't know whether or not there exists a (unconditional) theory of deterministic signal in addition to the theory of random signal/stochastic processes we know well.

In particular, we won't know whether classical RANDOM signal processing tools and statistics such as cross-correlation functions are also suitable for deterministic functions, and we won't be able to derive such tools and (sufficient) statistics because the starting point, the probability distribution of the signal, is simply MISSING.

What do you think about my cross-correlation function and covariance example? (please see my question Is there a Bayesian theory of deterministic signal?).

Do you really think that the well-known covariance between two signals is still meaningful for two deterministic signals (please forget the experimental noise once and for all: it comes at the end of the story, not at the beginning) despite the fact that it is invariant under permutation of the time points/De Finetti-exchangeable, so that the chronological order, i.e. the time, is lost?

Can you PROVE that Wiener's cross-correlation function is still a sufficient statistic for estimating the time delay/lag between two deterministic signals corrupted by additive noise, as Scargle has PROVED for two random signals having uniform prior probability distribution on ${\mathbb{R}^M}$?

I'm asking those fundamental questions precisely because I fear they haven't been properly addressed up to now: just claims to elude the questions, no proofs.

We need PROOFS and in order to get them we need to become familiar with seemingly new concepts such as state-space equations marginalization.

commented May 11, 2016 by Fabrice Pautot [ no revision ]

@ArnoldNeumaier

Hello again Arnold,

This time my comment is not at the right place??? Sorry for that.

Hope you will read it: we need proofs, not claims.

We must know. We will know. (Wir müssen wissen. Wir werden wissen.)

Kindest regards. Fabrice

commented May 11, 2016 by Fabrice Pautot [ no revision ]

@ArnoldNeumaier

Hello Arnold. Point by point please:

- "Your problem is far too general to be meaningful." FALSE: conditionally on phase space $\Gamma$ and its dimension $d$, if the state-space equation $f$ and the output equation $g$ were usual "random variables", the required probability distribution would be given by a usual, straightforward multiple integral (whether we can compute it or not is another story). There is nothing too general here; it is just that we are facing seemingly unusual (functional) integrals.
- "(as I showed) cannot have implications for data arising in practice." Sorry, I see no proof from your side, only claims. As far as I know, WE DON'T KNOW. Practical example: please, just derive the sufficient statistic for estimating the lag between two deterministic signals corrupted by additive Gaussian noise. Is it Wiener's cross-correlation function or not? If it is, you might be right. If it isn't, you are wrong. (Generally speaking, it is always risky business to make universal negative Boolean statements, e.g. "E.T. does not exist", unless you have a proof.)
- "Hence nobody addresses it. So why do you expect to find references?" IRRELEVANT, because it is based on the false and hypothetical/unproven premises above. On my side, it is of fundamental interest to determine whether somebody has ever computed the probability distribution of a deterministic signal or not. If nobody has ever done this, be sure I will do my best to do it, because I definitely want to know whether there exists a theory of deterministic signal or not...
- "Already the parameterized case with many more parameters than data (which is an approximation to your problem) has the same defects but is tractable by my setting. Using functional integrals is heavy overkill." FALSE: again, no defect here. It seems to me you are mixing two completely different problems: 1) dynamical system identification/estimation from experimental data, and 2) (theoretical) dynamical system marginalization. I might agree with your point as far as dynamical system identification/estimation is concerned. But I'm concerned with dynamical system marginalization, and again there is no defect, nothing too general here, just (not so) straightforward calculations of marginal probability distributions from the corresponding conditional probability distributions. But I agree with you: as a first step, we could parameterize the state-space equations $f$ (and the output equation $g$ as well), compute the corresponding marginal probability distribution of the signal, then take the limit when the number of parameters tends to infinity. Many state-space equations would be missing but if, for instance, the marginal probability distribution of the signal is already De Finetti-exchangeable, the marginal probability distribution over all state-space equations might also be De Finetti-exchangeable.

commented May 12, 2016 by Fabrice Pautot [ no revision ]

We already know (wir wissen schon). It is not necessary to have a
mathematical formulation of the most general problem if a special case
already gives the answer. Generically, once the number of parameters in
a data estimation problem exceeds the number of data points (and in
practice long before) one can no longer discriminate between signal and
noise. (This is a generalized version of the Nyquist theorem.) If the
nonlinearities in the model are benign enough one can then always fit a
deterministic model to the data.

Thus there is no way to discriminate with a finite amount of data
between a deterministic dynamics and a stochastic one. You can make as
much theory about it as you like, it doesn't change the fact that it
will be useless in practice. If one can't estimate the model one doesn't
need to draw any consequences from it. This settles the case for me, and
everyone practically minded - even though you may find the arguments not
convincing.

By the way, you can model the most general case you are interested in by
using my parameterized model with an infinite-dimensional parameter. But
I am not interested in discussing this further - the discussion volume
is already far out of proportion, and without anyone else being
interested. Also, it has nothing to do with physics.

commented May 16, 2016 by Arnold Neumaier (15,787 points) [ no revision ]

If you want to answer practical questions you can always add a tiny
amount of Gaussian process noise and then take in the answer the limit
of vanishing variance.
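One way to read this suggestion: with Gaussian process noise of standard deviation $\sigma$, the one-step transition density is a Gaussian centred at $f(z)$, and expectations under it tend to their deterministic values as $\sigma \to 0$. The sketch below checks this for a single step, assuming the logistic map for $f$ and the test function $\cos$, for which the Gaussian expectation has the closed form $\cos(\mu)\,e^{-\sigma^2/2}$. The helper name `expected_next` is hypothetical.

```python
import numpy as np

def f(z, r=3.8):
    """Assumed dynamics for the example: logistic map."""
    return r * z * (1.0 - z)

def expected_next(phi, z, sigma, n=4001):
    """E[phi(Z')] for Z' ~ Normal(f(z), sigma^2), by a Riemann sum on a grid."""
    mu = f(z)
    x = np.linspace(mu - 8.0 * sigma, mu + 8.0 * sigma, n)
    dx = x[1] - x[0]
    pdf = np.exp(-(x - mu) ** 2 / (2.0 * sigma**2)) / (np.sqrt(2.0 * np.pi) * sigma)
    return np.sum(phi(x) * pdf) * dx

z = 0.3
noisy_value = expected_next(np.cos, z, sigma=1.0)       # visibly smoothed
limit_value = expected_next(np.cos, z, sigma=1e-4)      # close to cos(f(z))
deterministic_value = np.cos(f(z))
```

As $\sigma$ shrinks the Gaussian kernel acts like the Dirac $\delta$ of the deterministic model, so results derived with process noise reduce to the deterministic ones in the limit, provided the limit exists for the quantity of interest.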

commented May 16, 2016 by Arnold Neumaier (15,787 points) [ no revision ]

@ArnoldNeumaier

Dear Arnold,

Thank you again for your kind reply. Ok, this is my last comment.

As you said, you are still considering the problem of
estimating/identifying/modelling a dynamical system from (noisy)
experimental data. This is a kind of problem I know quite well since
I'm earning my living modelling and processing nonlinear deterministic
signals, in particular physiological signals from
electroencephalography, electromyography, electrooculography or MRI and
Computed Tomography functional imaging of the brain.

My PO question/problem arises from some practical problems in this area:
quantifying the dependency between two signals/time series. For random
signals having for instance improper uniform distribution, it is easy to
prove (see Scargle's paper) that a sufficient statistic for this
problem is the classical covariance. But for deterministic signals,
that's a completely different story: we have many tools such as
nonlinear dependencies, instantaneous phase synchronization via Hilbert
transform, much entropic stuff, etc. See for instance this thesis from
TU Wien:

http://publik.tuwien.ac.at/files/PubDat_189752.pdf

But as far as I know, all of them are adhockeries from the point of view
of Bayesian probability theory: they are not derived from the joint
marginal posterior probability distribution for the current problem. In
particular, for a given problem, there should be only one sufficient
statistics, not dozens of them!

In theory, we just need to compute this joint marginal posterior
probability distribution and marginalize all nuisance parameters in
order to derive our sufficient statistics. See Scargle's paper for an
important, illuminating example. But in order to that, we need to supply
the prior probability distribution of our signal(s).

Here is the main problem and the purpose of my PO question: again, in
theory, we know how to compute the (marginal) prior probability
distribution of our deterministic signal: just marginalize all nuisance
"parameters", which for dynamical systems are (at most) the initial
condition, the state-space equation, the output equation, the phase
space and its dimension.
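
The easiest of these marginalizations, over the initial condition alone, can be sketched by Monte Carlo. Everything specific here is an assumption made for illustration: $f$ is fixed to the logistic map $f(z) = 4z(1-z)$ on $\Gamma = [0,1]$, $g$ is the identity, and $z(1)$ gets a uniform prior. The helper names `sample_signal` and `marginal_moment` are hypothetical. Marginalizing over $f$, $g$ and $\Gamma$ themselves is not attempted.

```python
import random

def f(z):
    """Assumed state-space equation: logistic map on Gamma = [0, 1]."""
    return 4.0 * z * (1.0 - z)

def g(z):
    """Assumed output equation: the state is observed directly."""
    return z

def sample_signal(M, rng):
    """Draw z(1) from a uniform prior on [0, 1] and return s(1), ..., s(M)."""
    z = rng.random()
    signal = []
    for _ in range(M):
        signal.append(g(z))
        z = f(z)
    return signal

def marginal_moment(func, M, n_draws=100_000, seed=0):
    """Monte Carlo estimate of E[func(s)] under the marginal prior of s(1..M)."""
    rng = random.Random(seed)
    return sum(func(sample_signal(M, rng)) for _ in range(n_draws)) / n_draws

mean_s1 = marginal_moment(lambda s: s[0], M=3)
mean_s2 = marginal_moment(lambda s: s[1], M=3)
print(mean_s1, mean_s2)
```

Under these assumptions the exact values are $E[s(1)] = 1/2$ and $E[s(2)] = \int_0^1 4z(1-z)\,dz = 2/3$, so the estimates should land close to them; the marginal of the signal already changes from one time point to the next.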

But in "practice" it is not yet clear, at least to my poor
understanding, whether we can properly define (noninformative) joint
prior probability distributions over those parameters, because some of
them are not usual random variables but functions. Consequently, it is
even less clear how to compute the required marginal prior probability
distribution of the signal(s) from those hypothetical joint prior
probability distributions.

So, starting from some practical problems for which no sufficient
statistic seems to be known, to the best of my knowledge, we finally
arrive at a purely theoretical and mathematical one (which has nothing
to do with dynamical system identification/estimation or modelling). To
my mind it is of fundamental interest, because its solution could give
birth to a new theory of deterministic signal (processing) if it ever
turns out that (noninformative) prior probability distributions of
deterministic signals differ from usual stochastic processes, for
instance by not being invariant under permutation of the time points
(De Finetti-exchangeable), because time, in particular the arrow of
time, still plays an essential role in them. Contrary to your point of
view, at this point I am unable to see any reason why those prior
probability distributions should necessarily match usual, e.g. i.i.d. or
De Finetti-exchangeable, stochastic processes. On the contrary, some of
us (at least myself!) would conjecture that they are NOT De
Finetti-exchangeable, because we are not yet ready to abandon time (and
its arrow) within deterministic dynamical system theory.
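
A worked special case may make the exchangeability question concrete. Assume, purely for illustration, that $f$ is known to be the logistic map $f(z) = 4z(1-z)$ on $\Gamma = [0,1]$, that $g$ is the identity, and that $z(1)$ has a uniform prior, so that $s(1) = z(1)$ and $s(2) = 4z(1)\left[1 - z(1)\right]$. Exchangeability of $\left[s(1), s(2)\right]$ would require $E\left[s(1)^2 s(2)\right] = E\left[s(1) s(2)^2\right]$, but

$E\left[ s(1)^2 s(2) \right] = \int_0^1 4 z^3 (1 - z)\, dz = 4\left( \frac{1}{4} - \frac{1}{5} \right) = \frac{1}{5}$

$E\left[ s(1) s(2)^2 \right] = \int_0^1 16 z^3 (1 - z)^2\, dz = 16\left( \frac{1}{4} - \frac{2}{5} + \frac{1}{6} \right) = \frac{4}{15}$

so in this restricted case the prior of the signal is not exchangeable. This says nothing yet about what happens once $f$, $g$ and $\Gamma$ are marginalized as well, which is the open part of the question.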

That was my last chance to explain my problem.

Finally, I used to believe, together with Henri Poincaré, that
time (arrow), dynamical system theory and Bayesian probability theory
were all part of (mathematical) physics. Please see his Calcul des
Probabilités:

http://visualiseur.bnf.fr/Visualiseur?Destination=Gallica&O=NUMM-29064

Time to say goodbye to you Arnold. Thanks for the many comments!!!

Hope I will continue the discussion with other good fellows.

commented May 16, 2016 by UnknownToSE (505 points) [ no revision ]

Your reference is over 100 years old. Since Kolmogorov, probability
theory has been a branch of mathematics, no longer of physics.

Where do you find Bayesian probability theory in the PACS
classification, which is closest to an official definition of what
belongs to physics?

https://www.aip.org/pacs/pacs2010/individuals/pacs2010_regular_edition/index.html

commented May 16, 2016 by Arnold Neumaier (15,787 points) [ no revision ]

@ArnoldNeumaier

Allow me to reply, Arnold. In short, I can tell you that Poincaré is
right; I have been studying this particular point for the last 20
years. Hint/starting point:

Mister Poincaré, you are wrong. Your works prove that some people
can think only nonsense.

Vladimir Ilyich Ulyanov, better known as Lenin, Materialism and
Empirio-criticism, 1908.

Lenin --> Stalin --> Kolmogorov (Stalin prize, 1941)

Kolmogorov was definitely not allowed to follow Poincaré (= one
way ticket to the Gulag) but, of course, he would have followed.

Read the Grundbegriffe one more time very carefully and then, right
after, one of his last papers, Foundations of probability theory,
1983...

Vladimir Arnold knows about this... I can develop this as much as you
want. My question does not come from outer space...

Fabrice

commented May 16, 2016 by UnknownToSE (505 points) [ no revision ]

@FabricePautot

I have just reinstated some comments that got lost due to our recent technical problems.
Maybe you would like to consider registering an account, such that I can correctly assign all of your contributions to it?

commented May 16, 2016 by Dilaton.admin (0 points) [ no revision ]

@Dilaton.admin

Yes, definitely, I need to register.

I'm happy to see that our discussion with Arnold has finally been restored.

One remark please: French accents have been corrupted due to those recent technical difficulties: for instance Poincaré now displays as Poincar&eacute.

Kindest regards, Fabrice.

commented May 16, 2016 by Fabrice Pautot [ no revision ]

@FabricePautot

Ok, I have just created a thread to claim unregistered contributions

http://physicsoverflow.org/36103/claims-of-unregistered-contributions

Maybe you can answer it as soon as you have registered?

After your contributions are assigned to your registered account, you will have full control over them to edit or correct them etc ...

commented May 16, 2016 by Dilaton (6,240 points) [ no revision ]

+ 1 like - 0 dislike

Conditioning on the event D = {'System deterministic'} does not restrict the search space: infinitely many non-parametrized functions are consistent with it. You will find such a deterministic function when $|\Omega| = 1$ for the chosen probability space. Formulated as an optimization problem, D states, in the best case, that such a minimum exists.

The question is closely related to Kolmogorov complexity, algorithmic information theory and machine learning.

answered Oct 14, 2018 by Vadim [ no revision ]

+ 0 like - 0 dislike

Hi, I am also keen to know how to find the distribution of the histogram of excess power generated from a deterministic signal, normalized by $N\sigma^2$. Suppose the signal has a Gaussian form; then the bin heights of the excess-power histogram start with peaks, fall off, and rise again at the extreme end.

My purpose is to find a distribution for such a histogram.

Thanks

answered Jun 13, 2020 by Pi [ no revision ]
