{{Short description|Statistical proof by contradiction technique}}
'''Surrogate data testing'''<ref name=Theiler1> {{cite journal |author1=J. Theiler |author2=S. Eubank |author3=A. Longtin |author4=B. Galdrikian |author5=J. Doyne Farmer |title=Testing for nonlinearity in time series: the method of surrogate data |journal=Physica D |volume=58 |issue=1–4 |pages=77–94 |year=1992 |doi=10.1016/0167-2789(92)90102-S |bibcode=1992PhyD...58...77T|url=https://digital.library.unt.edu/ark:/67531/metadc1094730/m2/1/high_res_d/6026813.pdf}}</ref> (or the ''method of surrogate data'') is a statistical proof by contradiction technique similar to permutation tests<ref>Moore, Jason H. "Bootstrapping, permutation testing and the method of surrogate data." Physics in Medicine & Biology 44.6 (1999): L11</ref> and parametric bootstrapping. It is used to detect non-linearity in a time series.<ref name=Galka>{{cite book |author=Andreas Galka |title=Topics in Nonlinear Time Series Analysis: with Implications for EEG Analysis |location=River Edge, N.J. |publisher=World Scientific |year=2000 |pages=222–223 |isbn=978-981-02-4148-3}}</ref> The technique involves specifying a null hypothesis <math>H_0</math> describing a linear process and then generating several surrogate data sets according to <math>H_0</math> using Monte Carlo methods. A discriminating statistic is then calculated for the original time series and for each of the surrogate data sets. If the value of the statistic for the original series differs significantly from the values obtained for the surrogates, the null hypothesis is rejected and non-linearity is assumed.<ref name=Galka />
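The procedure can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation of the cited works: it assumes phase-randomised surrogates as the generator for <math>H_0</math>, a time-reversal asymmetry measure as the discriminating statistic, and a two-sided rank-based p-value. The names <code>phase_randomized</code>, <code>time_reversal_asymmetry</code> and <code>surrogate_test</code> are hypothetical.

```python
import numpy as np

def phase_randomized(x, rng):
    """Surrogate with the same Fourier moduli as x but uniformly random phases."""
    n = len(x)
    spectrum = np.fft.rfft(x)
    phases = rng.uniform(0.0, 2.0 * np.pi, size=spectrum.shape)
    new_spectrum = np.abs(spectrum) * np.exp(1j * phases)
    new_spectrum[0] = spectrum[0]            # keep the zero-frequency term (the mean)
    if n % 2 == 0:
        new_spectrum[-1] = spectrum[-1]      # Nyquist term must stay real for even n
    return np.fft.irfft(new_spectrum, n=n)

def time_reversal_asymmetry(x, lag=1):
    """Assumed discriminating statistic: normalised third moment of increments."""
    d = x[lag:] - x[:-lag]
    return np.mean(d ** 3) / np.mean(d ** 2) ** 1.5

def surrogate_test(x, statistic, n_surrogates=99, seed=0):
    """Return the original statistic, the surrogate statistics and a p-value."""
    rng = np.random.default_rng(seed)
    t0 = statistic(x)
    ts = np.array([statistic(phase_randomized(x, rng))
                   for _ in range(n_surrogates)])
    # Two-sided rank test: count surrogates at least as far from the
    # surrogate mean as the original series is.
    centre = ts.mean()
    exceed = np.sum(np.abs(ts - centre) >= np.abs(t0 - centre))
    p = (exceed + 1) / (n_surrogates + 1)
    return t0, ts, p

# Usage on a linear AR(1) series, for which H0 holds by construction.
rng = np.random.default_rng(42)
noise = rng.standard_normal(1024)
x = np.empty(1024)
x[0] = noise[0]
for t in range(1, 1024):
    x[t] = 0.7 * x[t - 1] + noise[t]

t0, ts, p = surrogate_test(x, time_reversal_asymmetry)
print(f"statistic = {t0:.3f}, p = {p:.2f}")
```

With 99 surrogates the smallest attainable p-value in this sketch is 0.01, so the number of surrogates limits the significance level that can be reached.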
The particular surrogate data testing method to be used is directly related to the null hypothesis. Usually this is similar to the following: ''The data is a realization of a stationary linear system, whose output has possibly been measured by a monotonically increasing, possibly nonlinear (but static) function''.<ref name=Theiler1 /> Here ''linear'' means that each value is linearly dependent on past values or on present and past values of some independent identically distributed (i.i.d.) process, usually also Gaussian. This is equivalent to saying that the process is of ARMA type. In the case of flows (continuous mappings), linearity of the system means that it can be expressed by a linear differential equation. In this hypothesis, the ''static'' measurement function is one which depends only on the present value of its argument, not on past ones.
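As an illustration, this null hypothesis can be written out explicitly. The notation is an assumption made here and is not taken from the cited sources: <math>p</math> and <math>q</math> are the ARMA orders, <math>a_i</math> and <math>b_j</math> the coefficients, <math>e_t</math> the i.i.d. (usually Gaussian) process, <math>x_t</math> the unobserved linear process, <math>s_t</math> the measured series and <math>h</math> the monotonically increasing static measurement function.

<math display="block">x_t = \sum_{i=1}^{p} a_i x_{t-i} + \sum_{j=0}^{q} b_j e_{t-j}, \qquad s_t = h(x_t)</math>

Because <math>h</math> acts only on the present value <math>x_t</math>, any nonlinearity it introduces is static and is not counted as dynamical nonlinearity under <math>H_0</math>.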
== Methods ==
Many algorithms to generate surrogate data have been proposed. They are usually classified in two groups:<ref name=Theiler2> {{cite journal |author1=J. Theiler |author2=D. Prichard |title=Constrained-realization Monte-Carlo method for hypothesis testing |journal=Physica D |volume=94 |issue=4 |year=1996 |doi=10.1016/0167-2789(96)00050-4 |pages=221–235 |arxiv=comp-gas/9603001|bibcode=1996PhyD...94..221T|s2cid=12568769 }}</ref>
* ''Typical realizations'': data series are generated as outputs of a well-fitted model to the original data.
* ''Constrained realizations'': data series are created directly from original data, generally by some suitable transformation of it.
The latter methods (constrained realizations) do not depend on a particular model, nor on any parameters, and are thus non-parametric. They are usually based on preserving the linear structure of the original series (for instance, by preserving the autocorrelation function, or equivalently the periodogram, an estimate of the power spectrum).<ref>{{cite journal |title=Testing for nonlinearity in high-dimensional time series from continuous dynamics |author1=A. Galka |author2=T. Ozaki |journal=Physica D |volume=158 |issue=1–4 |year=2001 |pages=32–44 |doi=10.1016/s0167-2789(01)00318-9|bibcode=2001PhyD..158...32G |citeseerx=10.1.1.379.7641 }}</ref> Among constrained-realization methods, the most widely used (which could thus be called the ''classical methods'') are:
# Algorithm 0, or RS (for ''Random Shuffle''):<ref name=Theiler1 /><ref name=Scheinkman> {{cite journal |author1=J.A. Scheinkman |author2=B. LeBaron |title=Nonlinear Dynamics and Stock Returns |journal=The Journal of Business |volume=62 |issue=3 |page=311 |year=1989 |url=https://ideas.repec.org/a/ucp/jnlbus/v62y1989i3p311-37.html |doi=10.1086/296465 |url-access=subscription }}</ref> New data are created simply by random permutations of the original series. This concept is also used in permutation tests. The permutations guarantee the same amplitude distribution as the original series, but destroy any temporal correlation that may have been in the original data. This method is associated with the null hypothesis that the data are uncorrelated i.i.d. noise (possibly Gaussian and measured by a static nonlinear function).
# Algorithm 1, or RP (for ''Random Phases''; also known as FT, for Fourier Transform):<ref name=Theiler1 /><ref name=Osborne> {{cite journal |author1=A.R. Osborne |author2=A.D. Kirwan Jr. |author3=A. Provenzale |author4=L. Bergamasco |title=A search for chaotic behavior in large and mesoscale motions in the Pacific Ocean |journal=Physica D |volume=23 |issue=1–3 |pages=75–83 |year=1986 |doi=10.1016/0167-2789(86)90113-2 |bibcode=1986PhyD...23...75O}}</ref> In order to preserve the linear correlation (the periodogram) of the series, surrogate data are created by taking the inverse Fourier transform of the moduli of the Fourier transform of the original data combined with new (uniformly random) phases. If the surrogates must be real, the Fourier phases must be antisymmetric with respect to the central value of the data.
# Algorithm 2, or AAFT (for ''Amplitude Adjusted Fourier Transform''):<ref name=Theiler1 /><ref name=Theiler2 /> This method approximately combines the advantages of the two previous ones: it tries to preserve both the linear structure and the amplitude distribution. It consists of these steps:
#* Scaling the data to a Gaussian distribution (''Gaussianization'').
#* Performing an RP transformation of the new data.
#* Finally, applying the inverse of the first transformation (''de-Gaussianization'').
#:The drawback of this method is precisely that the last step somewhat changes the linear structure.
# Iterative algorithm 2, or IAAFT (for ''Iterative Amplitude Adjusted Fourier Transform''):<ref name=Schreiber1> {{cite journal |author1=T. Schreiber |author2=A. Schmitz |title=Improved Surrogate Data for Nonlinearity Tests |journal=Phys. Rev. Lett. |volume=77 |issue=4 |year=1996 |doi=10.1103/PhysRevLett.77.635 |pmid=10062864 |pages=635–638 |arxiv=chao-dyn/9909041|bibcode=1996PhRvL..77..635S|s2cid=13193081 }}</ref> This algorithm is an iterative version of AAFT. The steps are repeated until the autocorrelation function is sufficiently similar to the original, or until there is no change in the amplitudes.
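The four classical methods listed above can be sketched with NumPy as follows. This is an illustrative sketch under stated assumptions rather than a reference implementation: the function names are hypothetical, Gaussianization is done here by rank-matching to a sorted Gaussian sample, and the IAAFT loop stops when the rank order no longer changes or after an assumed limit of <code>max_iter</code> iterations. Using the real-input transforms <code>rfft</code> and <code>irfft</code> enforces the phase antisymmetry needed for real surrogates.

```python
import numpy as np

def random_shuffle(x, rng):
    """Algorithm 0 (RS): random permutation of the samples."""
    return rng.permutation(x)

def random_phases(x, rng):
    """Algorithm 1 (RP): original Fourier moduli, new uniform phases."""
    n = len(x)
    spectrum = np.fft.rfft(x)
    phases = rng.uniform(0.0, 2.0 * np.pi, size=spectrum.shape)
    new_spectrum = np.abs(spectrum) * np.exp(1j * phases)
    new_spectrum[0] = spectrum[0]            # keep the mean unchanged
    if n % 2 == 0:
        new_spectrum[-1] = spectrum[-1]      # Nyquist term stays real
    return np.fft.irfft(new_spectrum, n=n)

def aaft(x, rng):
    """Algorithm 2 (AAFT): Gaussianize, randomize phases, de-Gaussianize."""
    ranks = np.argsort(np.argsort(x))                    # rank of each sample
    gauss = np.sort(rng.standard_normal(len(x)))[ranks]  # Gaussianization
    randomized = random_phases(gauss, rng)
    new_ranks = np.argsort(np.argsort(randomized))
    return np.sort(x)[new_ranks]                         # de-Gaussianization

def iaaft(x, rng, max_iter=200):
    """IAAFT: alternately impose the original moduli and amplitudes."""
    n = len(x)
    target_moduli = np.abs(np.fft.rfft(x))
    sorted_x = np.sort(x)
    s = rng.permutation(x)
    previous = None
    for _ in range(max_iter):
        # Step 1: keep current phases, impose the original Fourier moduli.
        phases = np.angle(np.fft.rfft(s))
        s = np.fft.irfft(target_moduli * np.exp(1j * phases), n=n)
        # Step 2: impose the original amplitude distribution by rank ordering.
        ranks = np.argsort(np.argsort(s))
        s = sorted_x[ranks]
        if previous is not None and np.array_equal(ranks, previous):
            break                            # amplitudes no longer change
        previous = ranks
    return s

# Usage on an AR(1) test series (an arbitrary example input).
rng = np.random.default_rng(0)
noise = rng.standard_normal(512)
x = np.empty(512)
x[0] = noise[0]
for t in range(1, 512):
    x[t] = 0.8 * x[t - 1] + noise[t]

surrogates = {f.__name__: f(x, rng)
              for f in (random_shuffle, random_phases, aaft, iaaft)}
```

In this sketch RS, AAFT and IAAFT return exact permutations of the original values, whereas RP preserves the periodogram exactly but not the amplitude distribution.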
Many other surrogate data methods have been proposed, some based on optimizations to achieve an autocorrelation close to the original one,<ref name=Schreiber2> {{cite journal |author1=T. Schreiber |author2=A. Schmitz |title=Surrogate time series |journal=Physica D |volume=142 |issue=3–4 |pages=346–382 |year=2000 |doi=10.1016/S0167-2789(00)00043-9| arxiv=chao-dyn/9909037 |bibcode=2000PhyD..142..346S|s2cid=13889229 }}</ref><ref name=Schreiber3> {{cite journal |author=T. Schreiber |journal=Phys. Rev. Lett. |volume=80 |issue=4 |year=1998 |doi=10.1103/PhysRevLett.80.2105 |title=Constrained Randomization of Time Series Data |pages=2105–2108 |bibcode=1998PhRvL..80.2105S |arxiv=chao-dyn/9909042 |s2cid=42976448 }}</ref><ref name=Engbert> {{cite journal |author=R. Engbert |title=Testing for nonlinearity: the role of surrogate data |journal=Chaos, Solitons & Fractals |volume=13 |issue=1 |year=2002 |doi=10.1016/S0960-0779(00)00236-8 |pages=79–84 |bibcode=2002CSF....13...79E }}</ref> some based on the wavelet transform,<ref name=Breakspear> {{cite journal |author1=M. Breakspear |author2=M. Brammer |author3=P.A. Robinson |title=Construction of multivariate surrogate sets from nonlinear data using the wavelet transform |journal=Physica D |volume=182 |issue=1 |year=2003 |doi=10.1016/S0167-2789(03)00136-2 |pages=1–22 |bibcode=2003PhyD..182....1B}}</ref><ref name=Keylock1> {{cite journal |author=C.J. Keylock |title=Constrained surrogate time series with preservation of the mean and variance structure |journal=Phys. Rev. E |volume=73 |issue=3 |article-number=036707 |year=2006 |doi=10.1103/PhysRevE.73.036707 |pmid=16605698 |bibcode=2006PhRvE..73c6707K }}</ref><ref name=Keylock2> {{cite journal |author=C.J. Keylock |title=A wavelet-based method for surrogate data generation |journal=Physica D |volume=225 |issue=2 |year=2007 |doi=10.1016/j.physd.2006.10.012 |pages=219–228 |bibcode=2007PhyD..225..219K }}</ref> and some capable of dealing with some types of non-stationary data.<ref>{{Cite journal|author1=T. Nakamura|author2=M. Small|year=2005|title=Small-shuffle surrogate data: Testing for dynamics in fluctuating data with trends|journal=Phys. Rev. E|volume=72|issue=5|article-number=056216|doi=10.1103/PhysRevE.72.056216|pmid=16383736|bibcode=2005PhRvE..72e6216N|hdl=10397/4826|hdl-access=free}}</ref><ref name=Nakamura> {{cite journal |author1=T. Nakamura |author2=M. Small |author3=Y. Hirata |title=Testing for nonlinearity in irregular fluctuations with long-term trends |journal=Phys. Rev. E |volume=74 |issue=2 |article-number=026205 |year=2006 |doi=10.1103/PhysRevE.74.026205 |pmid=17025523 |bibcode=2006PhRvE..74b6205N|hdl=10397/7633 |hdl-access=free}}</ref><ref name=Lucio> {{cite journal |author1=J.H. Lucio |author2=R. Valdés |author3=L.R. Rodríguez |title=Improvements to surrogate data methods for nonstationary time series |journal=Phys. Rev. E |volume=85 |issue=5 |article-number=056202 |year=2012 |doi=10.1103/PhysRevE.85.056202 |pmid=23004838 |bibcode=2012PhRvE..85e6202L}}</ref>
The above-mentioned techniques are called linear surrogate methods, because they are based on a linear process and address a linear null hypothesis.<ref name="Schreiber2" /> Broadly speaking, these methods are useful for data showing irregular fluctuations (short-term variability), and data with such behaviour abound in the real world. However, data with obvious periodicity are also often observed, for example annual sunspot numbers and electrocardiograms (ECG). Time series exhibiting strong periodicities are clearly not consistent with the linear null hypotheses. To handle this case, other algorithms and null hypotheses have been proposed.<ref>{{Cite journal|last=J. Theiler|year=1995|title=On the evidence for low-dimensional chaos in an epileptic electroencephalogram|journal=Physics Letters A|volume=196|issue=5–6|pages=335–341|doi=10.1016/0375-9601(94)00856-K|bibcode=1995PhLA..196..335T}}</ref><ref>{{Cite journal|last1=M. Small|last2=D. Yu|last3=R. G. Harrison|year=2001|title=Surrogate test for pseudoperiodic time series data|journal=Phys. Rev. Lett.|volume=87|issue=18|article-number=188101|doi=10.1103/PhysRevLett.87.188101|bibcode=2001PhRvL..87r8101S|hdl=10397/4856|hdl-access=free}}</ref><ref>{{Cite journal|last1=X. Luo|last2=T. Nakamura|last3=M. Small|year=2005|title=Surrogate test to distinguish between chaotic and pseudoperiodic time series|journal=Phys. Rev. E|volume=71|issue=2|article-number=026230|doi=10.1103/PhysRevE.71.026230|pmid=15783410|arxiv=nlin/0404054|bibcode=2005PhRvE..71b6230L|hdl=10397/4828|s2cid=35512941}}</ref>
== See also ==
* Resampling (statistics)
* Permutation test
== References ==
{{reflist|2}}
[[Category:Nonlinear time series analysis]]
[[Category:Statistical tests]]