# Remez algorithm

> Mediated Wiki article. Canonical URL: https://mediated.wiki/source/Remez_algorithm
> Markdown URL: https://mediated.wiki/source/Remez_algorithm.md
> Source: https://en.wikipedia.org/wiki/Remez_algorithm
> Source revision: 1319373147
> License: Creative Commons Attribution-ShareAlike 4.0 International (https://creativecommons.org/licenses/by-sa/4.0/)

{{Short description|Algorithm to approximate functions}}
The '''Remez algorithm''' or '''Remez exchange algorithm''', published by [Evgeny Yakovlevich Remez](/source/Evgeny_Yakovlevich_Remez) in 1934, is an iterative algorithm used to find simple approximations to functions, specifically, approximations by functions in a [Chebyshev space](/source/Chebyshev_space) that are the best in the [uniform norm](/source/uniform_norm) ''L''<sub>∞</sub> sense.<ref>{{cite journal |author-link=Evgeny Yakovlevich Remez |first=E. Ya. |last=Remez |title=Sur la détermination des polynômes d'approximation de degré donnée |journal=Comm. Soc. Math. Kharkov |volume=10 |pages=41 |date=1934 }}<br/>{{cite journal |author-mask=1 |first=E. |last=Remes (Remez) |title=Sur un procédé convergent d'approximations successives pour déterminer les polynômes d'approximation |journal=Compt. Rend. Acad. Sci. |volume=198 |pages=2063–5 |language=fr |date=1934 |url=https://gallica.bnf.fr/ark:/12148/bpt6k31506/f2063.item}}<br/>{{cite journal |author-mask=1 |first=E. |last=Remes (Remez) |title=Sur le calcul effectif des polynomes d'approximation de Tschebyschef |journal=Compt. Rend. Acad. Sci. |volume=199 |issue= |pages=337–340 |language=fr |date=1934 |url=https://gallica.bnf.fr/ark:/12148/bpt6k3151h/f337.item}}</ref> It is sometimes referred to as '''Remes algorithm''' or '''Reme algorithm'''.<ref>{{Cite journal |last=Chiang |first=Yi-Ling F. |date=November 1988 |title=A Modified Remes Algorithm |url=https://epubs.siam.org/doi/10.1137/0909072 |journal=SIAM Journal on Scientific and Statistical Computing |volume=9 |issue=6 |pages=1058–1072 |doi=10.1137/0909072 |issn=0196-5204|url-access=subscription }}</ref>

A typical example of a Chebyshev space is the subspace of [Chebyshev polynomials](/source/Chebyshev_polynomials) of order ''n'' in the [space](/source/Vector_space) of real [continuous function](/source/continuous_function)s on an [interval](/source/interval_(mathematics)), ''C''[''a'', ''b''].  The polynomial of best approximation within a given subspace is defined to be the one that minimizes the maximum [absolute difference](/source/absolute_difference) between the polynomial and the function. In this case, the form of the solution is precised by the [equioscillation theorem](/source/equioscillation_theorem).

==Procedure==
The Remez algorithm starts with the function <math>f</math> to be approximated and a set <math>X</math> of <math>n + 2</math> sample points <math> x_1, x_2, ...,x_{n+2}</math> in the approximation interval, usually the extrema of Chebyshev polynomial linearly mapped to the interval. The steps are:

* Solve the linear system of equations
:<math> b_0 + b_1 x_i+ ... +b_n x_i ^ n + (-1)^ i E = f(x_i) </math> (where <math> i=1, 2, ... n+2 </math>),
:for the unknowns <math>b_0, b_1...b_n</math> and ''E''.
* Use the <math> b_i </math> as coefficients to form a polynomial <math>P_n</math>.
* Find the set <math>M</math> of points of local maximum error <math>|P_n(x) - f(x)| </math>.
* If the errors at every <math> m \in M </math> are of equal magnitude and alternate in sign, then <math>P_n</math> is the minimax approximation polynomial. If not, replace <math>X</math> with <math>M</math> and repeat the steps above.

The result is called the polynomial of best approximation or the [minimax approximation algorithm](/source/minimax_approximation_algorithm).

A review of technicalities in implementing the Remez algorithm is given by W. Fraser.<ref>{{cite journal |doi=10.1145/321281.321282 |first=W. |last=Fraser |title=A Survey of Methods of Computing Minimax and Near-Minimax Polynomial Approximations for Functions of a Single Independent Variable |journal=J. ACM |volume=12 |pages=295–314 |year=1965 |issue=3 |s2cid=2736060 |doi-access=free }}</ref>

===Choice of initialization===
The [Chebyshev nodes](/source/Chebyshev_nodes) are a common choice for the initial approximation because of their role in the theory of [polynomial interpolation](/source/polynomial_interpolation). For the initialization of the optimization problem for function ''f'' by the Lagrange interpolant ''L''<sub>n</sub>(''f''), it can be shown that this initial approximation is bounded by

:<math>\lVert f - L_n(f)\rVert_\infty \le (1 + \lVert L_n\rVert_\infty) \inf_{p \in P_n} \lVert f - p\rVert</math>

with the norm or [Lebesgue constant](/source/Lebesgue_constant) of the Lagrange interpolation operator ''L''<sub>''n''</sub> of the nodes (''t''<sub>1</sub>, ..., ''t''<sub>''n''&nbsp;+&nbsp;1</sub>) being

:<math>\lVert L_n\rVert_\infty = \overline{\Lambda}_n(T) = \max_{-1 \le x \le 1} \lambda_n(T; x),</math>

''T'' being the zeros of the Chebyshev polynomials, and the Lebesgue functions being

:<math>\lambda_n(T; x) = \sum_{j = 1}^{n + 1} \left| l_j(x) \right|, \quad l_j(x) = \prod_{\stackrel{i = 1}{i \ne j}}^{n + 1} \frac{(x - t_i)}{(t_j - t_i)}.</math>

Theodore A. Kilgore,<ref>{{cite journal |doi=10.1016/0021-9045(78)90013-8 |first=T. A. |last=Kilgore |title=A characterization of the Lagrange interpolating projection with minimal Tchebycheff norm |journal=J. Approx. Theory |volume=24 |pages=273–288 |year=1978 |issue=4 |doi-access= }}</ref> Carl de Boor, and Allan Pinkus<ref>{{cite journal |doi=10.1016/0021-9045(78)90014-X |first1=C. |last1=de Boor |first2=A. |last2=Pinkus |title=Proof of the conjectures of Bernstein and Erdös concerning the optimal nodes for polynomial interpolation |journal=[Journal of Approximation Theory](/source/Journal_of_Approximation_Theory) |volume=24 |pages=289–303 |year=1978 |issue=4 |doi-access=free }}</ref> proved that there exists a unique ''t''<sub>''i''</sub> for each ''L''<sub>''n''</sub>, although not known explicitly for (ordinary) polynomials. Similarly, <math>\underline{\Lambda}_n(T) = \min_{-1 \le x \le 1} \lambda_n(T; x)</math>, and the optimality of a choice of nodes can be expressed as <math>\overline{\Lambda}_n - \underline{\Lambda}_n \ge 0.</math>

For Chebyshev nodes, which provides a suboptimal, but analytically explicit choice, the asymptotic behavior is known as<ref>{{cite journal |first1=F. W. |last1=Luttmann |first2=T. J. |last2=Rivlin |title=Some numerical experiments in the theory of polynomial interpolation |journal=IBM J. Res. Dev. |volume=9 |pages=187–191 |year=1965 |issue=3 |doi= 10.1147/rd.93.0187}}</ref>

:<math>\overline{\Lambda}_n(T) = \frac{2}{\pi} \log(n + 1) + \frac{2}{\pi}\left(\gamma + \log\frac{8}{\pi}\right) + \alpha_{n + 1}</math>

({{math|''γ''}} being the [Euler–Mascheroni constant](/source/Euler%E2%80%93Mascheroni_constant)) with

:<math>0 < \alpha_n < \frac{\pi}{72 n^2}</math> for <math>n \ge 1,</math>

and upper bound<ref>{{cite book |first=T.J. |last=Rivlin |chapter=The lebesgue constants for polynomial interpolation |chapter-url=https://link.springer.com/chapter/10.1007/BFb0063594 |doi=10.1007/BFb0063594 |series=Lecture Notes in Mathematics |volume=399 |editor-last=Garnir |editor-first=H.G. |editor2-last=Unni |editor2-first=K.R. |editor3-last=Williamson |editor3-first=J.H. |title=Functional Analysis and its Applications |publisher=Springer |date=1974 |isbn=978-3-540-37827-3 |pages=422–437 }}</ref>

:<math>\overline{\Lambda}_n(T) \le \frac{2}{\pi} \log(n + 1) + 1</math>

Lev Brutman<ref>{{cite journal |doi=10.1137/0715046 |first=L. |last=Brutman |title=On the Lebesgue Function for Polynomial Interpolation |journal=SIAM J. Numer. Anal. |volume=15 |pages=694–704 |year=1978 |issue=4 |bibcode=1978SJNA...15..694B }}</ref> obtained the bound for <math>n \ge 3</math>, and <math>\hat{T}</math> being the zeros of the expanded Chebyshev polynomials:

:<math>\overline{\Lambda}_n(\hat{T}) - \underline{\Lambda}_n(\hat{T}) < \overline{\Lambda}_3 - \frac{1}{6} \cot \frac{\pi}{8} + \frac{\pi}{64} \frac{1}{\sin^2(3\pi/16)} - \frac{2}{\pi}(\gamma - \log\pi)\approx 0.201.</math>

Rüdiger Günttner<ref>{{cite journal |doi=10.1137/0717043 |first=R. |last=Günttner |title=Evaluation of Lebesgue Constants |journal=SIAM J. Numer. Anal. |volume=17 |pages=512–520 |year=1980 |issue=4 |bibcode=1980SJNA...17..512G }}</ref> obtained from a sharper estimate for <math>n \ge 40</math>

:<math>\overline{\Lambda}_n(\hat{T}) - \underline{\Lambda}_n(\hat{T}) < 0.0196.</math>

==Detailed discussion==
This section provides more information on the steps outlined above. In this section, the index ''i'' runs from 0 to ''n''+1.

'''Step 1:''' Given <math>x_0, x_1, ... x_{n+1}</math>, solve the linear system of ''n''+2 equations
:<math> b_0 + b_1 x_i+ ... +b_n x_i ^ n + (-1) ^ i E = f(x_i) </math> (where <math> i=0, 1, ... n+1 </math>),
:for the unknowns <math>b_0, b_1, ...b_n</math> and ''E''.

It should be clear that <math>(-1)^i E</math> in this equation makes sense only if the nodes <math>x_0, ...,x_{n+1}</math> are ''ordered'', either strictly increasing or strictly decreasing. Then this linear system has a unique solution.  (As is well known, not every linear system has a solution.) Also, the solution can be obtained with only <math>O(n^2)</math> arithmetic operations while a standard solver from the library would take <math>O(n^3)</math> operations.  Here is the simple proof:

Compute the standard ''n''-th degree interpolant <math>p_1(x)</math> to <math>f(x)</math> at the first ''n''+1 nodes and also the standard ''n''-th degree interpolant
<math>p_2(x)</math> to the ordinates <math>(-1)^i</math>
:<math>p_1(x_i) = f(x_i), p_2(x_i) = (-1)^i, i = 0, ..., n.</math>
To this end, use each time [Newton's interpolation formula](/source/Newton_polynomial) with the [divided differences](/source/divided_differences) of order <math>0, ...,n</math> and <math>O(n^2)</math> arithmetic operations.

The polynomial <math>p_2(x)</math> has its ''i''-th zero between <math>x_{i-1}</math> and <math>x_i,\ i=1, ...,n</math>, and thus no further zeroes between <math>x_n</math> and <math>x_{n+1}</math>: <math>p_2(x_n)</math> and <math>p_2(x_{n+1})</math> have the same sign <math>(-1)^n</math>.

The linear combination
<math>p(x) := p_1 (x) - p_2(x)\!\cdot\!E</math> is also a polynomial of degree ''n'' and
:<math>p(x_i) = p_1(x_i) - p_2(x_i)\!\cdot\! E \ = \ f(x_i) - (-1)^i E,\ \ \ \  i =0, \ldots, n.</math>
This is the same as the equation above for <math>i = 0, ... ,n</math> and for any choice of ''E''.
The same equation for ''i'' = ''n''+1 is
:<math>p(x_{n+1}) \ = \ p_1(x_{n+1}) - p_2(x_{n+1})\!\cdot\!E \ = \ f(x_{n+1}) - (-1)^{n+1} E</math> and needs special reasoning:  solved for the variable ''E'', it is the ''definition'' of ''E'':
:<math>E \ := \ \frac{p_1(x_{n+1}) - f(x_{n+1})}{p_2(x_{n+1}) + (-1)^n}.</math>
As mentioned above, the two terms in the denominator have same sign:
''E'' and thus <math>p(x) \equiv b_0 + b_1x + \ldots + b_nx^n</math> are always well-defined.

The error at the given ''n''+2 ordered nodes is positive and negative in turn because
:<math>p(x_i) - f(x_i) \ = \ -(-1)^i E,\ \ i = 0, ... , n\!+\!1. </math>

The [equioscillation theorem](/source/equioscillation_theorem) states that under this condition no polynomial of degree ''n'' exists with error less than ''E''. Indeed, if such a polynomial existed, call it <math>\tilde p(x)</math>, then the difference
<math>p(x)-\tilde p(x) = (p(x) - f(x)) - (\tilde p(x) - f(x))</math> would still be positive/negative at the ''n''+2 nodes <math>x_i</math> and therefore have at least ''n''+1 zeros which is impossible for a polynomial of degree ''n''.
Thus, this ''E'' is a lower bound for the minimum error which can be achieved with polynomials of degree ''n''.

'''Step 2''' changes the notation from
<math>b_0 + b_1x + ... + b_nx^n</math> to <math>p(x)</math>.

'''Step 3''' improves upon the input nodes <math>x_0, ..., x_{n+1}</math> and their errors <math>\pm E</math> as follows.

In each P-region, the current node <math>x_i</math> is replaced with the local maximizer <math>\bar{x}_i</math> and in each N-region <math>x_i</math> is replaced with the local minimizer.  (Expect <math>\bar{x}_0</math> at ''A'', the <math>\bar {x}_i</math> near <math>x_i</math>, and <math>\bar{x}_{n+1}</math> at ''B''.) No high precision is required here,
the standard ''line search'' with a couple of ''quadratic fits''  should suffice. (See <ref>{{cite book |last1=Luenberger |first1=D.G. |last2=Ye |first2=Y. |chapter=Basic Descent Methods |chapter-url=https://link.springer.com/chapter/10.1007/978-0-387-74503-9_8 |title=Linear and Nonlinear Programming |publisher=Springer |edition=3rd |series=International Series in Operations Research & Management Science |volume=116 |date=2008 |isbn=978-0-387-74503-9 |pages=215–262 |doi=10.1007/978-0-387-74503-9_8}}</ref>)

Let <math>z_i := p(\bar{x}_i) - f(\bar{x}_i)</math>. Each amplitude <math>|z_i|</math> is greater than or equal to ''E''. The Theorem of ''de La Vallée Poussin'' and its proof also
apply to <math>z_0, ... ,z_{n+1}</math> with <math>\min\{|z_i|\} \geq E</math> as the new
lower bound for the best error possible with polynomials of degree ''n''.

Moreover, <math>\max\{|z_i|\}</math> comes in handy as an obvious upper bound for that best possible error.

'''Step 4:''' With <math>\min\,\{|z_i|\}</math> and <math>\max\,\{|z_i|\}</math> as lower and upper bound for the best possible [approximation error](/source/approximation_error), one has a reliable stopping criterion: repeat the steps until <math>\max\{|z_i|\} - \min\{|z_i|\}</math> is sufficiently small or no longer decreases. These bounds indicate the progress.

==Variants==
Some modifications of the algorithm are present on the literature.<ref>{{Citation |last1=Egidi |first1=Nadaniela |title=A New Remez-Type Algorithm for Best Polynomial Approximation |date=2020 |url=http://link.springer.com/10.1007/978-3-030-39081-5_7 |work=Numerical Computations: Theory and Algorithms |volume=11973 |pages=56–69 |editor-last=Sergeyev |editor-first=Yaroslav D. |place=Cham |publisher=Springer |doi=10.1007/978-3-030-39081-5_7 |isbn=978-3-030-39080-8 |last2=Fatone |first2=Lorella |last3=Misici |first3=Luciano |s2cid=211159177 |editor2-last=Kvasov |editor2-first=Dmitri E.|url-access=subscription }}</ref> These include:

* Replacing more than one sample point with the locations of nearby maximum absolute differences.{{Citation needed|date=March 2022}}
* Replacing all of the sample points with in a single iteration with the locations of all, alternating sign, maximum differences.<ref name="toobs">{{cite journal |last1=Temes |first1=G.C. |last2=Barcilon |first2=V. |last3=Marshall |first3=F.C. |title=The optimization of bandlimited systems |journal=Proceedings of the IEEE |volume=61 |issue=2 |pages=196–234 |date=1973 |doi=10.1109/PROC.1973.9004 |bibcode=1973IEEEP..61..196T |issn=0018-9219}}</ref>
* Using the relative error to measure the difference between the approximation and the function, especially if the approximation will be used to compute the function on a computer which uses [floating point](/source/floating_point) arithmetic;
* Including zero-error point constraints.<ref name="toobs" />
* The Fraser-Hart variant, used to determine the best rational Chebyshev approximation.<ref>{{Cite journal |last=Dunham |first=Charles B. |date=1975 |title=Convergence of the Fraser-Hart algorithm for rational Chebyshev approximation |url=https://www.ams.org/mcom/1975-29-132/S0025-5718-1975-0388732-9/ |journal=Mathematics of Computation |language=en |volume=29 |issue=132 |pages=1078–1082 |doi=10.1090/S0025-5718-1975-0388732-9 |issn=0025-5718|doi-access=free |url-access=subscription }}</ref>

== See also ==
{{Portal|Mathematics}}
* {{annotated link|Hadamard's lemma}}
* {{annotated link|Laurent series}}
* {{annotated link|Padé approximant}}
* {{annotated link|Newton series}}
* {{annotated link|Approximation theory}}
* {{annotated link|Function approximation}}

==References==
{{Reflist}}

==External links==
*[https://www.boost.org/doc/libs/1_47_0/libs/math/doc/sf_and_dist/html/math_toolkit/toolkit/internals2/minimax.html Minimax Approximations and the Remez Algorithm], background chapter in the [Boost](/source/Boost_(C%2B%2B_libraries)) Math Tools documentation, with link to an implementation in C++
*[http://www.bores.com/courses/intro/filters/4_equi.htm Intro to DSP] {{Webarchive|url=https://web.archive.org/web/20140423185007/http://www.bores.com/courses/intro/filters/4_equi.htm |date=2014-04-23 }}
*{{MathWorld|urlname=RemezAlgorithm|title=Remez Algorithm|author1-link=Ronald Aarts|author=Aarts, Ronald M.|author2=Bond, Charles|author3=Mendelsohn, Phil|author4= Weisstein, Eric W.|name-list-style=amp}}

Category:Polynomials
Category:Approximation theory
Category:Numerical analysis

---
Adapted from the Wikipedia article [Remez algorithm](https://en.wikipedia.org/wiki/Remez_algorithm) by Wikipedia contributors ([contributor history](https://en.wikipedia.org/wiki/Remez_algorithm?action=history)). Available under [Creative Commons Attribution-ShareAlike 4.0 International](https://creativecommons.org/licenses/by-sa/4.0/). Changes may have been made.
