# Scoring algorithm

> Mediated Wiki article. Canonical URL: https://mediated.wiki/source/Scoring_algorithm
> Markdown URL: https://mediated.wiki/source/Scoring_algorithm.md
> Source: https://en.wikipedia.org/wiki/Scoring_algorithm
> Source revision: 1341267285
> License: Creative Commons Attribution-ShareAlike 4.0 International (https://creativecommons.org/licenses/by-sa/4.0/)

Form of Newton's method used in statistics

**Scoring algorithm**, also known as **Fisher's scoring**,[1] is a form of [Newton's method](/source/Newton's_method) used in [statistics](/source/Statistics) to solve [maximum likelihood](/source/Maximum_likelihood) equations [numerically](/source/Numerical_analysis), named after [Ronald Fisher](/source/Ronald_Fisher).

## Sketch of derivation

Let Y 1 , … , Y n {\displaystyle Y_{1},\ldots ,Y_{n}} be [random variables](/source/Random_variable), independent and identically distributed with twice differentiable [p.d.f.](/source/Probability_density_function) f ( y ; θ ) {\displaystyle f(y;\theta )} , and we wish to calculate the [maximum likelihood estimator](/source/Maximum_likelihood_estimator) (M.L.E.) θ ∗ {\displaystyle \theta ^{*}} of θ {\displaystyle \theta } . First, suppose we have a starting point for our algorithm θ 0 {\displaystyle \theta _{0}} , and consider a [Taylor expansion](/source/Taylor_series) of the [score function](/source/Score_(statistics)), V ( θ ) {\displaystyle V(\theta )} , about θ 0 {\displaystyle \theta _{0}} :

- V ( θ ) ≈ V ( θ 0 ) − J ( θ 0 ) ( θ − θ 0 ) , {\displaystyle V(\theta )\approx V(\theta _{0})-{\mathcal {J}}(\theta _{0})(\theta -\theta _{0}),\,}

where

- J ( θ 0 ) = − ∑ i = 1 n ∇ ∇ ⊤ | θ = θ 0 log ⁡ f ( Y i ; θ ) {\displaystyle {\mathcal {J}}(\theta _{0})=-\sum _{i=1}^{n}\left.\nabla \nabla ^{\top }\right|_{\theta =\theta _{0}}\log f(Y_{i};\theta )}

is the [observed information matrix](/source/Observed_information) at θ 0 {\displaystyle \theta _{0}} . Now, setting θ = θ ∗ {\displaystyle \theta =\theta ^{*}} , using that V ( θ ∗ ) = 0 {\displaystyle V(\theta ^{*})=0} and rearranging gives us:

- θ ∗ ≈ θ 0 + J − 1 ( θ 0 ) V ( θ 0 ) . {\displaystyle \theta ^{*}\approx \theta _{0}+{\mathcal {J}}^{-1}(\theta _{0})V(\theta _{0}).\,}

We therefore use the algorithm

- θ m + 1 = θ m + J − 1 ( θ m ) V ( θ m ) , {\displaystyle \theta _{m+1}=\theta _{m}+{\mathcal {J}}^{-1}(\theta _{m})V(\theta _{m}),\,}

and under certain regularity conditions, it can be shown that θ m → θ ∗ {\displaystyle \theta _{m}\rightarrow \theta ^{*}} .

## Fisher scoring

In practice, J ( θ ) {\displaystyle {\mathcal {J}}(\theta )} is usually replaced by I ( θ ) = E [ J ( θ ) ] {\displaystyle {\mathcal {I}}(\theta )=\mathrm {E} [{\mathcal {J}}(\theta )]} , the [Fisher information](/source/Fisher_information), thus giving us the **Fisher Scoring Algorithm**:

- θ m + 1 = θ m + I − 1 ( θ m ) V ( θ m ) {\displaystyle \theta _{m+1}=\theta _{m}+{\mathcal {I}}^{-1}(\theta _{m})V(\theta _{m})} ..

Under some regularity conditions, if θ m {\displaystyle \theta _{m}} is a [consistent estimator](/source/Consistent_estimator), then θ m + 1 {\displaystyle \theta _{m+1}} (the correction after a single step) is 'optimal' in the sense that its error distribution is asymptotically identical to that of the true max-likelihood estimate.[2]

## See also

- [Score (statistics)](/source/Score_(statistics))

- [Score test](/source/Score_test)

- [Fisher information](/source/Fisher_information)

## References

1. **[^](#cite_ref-1)** Longford, Nicholas T. (1987). "A fast scoring algorithm for maximum likelihood estimation in unbalanced mixed models with nested random effects". *Biometrika*. **74** (4): 817–827. [doi](/source/Doi_(identifier)):[10.1093/biomet/74.4.817](https://doi.org/10.1093%2Fbiomet%2F74.4.817).

1. **[^](#cite_ref-2)** Li, Bing; Babu, G. Jogesh (2019), ["Bayesian Inference"](https://dx.doi.org/10.1007/978-1-4939-9761-9_6), *Springer Texts in Statistics*, New York, NY: Springer New York, Theorem 9.4, [doi](/source/Doi_(identifier)):[10.1007/978-1-4939-9761-9_6](https://doi.org/10.1007%2F978-1-4939-9761-9_6), [ISBN](/source/ISBN_(identifier)) [978-1-4939-9759-6](https://en.wikipedia.org/wiki/Special:BookSources/978-1-4939-9759-6), [S2CID](/source/S2CID_(identifier)) [239322258](https://api.semanticscholar.org/CorpusID:239322258), retrieved 2023-01-03{{[citation](https://en.wikipedia.org/wiki/Template:Citation)}}: CS1 maint: work parameter with ISBN ([link](https://en.wikipedia.org/wiki/Category:CS1_maint:_work_parameter_with_ISBN))

## Further reading

- Jennrich, R. I. & Sampson, P. F. (1976). ["Newton-Raphson and Related Algorithms for Maximum Likelihood Variance Component Estimation"](https://www.tandfonline.com/doi/abs/10.1080/00401706.1976.10489395). *[Technometrics](/source/Technometrics)*. **18** (1): 11–17. [doi](/source/Doi_(identifier)):[10.1080/00401706.1976.10489395](https://doi.org/10.1080%2F00401706.1976.10489395) (inactive 12 July 2025). [JSTOR](/source/JSTOR_(identifier)) [1267911](https://www.jstor.org/stable/1267911).{{[cite journal](https://en.wikipedia.org/wiki/Template:Cite_journal)}}: CS1 maint: DOI inactive as of July 2025 ([link](https://en.wikipedia.org/wiki/Category:CS1_maint:_DOI_inactive_as_of_July_2025))

v t e Optimization: Algorithms, methods, and heuristics Unconstrained nonlinear Functions Golden-section search Powell's method Line search Nelder–Mead method Successive parabolic interpolation Gradients Convergence Trust region Wolfe conditions Quasi–Newton Berndt–Hall–Hall–Hausman Broyden–Fletcher–Goldfarb–Shanno and L-BFGS Davidon–Fletcher–Powell Symmetric rank-one (SR1) Other methods Conjugate gradient Gauss–Newton Gradient Mirror Levenberg–Marquardt Powell's dog leg method Truncated Newton Hessians Newton's method Optimization computes maxima and minima. Constrained nonlinear General Barrier methods Penalty methods Differentiable Augmented Lagrangian methods Sequential quadratic programming Successive linear programming Convex optimization Convex minimization Cutting-plane method Reduced gradient (Frank–Wolfe) Subgradient method Linear and quadratic Interior point Affine scaling Ellipsoid algorithm of Khachiyan Projective algorithm of Karmarkar Basis-exchange Simplex algorithm of Dantzig Revised simplex algorithm Criss-cross algorithm Principal pivoting algorithm of Lemke Active-set method Combinatorial Paradigms Approximation algorithm Dynamic programming Greedy algorithm Integer programming Branch and bound/cut Graph algorithms Minimum spanning tree Borůvka Prim Kruskal Shortest path Bellman–Ford SPFA Dijkstra Floyd–Warshall Network flows Dinic Edmonds–Karp Ford–Fulkerson Push–relabel maximum flow Metaheuristics Evolutionary algorithm Hill climbing Local search Parallel metaheuristics Simulated annealing Spiral optimization algorithm Tabu search Software

---
Adapted from the Wikipedia article [Scoring algorithm](https://en.wikipedia.org/wiki/Scoring_algorithm) by Wikipedia contributors ([contributor history](https://en.wikipedia.org/wiki/Scoring_algorithm?action=history)). Available under [Creative Commons Attribution-ShareAlike 4.0 International](https://creativecommons.org/licenses/by-sa/4.0/). Changes may have been made.