Corecursion

{{Short description|Type of algorithm in computer science}} {{distinguish|Mutual recursion}} {{Merge portions from|Coinduction|discuss=Talk:Corecursion#Merge proposal|date=March 2026}} In computer science, '''corecursion''' is a type of operation that is dual to (structural) recursion. Whereas recursion consumes a data structure by first handling the topmost layer before descending into its inner parts, corecursion produces a data structure by first defining the topmost layer before defining its inner parts. Corecursion is a particularly important in total languages, as it allows encoding potentially non-terminating computations in a context where every function must terminate. It is supported by theorem provers Agda<ref>{{Cite web |last=Agda Authors |title=Coinduction |url=https://agda.readthedocs.io/en/stable/language/coinduction.html |url-status=live |archive-url=https://web.archive.org/web/20260218012134/https://agda.readthedocs.io/en/stable/language/coinduction.html#coinduction |archive-date=2026-02-18 |access-date=2026-03-13 |website=Agda 2.8.0 documentation}}</ref> and Rocq<ref>{{Cite web |last=((Inria, CNRS and contributors)) |title=Coinductive types and corecursive functions |url=https://rocq-prover.org/doc/V9.1.0/refman/language/core/coinductive.html |access-date=2026-03-13 |website=The Rocq Prover 9.1.0 documentation}}</ref>.

Both corecursion and recursion can be thought of as operating on trees, which include data structures like lists and streams as special cases. Since recursion must terminate, it only works on trees that are well-founded, i.e. not infinitely deep, which are called ''data'' or ''initial data types''; on the other hand, corecursion produces ''codata'' or ''final data types'', which includes infinitely deep trees. Codata cannot be represented in memory directly, so is often implemented using self-referential data structures or lazy evaluation.

== Data and codata == In total programming languages, the natural numbers may be defined as follows (using Haskell syntax{{efn|Haskell, being a partial and lazy language, does not support either data or codata; we use its syntax for familiarity.}}):

This states that every natural number is either zero or the successor of an existing natural number. For example, the number one is represented as <code>Succ Zero</code>, two as <code>Succ (Succ Zero)</code>, three as <code>Succ (Succ (Succ Zero))</code> and so on.

If we interpret the above declaration in an ''inductive'' way, then all natural numbers are generated like this, and we get our familiar set of natural numbers. Importantly, we can do recursion{{efn|"Recursion" here is used in the technical sense of structural recursion, which is a type of recursion that analyses the structure of a type and always terminates. In the more general sense of the word used in computer science, corecursion is also a type of recursion.}} on this set: for example, we can use it to make a list that repeats a value <code>n</code> times:

<syntaxhighlight lang="haskell"> repeat :: a -> Nat -> [a] repeat x Zero = [] -- Base case repeat x (Succ n) = x : repeat x n -- Inductive step </syntaxhighlight>

Note that the use of <code>repeat x n</code> is crucial—we could not have written <code>repeat x (Succ n)</code>, because that is the function we are trying to define, and therefore it would loop forever. More specifically, the input must get smaller—or rather deeper, if seen as a data structure—each time we recurse.

The declaration may also be interpreted in a ''coinductive'' way, which may be denoted as:

<syntaxhighlight lang="haskell"> codata CoNat = Zero | Succ CoNat </syntaxhighlight>

<code>CoNat</code> has all the natural numbers that <code>Nat</code> has automatically. But because codata can be infinitely deep, it has an extra term <code>Succ (Succ (Succ (Succ …)))</code> that continues on indefinitely—often denoted ∞.<ref name="conat">{{Cite book |last1=Xie |first1=Szumi |last2=Bense |first2=Viktor |chapter=The Conatural Numbers Form an Exponential Commutative Semiring |date=2025-10-09 |title=Proceedings of the 10th ACM SIGPLAN International Workshop on Type-Driven Development |chapter-url=https://doi.org/10.1145/3759538.3759654 |series=TyDe '25 |location=New York, NY, USA |publisher=Association for Computing Machinery |pages=52–63 |doi=10.1145/3759538.3759654 |isbn=979-8-4007-2163-2}}</ref> This is unique to the conatural numbers, so it can only be constructed using corecursion:

<syntaxhighlight lang="haskell"> infinity :: CoNat infinity = Succ infinity </syntaxhighlight>

While recursion requires that the input of the recursive call must get smaller each time, corecursion requires that the output of the recursive call must get larger each time. This is why we must write <code>Succ infinity</code>; just <code>infinity = infinity</code> would be disallowed as it loops forever.

To illustrate the duality between recursion and corecursion, we can encapsulate them in two functions, <code>rec</code> and <code>corec</code>. These functions' existence, together with the regular two constructors of <code>Nat</code> and <code>CoNat</code>, uniquely identify their respective types, and therefore allow rewriting all forms of recursion and corecursion in terms of them.<ref name="coalgebra" />

<syntaxhighlight lang="haskell">rec :: Nat -> (Maybe c -> c) -> c rec Zero f = f Nothing rec (Succ n) f = f (Just (rec n f))

corec :: (c -> Maybe c) -> c -> CoNat corec f base = case f base of Nothing -> Zero Just x -> Succ (corec f x)</syntaxhighlight>

Conceptually, <code>CoNat</code> can be thought of as a state machine, where <code>c</code> is a type encapsulating the "current state". The function <code>c -> Maybe c</code> is a state transition function that either results in termination with <code>Zero</code> or continuation with <code>Succ</code>. We can rewrite the examples above using the functions:

<syntaxhighlight lang="haskell">repeat x n = rec n (\ s -> case s of Nothing -> [] Just a -> x : a) infinity = corec (\ () -> Just ()) () </syntaxhighlight>

A more complex example of codata is that of binary trees, where each node is either a leaf node, holding some data, or is a branch node that has exactly two children:

<syntaxhighlight lang="haskell"> codata BinaryTree a = Leaf a | Branch (BinaryTree a) (BinaryTree a) </syntaxhighlight>

This example has many more infinitely deep terms that can only be constructed by corecursion. For example, it can be infinitely deep only on the left-hand side, or on both sides, or alternating left and right:

<syntaxhighlight lang="haskell"> infiniteLeft :: a -> BinaryTree a infiniteLeft x = Branch (infiniteLeft x) (Leaf x)

infiniteBoth :: BinaryTree a infiniteBoth = Branch infiniteBoth infiniteBoth

infiniteAlternating :: a -> Bool -> BinaryTree a infiniteAlternating x False = Branch (infiniteAlternating x True) (Leaf x) infiniteAlternating x True = Branch (Leaf x) (infiniteAlternating x False) </syntaxhighlight>

Again, corecursion on binary trees has a more general form based on a "state machine"-like constructor:

<syntaxhighlight lang="haskell"> corec :: (c -> Either a (c, c)) -> c -> BinaryTree a corec f base = case f base of Left x -> Leaf x Right (x, y) -> Branch (corec f x) (corec f y) </syntaxhighlight>

=== M-types === In a dependently-typed setting, codata can be encoded using M-types, which are dual to W-types. Given a type ''A'' and an ''A''-indexed family of types ''B'', one can form the M-type <math>\mathsf{M}_{a:A}B(a)</math>, representing the type of trees whose nodes are labelled with elements of ''A'' and who child nodes are indexed by the set ''B''(''a''). In pseudocode, M-types (and their corecursion principle) may be defined as:<syntaxhighlight lang="haskell"> codata M a (b :: a -> *) = M { root :: a , branch :: b root -> M a b } corec :: (c -> (root :: a, b root -> c)) -> c -> M a b corec f base = let (root, branch) = f base in M { root = root, branch = \ i -> corec f (branch i) } </syntaxhighlight>As an example, the natural and conatural numbers may be constructed as W- and M-types of the same function:<syntaxhighlight lang="haskell"> f :: Bool -> * f False = Void -- Zero case (no branches) f True = () -- Succ case (one branch)

Nat = W Bool f CoNat = M Bool f </syntaxhighlight>M-types can be proven to exist in many elementary topoi, and their existence follows from the existence of W-types.<ref name="mtype">{{Cite journal |last1=van den Berg |first1=Benno |last2=De Marchi |first2=Federico |date=2007-04-01 |title=Non-well-founded trees in categories |url=https://www.sciencedirect.com/science/article/pii/S0168007207000024 |journal=Annals of Pure and Applied Logic |volume=146 |issue=1 |pages=40–59 |doi=10.1016/j.apal.2006.12.001 |issn=0168-0072}}</ref>

== Mathematical description == The above section showed that there is an intimate link between natural and conatural numbers and <code>Maybe c</code>, and similarly between binary trees and <code>Either a (c, c)</code>. This link is made precise by showing that <code>Nat</code> is in bijection with <code>Maybe Nat</code>, and <code>CoNat</code> with <code>Maybe CoNat</code>; explicitly, this bjiection associates <code>Zero</code> with <code>Nothing</code> and <code>Succ n</code> with <code>Just n</code>. In other words, <code>Nat</code> and <code>CoNat</code> are fixed points (up to isomorphism) of the map <math>X \mapsto X + 1</math>.

The difference between data and codata is that data is the least such fixed point, while codata is the greatest. In this way, recursion can be summarized as the statement that "every element of the type was generated only by the given constructors (<code>Zero</code> and <code>Succ</code> in the case of <code>Nat</code>/<code>CoNat</code>)", while corecursion states that "every value that can be analysed using the constructors as cases exists".

From a category-theoretic perspective, <code>CoNat</code> can be precisely defined as the final coalgebra of the endofunctor<math>X \mapsto X + 1</math>.<ref name="coalgebra">{{Citation |last1=Jacobs |first1=Bart |chapter=An introduction to (co)algebra and (co)induction |date=2011 |title=Advanced Topics in Bisimulation and Coinduction |pages=38–99 |editor-last=Sangiorgi |editor-first=Davide |url=https://www.cambridge.org/core/books/advanced-topics-in-bisimulation-and-coinduction/an-introduction-to-coalgebra-and-coinduction/66DC248BC84647119DE50988F591481A |access-date=2026-03-14 |series=Cambridge Tracts in Theoretical Computer Science |place=Cambridge |publisher=Cambridge University Press |doi=10.1017/CBO9780511792588.003 |isbn=978-1-107-00497-9 |last2=Rutten |first2=Jan |editor2-last=Rutten |editor2-first=Jan}}</ref> Unfolding this definition, we have that:

* There is a function <code>pred :: CoNat -> Maybe CoNat</code>. "Pred" stands for "predecessor", and the intuition is that it subtracts one from the conatural number or gives <code>Nothing</code> otherwise. In other words, <code>pred Zero = Nothing</code> and <code>pred (Succ n) = Just n</code>. This shows that <code>CoNat</code> is a coalgebra of the functor, but not necessarily the final one. * For every type <code>c</code> and function <code>f :: c -> Maybe c</code>, there is a function <code>corec f :: c -> CoNat</code> such that <code>pred (corec f x) = fmap (corec f) (f x)</code>.<br/><br/>Breaking this down further, if <code>f x = Just y</code> then <code>corec f x = Succ (corec f y)</code>, while if <code>f x = Nothing</code> then <code>corec f x = Zero</code>—this matches the definition of <code>corec</code> we are familiar with. *<code>corec</code> is ''unique'' in the following sense: for any other function <code>g :: c -> CoNat</code> that also satisfies <code>pred (g x) = fmap g (f x)</code>, <code>g x = corec f x</code>. This ensures that <code>CoNat</code> is a ''final'' coalgebra. <code>Maybe c</code> can be replaced with any other functor in the above description to obtain the definition of any other coinductive type.

Not all functors have final coalgebras. For example, Cantor's theorem tells us that no set can be in bijection with its powerset, and therefore the powerset functor <code>a -> Bool</code> has no final coalgebra. However, in the case of polynomial functors or quotient polynomial functors<ref>{{Cite journal |last1=Avigad |first1=Jeremy |last2=Carneiro |first2=Mario |last3=Hudon |first3=Simon |date=2019 |editor-last=Harrison |editor-first=John |editor2-last=O'Leary |editor2-first=John |editor3-last=Tolmach |editor3-first=Andrew |title=Data Types as Quotients of Polynomial Functors |journal=Lipics, Volume 141, Itp 2019 |series=Leibniz International Proceedings in Informatics (LIPIcs) |location=Dagstuhl, Germany |publisher=Schloss Dagstuhl – Leibniz-Zentrum für Informatik |volume=141 |pages=6:1–6:19 |doi=10.4230/LIPIcs.ITP.2019.6 |isbn=978-3-95977-122-1 |doi-access=free}}</ref>, final coalgebras always exist; polynomial functors are functors that can be expressed in the form <math>\textstyle X \mapsto \sum_{a : \alpha} X^{\beta(a)}</math> for a type ''α'' and ''α''-indexed type family ''β''(''a''). The M-type is exactly the construction of this final coalgebra.<ref name="mtype" />

== Coinduction == While one can reason about coinductive data types using the uniqueness of <code>corec</code> directly, it is often more convenient to use '''coinduction''', which can prove equalities of codata. Given a final coalgebra ''A'' of a polynomial functor ''F''—i.e. ''<math>\textstyle F(X) = \sum_{a : \alpha} X^{\beta(a)}</math>''— we say that a relation <math>R \subseteq A \times A</math> is a bisimulation if, for all <math>(a, b) \in R</math>, <math>\text{root}(a) = \text{root}(b)</math> and <math>\text{branch}(a, i)R\text{branch}(b, i)</math> for all <math>i : B(\text{root}(a))</math>. Here, <math>\text{root}</math> and <math>\text{branch}</math> refer to the maps:

<math>\begin{align} \text{root} &: A \rightarrow \alpha \\ \text{branch} &: \prod_{a : A} \beta(\text{root}(a)) \rightarrow A \end{align}</math>

The principle of coinduction states that for all bisimulations ''R'' on ''A'' and <math>(a, b) \in R</math>, <math>a = b</math>.

More generally, for a coalgebra map <math>\text{get} : A \rightarrow F(A)</math>, a relation ''R'' on ''A'' is a bisimulation if there exists a function <math>f : R \rightarrow F(R)</math> satisfying, for all <math>(a, b) \in R</math>,

* <math>\text{get}(a) = F(\pi_1)(f(a, b))</math> and * <math>\text{get}(b) = F(\pi_2)(f(a, b))</math>,

where ''π''₁ and ''π''₂ are the first and second projection maps <math>R \rightarrow A</math>. It can be seen this equivalent to the polynomial case by setting <math>f(a, b) = (\text{root}(a), i \mapsto (\text{branch}(a, i), \text{branch}(b, i)))</math>. The required equalities can also be expressed as a commutative diagram:

frameless|320x320px

Coinduction is easy to prove from the uniqueness of <code>corec</code>: ''π''₁ and ''π''₂ both satisfy the required property to be equal to <code>corec f</code>, and are therefore equal to each other, thus showing that <math>a = \pi_1(a, b) = \pi_2(a, b) = b</math>.<ref name="coalgebra" />

=== Example: Conatural number addition === To demonstrate the usage of coinduction, we will prove some basic properties about addition on conatural numbers. Addition can be defined identically to how it is defined for natural numbers:<ref name="conat" /><syntaxhighlight lang="haskell">add :: CoNat -> CoNat -> CoNat add a Zero = a add a (Succ b) = Succ (add a b)</syntaxhighlight>This definition, while valid, hides some amount of complexity. We may alternatively write:<syntaxhighlight lang="haskell"> add Zero Zero = Zero add (Succ a) Zero = Succ (add a Zero) add a (Succ b) = Succ (add a b) </syntaxhighlight>This definition makes the translation to <code>corec</code> straightforward:<syntaxhighlight lang="haskell"> iterate :: (CoNat, CoNat) -> Maybe (CoNat, CoNat) iterate Zero Zero = Nothing iterate (Succ a) Zero = Just (a, Zero) iterate a (Succ b) = Just (a, b)

add a b = corec iterate (a, b) </syntaxhighlight>We can prove that <math>a + 0 = a</math> by coinduction on the relation <math>\{(a + 0, a) | a \in \N^\infty\}</math> (where <math>\N^\infty</math> is the set of conatural numbers). First, it follows from the definition of addition that <math>a + 0 = 0 \iff a = 0</math>; second, if we assume that <math>(x + 1, y + 1)</math> is of the form <math>(a + 0, a)</math> for some ''a'', we must show that <math>(x, y)</math> is also of this form. We have: <math display="block">x + 1 = a + 0 = (y + 1) + 0 = (y + 0) + 1</math>Thus <math>x = y + 0</math>, and therefore <math>(x, y) = (y + 0, y)</math> as required.

The identity <math>(a + 1) + b = (a + b) + 1</math> may be similarly proven by coinduction on <math>\{(((a + 1) + b) - 1, a + b) | a, b \in \N^\infty\}</math> (remember the relation must necessarily include <math>(0, 0)</math>); finally, commutativity—<math>a + b = b + a</math>—results from similar straightforward coinduction on <math>\{(a + b, b + a)|a,b \in \N^\infty\}</math>.<ref>{{Cite journal |last=Rutten |first=J. J. M. M. |date=2000-10-17 |title=Universal coalgebra: a theory of systems |journal=Theoretical Computer Science |series=Modern Algebra |volume=249 |issue=1 |pages=3–80 |doi=10.1016/S0304-3975(00)00056-6 |issn=0304-3975 |doi-access=free}}</ref>

==In programming languages== If the domain of discourse is the category of sets and total functions (such as in theorem provers like Agda and Rocq), then final types (codata) may contain infinite, non-wellfounded values, whereas initial types (data) do not.<ref>Barwise and Moss 1996.</ref><ref>Moss and Danner 1997.</ref> On the other hand, if the domain of discourse is the category of complete partial orders and continuous functions, which corresponds roughly to the Haskell programming language, then final types coincide with initial types, and the corresponding final coalgebra and initial algebra form an isomorphism.<ref>Smyth and Plotkin 1982.</ref>

== History == Corecursion, referred to as ''circular programming,'' dates at least to {{Harv|Bird|1984}}, who credits John Hughes and Philip Wadler; more general forms were developed in {{Harv|Allison|1989}}. The original motivations included producing more efficient algorithms (allowing a single pass over data in some cases, instead of requiring multiple passes) and implementing classical data structures, such as doubly linked lists and queues, in functional languages.

== See also == * Bisimulation * Coinduction * Recursion * Anamorphism

== Notes == {{notelist}}

== References == {{Reflist|2}} {{refbegin}} * {{Cite journal | last = Bird | first = Richard Simpson | author-link = Richard Bird (computer scientist)| title = Using circular programs to eliminate multiple traversals of data | doi = 10.1007/BF00264249 | journal = Acta Informatica | volume = 21 | issue = 3 | pages = 239–250 | year = 1984 | s2cid = 27392591 }} * {{cite journal | doi = 10.1002/spe.4380190202 | first = Lloyd |last=Allison | date = April 1989 | title = Circular Programs and Self-Referential Structures | url = http://www.csse.monash.edu.au/~lloyd/tildeFP/1989SPE/ | journal = Software: Practice and Experience | volume = 19 | issue = 2 | pages = 99–109 | s2cid = 21298473 | arxiv = 2403.01866 }} * {{cite tech report | author = Geraint Jones and Jeremy Gibbons | title = Linear-time breadth-first tree algorithms: An exercise in the arithmetic of folds and zips | institution = Dept of Computer Science, University of Auckland | year = 1992 | url = http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.32.5446 }} * {{cite book |author1=Jon Barwise |author2=Lawrence S. Moss |title=Vicious Circles |url=http://www.press.uchicago.edu/presssite/metadata.epl?mode=synopsis&bookkey=3630257 |publisher=Center for the Study of Language and Information |date=June 1996 |isbn=978-1-57586-009-1 |author1-link=Jon Barwise |access-date=2011-01-24 |archive-url=https://web.archive.org/web/20100621142601/http://www.press.uchicago.edu/presssite/metadata.epl?mode=synopsis&bookkey=3630257 |archive-date=2010-06-21 |url-status=dead }} * {{cite journal | doi = 10.1093/jigpal/5.2.231 |author1=Lawrence S. Moss |author2=Norman Danner | title = On the Foundations of Corecursion | journal = Logic Journal of the IGPL | volume = 5 | issue = 2 | pages = 231–257 | year = 1997 |citeseerx=10.1.1.40.4243 }} * {{cite book |author1=Kees Doets |author2=Jan van Eijck | title = The Haskell Road to Logic, Maths, and Programming | url = http://homepages.cwi.nl/~jve/HR/ | publisher = King's College Publications | date = May 2004 | isbn = 978-0-9543006-9-2 }} * {{cite journal | author = David Turner | date = 2004-07-28 | title = Total Functional Programming | url = http://www.jucs.org/jucs_10_7/total_functional_programming | journal = Journal of Universal Computer Science | volume = 10 | issue = 7 | pages = 751–768 | doi = 10.3217/jucs-010-07-0751 | author-link = David Turner (computer scientist) }} * {{cite journal |author1=Jeremy Gibbons |author2=Graham Hutton | title = Proof methods for corecursive programs | journal = Fundamenta Informaticae | volume = 66 | issue = 4 | pages = 353–366 | date = April 2005 | url = http://www.cs.nott.ac.uk/~gmh/bib.html#corecursion }} * {{citation | author = Leon P. Smith | date = 2009-07-29 | title = Lloyd Allison's Corecursive Queues: Why Continuations Matter | url = http://themonadreader.wordpress.com/2009/07/29/issue-14/ | journal = The Monad Reader | issue = 14 | pages = 37–68 }} * {{cite web | author = Raymond Hettinger | title = Recipe 576961: Technique for cyclical iteration | url = http://code.activestate.com/recipes/576961/ | date = 2009-11-19 }} * {{cite journal | author = M. B. Smyth and G. D. Plotkin | year = 1982 | title = The Category-Theoretic Solution of Recursive Domain Equations | journal = SIAM Journal on Computing | volume = 11 | issue = 4 | pages = 761–783 | doi = 10.1137/0211062 | s2cid = 8517995 | url = http://wrap.warwick.ac.uk/46312/1/WRAP_Smyth_cs-rr-014.pdf }} * {{cite book |author1=Leclerc, Francois |author2=Paulin-Mohring, Christine|author2link = Christine Paulin-Mohring | title = Programming with Streams in Coq: A Case Study: the Sieve of Eratosthenes | series = Types for Proofs and Programs: International Workshop TYPES '93. | year = 1993 | isbn = 978-3-540-58085-0 | pages = 191–212 | url = http://dl.acm.org/citation.cfm?id=189973.189981 | publisher = Springer-Verlag New York, Inc. }} {{refend}}

Category:Theoretical computer science Category:Self-reference Category:Articles with example Haskell code Category:Articles with example Python (programming language) code Category:Functional programming Category:Category theory Category:Recursion