In [[graph theory]] and [[theoretical computer science]], the '''colour refinement algorithm''', also known as '''naive vertex classification''' or the '''1-dimensional version of the [[Weisfeiler Leman graph isomorphism test|Weisfeiler-Leman algorithm]]''', is a routine used for testing whether two graphs are [[Isomorphic graph|isomorphic]].<ref>{{cite book | chapter-url=https://doi.org/10.7551/mitpress/10548.003.0023 | doi=10.7551/mitpress/10548.003.0023 | chapter=Color Refinement and Its Applications | title=An Introduction to Lifted Probabilistic Inference | year=2021 | isbn=9780262365598 | last1=Grohe | first1=Martin | last2=Kersting | first2=Kristian | last3=Mladenov | first3=Martin | last4=Schweitzer | first4=Pascal | s2cid=59069015 }}</ref> While it solves graph isomorphism on almost all graphs, some graphs cannot be distinguished using colour refinement; for example, any two regular graphs with the same number of vertices and the same degree receive the same colouring.

== History ==

The first appearance of colour refinement is in Stephen H. Unger's program GIT for graph isomorphism, where it is called the {{smallcaps|Extend}} method.<ref name="unger">{{cite journal |last=Unger |first=Stephen H. |title=GIT&mdash;A Heuristic Program for Testing Pairs of Directed Line Graphs for Isomorphism |journal=[[Communications of the ACM]] |volume=7 |issue=1 |pages=26–34 |year=1964|doi=10.1145/363872.363899}}</ref> It was described again shortly afterwards in a chemistry paper.<ref>{{Cite journal |last=Morgan |first=H. L. |date=1965-05-01 |title=The Generation of a Unique Machine Description for Chemical Structures-A Technique Developed at Chemical Abstracts Service. |url=https://doi.org/10.1021/c160017a018 |journal=Journal of Chemical Documentation |volume=5 |issue=2 |pages=107–113 |doi=10.1021/c160017a018 |issn=0021-9576|url-access=subscription }}</ref>

== Description ==

The algorithm takes as input a graph <math> G </math> with <math> n </math> vertices. It proceeds in iterations, and each iteration produces a new colouring of the vertices. Formally, a "[[Graph coloring|colouring]]" is a function from the vertices of this graph into some set (of "colours"). The sequence of vertex colourings <math> \lambda_i </math> is defined as follows:

* <math> \lambda_0 </math> is the initial colouring. If the graph is unlabelled, the initial colouring assigns the same trivial colour <math>\lambda_0(v)</math> to each vertex <math>v</math>. If the graph is labelled, <math>\lambda_0(v)</math> is the label of vertex <math>v</math>.
* For all vertices <math>v</math>, we set <math>\lambda_{i+1}(v) = \left(\lambda_i(v), \{\{ \lambda_i(w) \mid w \text{ is a neighbor of } v\}\}\right)</math>.

In other words, the new colour of the vertex <math>v</math> is the pair formed from the previous colour and the [[multiset]] of the colours of its neighbours. This algorithm keeps refining the current colouring. At some point it stabilises, i.e., for all vertices <math>u</math> and <math>v</math>, <math>\lambda_{i+1}(u)=\lambda_{i+1}(v)</math> if and only if <math>\lambda_i(u)=\lambda_i(v)</math>. This final colouring is called the ''stable colouring''.
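
The iteration described above can be sketched in Python as follows. This is a minimal illustration rather than an optimised implementation: the function name <code>colour_refinement</code>, the adjacency-dictionary input format, and the renaming of colours to small integers in every round are assumptions made for the example. Initial labels, if given, are assumed to be hashable and mutually comparable.

```python
def colour_refinement(adj, initial=None):
    """Return a stable colouring as a dict mapping vertex -> integer colour.

    adj: dict mapping each vertex to a list of its neighbours (hypothetical format).
    initial: optional dict of vertex labels; without it every vertex starts
             with the same trivial colour 0.
    """
    colours = {v: (initial[v] if initial else 0) for v in adj}
    while True:
        # New colour = (old colour, multiset of neighbour colours).
        # The multiset is represented as a sorted tuple.
        signatures = {
            v: (colours[v], tuple(sorted(colours[w] for w in adj[v])))
            for v in adj
        }
        # Rename the signatures to small integers so colours stay short.
        names = {s: i for i, s in enumerate(sorted(set(signatures.values())))}
        refined = {v: names[signatures[v]] for v in adj}
        # Each round can only split colour classes, so an unchanged
        # number of classes means the colouring is stable.
        if len(names) == len(set(colours.values())):
            return refined
        colours = refined


# Path on three vertices: the two end vertices share a colour.
path = {0: [1], 1: [0, 2], 2: [1]}
print(colour_refinement(path))  # {0: 0, 1: 1, 2: 0}
```

Renaming the colours in each round keeps them from growing into deeply nested pairs; only the partition of the vertices into colour classes matters for stability.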

== Graph Isomorphism ==

Colour refinement can be used as a subroutine for an important [[computational problem]]: [[Graph isomorphism problem|graph isomorphism]]. In this problem we have as input two graphs <math> G, H </math> and our task is to determine whether they are [[Graph isomorphism|isomorphic]]. Informally, this means that the two graphs are the same up to relabelling of vertices.

To test whether <math> G </math> and <math> H </math> are isomorphic, we could try the following. Run colour refinement on both graphs. If the stable colourings differ, that is, if some colour is assigned to a different number of vertices in <math> G </math> than in <math> H </math>, we know that the two graphs are not isomorphic. However, it could be that the same stable colouring is produced despite the two graphs not being isomorphic; see below.
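
One way to make the colours of the two graphs comparable is to run colour refinement once on their disjoint union and then compare how often each colour occurs on either side. The following Python sketch illustrates this under stated assumptions: the names <code>stable_colouring</code> and <code>distinguished</code> and the adjacency-dictionary format are hypothetical, and both graphs are treated as unlabelled.

```python
from collections import Counter


def stable_colouring(adj):
    """Colour refinement with a trivial initial colouring (vertex -> int)."""
    colours = {v: 0 for v in adj}
    while True:
        sigs = {
            v: (colours[v], tuple(sorted(colours[w] for w in adj[v])))
            for v in adj
        }
        names = {s: i for i, s in enumerate(sorted(set(sigs.values())))}
        refined = {v: names[sigs[v]] for v in adj}
        if len(names) == len(set(colours.values())):
            return refined
        colours = refined


def distinguished(adj_g, adj_h):
    """Return True if colour refinement tells the two graphs apart."""
    # Tag the vertices so the two vertex sets cannot clash in the union.
    union = {("G", v): [("G", w) for w in nbrs] for v, nbrs in adj_g.items()}
    union.update(
        {("H", v): [("H", w) for w in nbrs] for v, nbrs in adj_h.items()}
    )
    colours = stable_colouring(union)
    hist_g = Counter(c for (side, _), c in colours.items() if side == "G")
    hist_h = Counter(c for (side, _), c in colours.items() if side == "H")
    return hist_g != hist_h


# A path and a star on four vertices have different degree sequences.
path4 = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
star4 = {0: [1, 2, 3], 1: [0], 2: [0], 3: [0]}
print(distinguished(path4, star4))  # True
```

A result of <code>True</code> proves that the graphs are not isomorphic, whereas <code>False</code> is inconclusive.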

== Complexity ==

If colour refinement is given an <math> n </math>-vertex graph as input, a stable colouring is produced after at most <math> n-1 </math> iterations: every iteration before stabilisation splits at least one colour class, and there can be at most <math> n </math> classes. Conversely, there exist graphs where this bound is realised.<ref>{{Citation |last1=Kiefer |first1=Sandra |title=The Iteration Number of Colour Refinement |date=2020-05-20 |arxiv=2005.10182 |last2=McKay |first2=Brendan D.}}</ref> The stable colouring can be computed in <math> O((n+m)\log n) </math> time, where <math>n </math> is the number of vertices and <math>m </math> the number of edges.<ref>{{Cite journal |last1=Cardon |first1=A. |last2=Crochemore |first2=M. |date=1982-07-01 |title=Partitioning a graph in O(¦A¦log2¦V¦) |journal=Theoretical Computer Science |language=en |volume=19 |issue=1 |pages=85–98 |doi=10.1016/0304-3975(82)90016-0 |issn=0304-3975|doi-access=free }}</ref> This complexity has been proven to be optimal under reasonable assumptions.<ref>{{Cite journal |last1=Berkholz |first1=Christoph |last2=Bonsma |first2=Paul |last3=Grohe |first3=Martin |date=2017-05-01 |title=Tight Lower and Upper Bounds for the Complexity of Canonical Colour Refinement |journal=Theory of Computing Systems |language=en |volume=60 |issue=4 |pages=581–614 |doi=10.1007/s00224-016-9686-0 |s2cid=12616856 |issn=1433-0490|doi-access=free |arxiv=1509.08251 }}</ref>

== Expressivity ==

We say that two graphs <math> G </math> and <math> H </math> are ''distinguished'' by colour refinement if the algorithm yields a different output on <math> G </math> than on <math> H </math>. There are simple examples of graphs that are not distinguished by colour refinement. For example, it does not distinguish a cycle of length 6 from a pair of triangles (example V.1 in <ref>{{Cite book |last=Grohe |first=Martin |title=2021 36th Annual ACM/IEEE Symposium on Logic in Computer Science (LICS) |chapter=The Logic of Graph Neural Networks |date=2021-06-29 |chapter-url=https://doi.org/10.1109/LICS52264.2021.9470677 |series=LICS '21 |location=New York, NY, USA |publisher=Association for Computing Machinery |pages=1–17 |doi=10.1109/LICS52264.2021.9470677 |arxiv=2104.14624 |isbn=978-1-6654-4895-6|s2cid=233476550 }}</ref>). Despite this, the algorithm is very powerful in that a [[random graph]] will be identified by the algorithm asymptotically almost surely.<ref>{{cite journal |last1=Babai |first1=László |last2=Erdős |first2=Paul |last3=Selkow |first3=Stanley M. |title=Random Graph Isomorphism |journal=SIAM Journal on Computing |date=August 1980 |volume=9 |issue=3 |pages=628–635 |doi=10.1137/0209047 |url=https://epubs.siam.org/doi/10.1137/0209047 |language=en |issn=0097-5397|url-access=subscription }}</ref> Even stronger, it has been shown that as <math> n </math> increases, the proportion of graphs that are ''not'' identified by colour refinement decreases exponentially in <math> n </math>.<ref>{{Cite book| last1=Babai |first1=L. | last2= Kucera |first2=K.|title=20th Annual Symposium on Foundations of Computer Science (SFCS 1979) | chapter=Canonical labelling of graphs in linear average time |chapter-url=https://ieeexplore.ieee.org/document/4567999/;jsessionid=3nsdXVGO6TSsLdiJnZQ6slDYPxa-Qyh0XugyK5ti0b5TpRiyrKyo!-452107954 |access-date=2024-01-18 |date=1979 |pages=39–46 |doi=10.1109/SFCS.1979.8 }}</ref>
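
The failure on the 6-cycle and the pair of triangles can be checked by hand. As a short worked computation, assume an unlabelled <math>d</math>-regular graph whose trivial initial colour is written <math>c</math> (a symbol introduced here only for illustration). One refinement step gives:

```latex
\lambda_0(v) = c
\quad\Longrightarrow\quad
\lambda_1(v) = \bigl(c,\ \{\{\underbrace{c, \dots, c}_{d \text{ copies}}\}\}\bigr)
\quad \text{for every vertex } v .
```

All vertices again share a single colour, so the colouring is already stable and carries no information beyond the number of vertices and the degree <math>d</math>. The 6-cycle and the pair of triangles are both 2-regular graphs on six vertices, so colour refinement produces the same output on both.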

==Equivalent Characterizations==

For two graphs <math>G</math> and <math>H</math> with the same number of vertices, the following conditions are equivalent:
* <math>G</math> and <math>H</math> are indistinguishable by colour refinement.
* The [[Fibrations of graphs|minimum fibration bases]] of <math>G</math> and <math>H</math> are isomorphic.
* <math>G</math> and <math>H</math> are [[Fractional graph isomorphism|fractionally isomorphic]].<ref>{{cite journal |last1=Tinhofer |first1=Gottfried |title=Graph isomorphism and theorems of Birkhoff type |journal=Computing |date=December 1986 |volume=36 |issue=4 |pages=285–300 |doi=10.1007/BF02240204 |url=https://link.springer.com/article/10.1007/BF02240204|url-access=subscription }}</ref><ref>{{cite journal |last1=Tinhofer |first1=Gottfried |title=A note on compact graphs |journal=Discrete Applied Mathematics |date=February 1991 |volume=30 |issue=2–3 |pages=253–264|doi=10.1016/0166-218X(91)90049-3 |url=https://dx.doi.org/10.1016/0166-218X%2891%2990049-3|url-access=subscription }}</ref>
* <math>G</math> and <math>H</math> have a common coarsest [[Fractional graph isomorphism#Equivalence to coarsest equitable partition|equitable partition]].
* <math>G</math> and <math>H</math> have the same [[Covering graph#Universal cover|universal cover]].<ref>{{cite book |last1=Krebs |first1=Andreas |last2=Verbitsky |first2=Oleg |chapter=Universal Covers, Color Refinement, and Two-Variable Counting Logic: Lower Bounds for the Depth |title=2015 30th Annual ACM/IEEE Symposium on Logic in Computer Science |date=2015 |volume=30 |pages=689–700 |doi=10.1109/LICS.2015.69 |isbn=978-1-4799-8875-4 |chapter-url=https://dl.acm.org/doi/10.1109/LICS.2015.69}}</ref>
* For all [[Tree (graph theory)|trees]] <math>T</math>, the number of [[Graph homomorphism|homomorphisms]] from <math>T</math> to <math>G</math> equals the number of homomorphisms from <math>T</math> to <math>H</math>.<ref>{{cite book |last1=Dell |first1=Holger |last2=Grohe |first2=Martin |last3=Rattan |first3=Gaurav |title=Lovász Meets Weisfeiler and Leman |series=Leibniz International Proceedings in Informatics (LIPIcs) |date=2018 |volume=45 |pages=40:1–40:14 |publisher=Schloss Dagstuhl – Leibniz-Zentrum für Informatik |doi=10.4230/LIPIcs.ICALP.2018.40 |isbn=978-3-95977-076-7 |doi-access=free }}</ref>
* <math>G</math> and <math>H</math> cannot be distinguished by the [[Two-variable logic|two variable fragment of first order logic]] with [[Counting quantification|counting]].<ref>Grohe, Martin. "Finite variable logics in descriptive complexity theory." Bulletin of Symbolic Logic 4.4 (1998): 345-398.</ref>
* Any [[Graph neural network#Message passing layers|message passing graph neural network]] will map <math>G</math> and <math>H</math> to the same output, if the input node features are the initial colours <math>\lambda_0</math>.<ref>{{Cite conference | last1 = Morris | first1 = Christopher | last2 = Ritzert | first2 = Martin | last3 = Fey | first3 = Matthias | last4 = Hamilton | first4 = William L. | last5 = Lenssen | first5 = Jan Eric | last6 = Rattan | first6 = Gaurav | last7 = Grohe | first7 = Martin | title = Weisfeiler and Leman Go Neural: Higher-Order Graph Neural Networks | book-title = Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence and Thirty-First Innovative Applications of Artificial Intelligence Conference and Ninth AAAI Symposium on Educational Advances in Artificial Intelligence | series = AAAI'19 | publisher = AAAI Press | location = Honolulu, Hawaii, USA | year = 2019 | isbn = 978-1-57735-809-1 | pages = 565–572 | doi = 10.1609/aaai.v33i01.33014602 | url = https://doi.org/10.1609/aaai.v33i01.33014602 | arxiv = 1810.02244 }}</ref><ref>{{Cite conference | last1 = Xu | first1 = Keyulu | last2 = Hu | first2 = Weihua | last3 = Leskovec | first3 = Jure | last4 = Jegelka | first4 = Stefanie | title = How Powerful are Graph Neural Networks? | book-title = International Conference on Learning Representations (ICLR) | year = 2019 | url = https://openreview.net/forum?id=ryGs6iA5Km }}</ref>
* Any synchronous anonymous algorithm with broadcast/mailbox message passing in which the input depends on the colour only will generate an output that depends, again, on the initial colour only.<ref name="boldivigna_effective">{{cite conference |last1=Boldi |first1=Paolo |last2=Vigna |first2=Sebastiano |title=An Effective Characterization of Computability in Anonymous Networks |book-title=Distributed Computing (DISC 2001) |series=Lecture Notes in Computer Science|volume=2180 |year=2001|pages=33–47|publisher=Springer-Verlag|doi=10.1007/3-540-45414-4_3}}</ref>
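
The characterization by tree homomorphism counts can be checked on small instances. The Python sketch below counts homomorphisms from a rooted tree into a graph by dynamic programming over the tree; the function name <code>count_tree_homomorphisms</code> and the adjacency-dictionary format are assumptions made for this illustration. It compares the 6-cycle with a pair of triangles, two graphs that colour refinement does not distinguish.

```python
def count_tree_homomorphisms(tree, graph, root):
    """Count homomorphisms from a tree into a graph.

    tree, graph: dicts mapping each vertex to a list of its neighbours
    (hypothetical input format). root: any vertex of the tree.
    """
    def counts_below(t, parent):
        # counts[v] = number of homomorphisms of the subtree rooted at t
        # that send t to the graph vertex v.
        counts = {v: 1 for v in graph}
        for child in tree[t]:
            if child == parent:
                continue
            below = counts_below(child, t)
            for v in graph:
                # The child must be mapped to some neighbour of v.
                counts[v] *= sum(below[w] for w in graph[v])
        return counts

    return sum(counts_below(root, None).values())


cycle6 = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}
two_triangles = {0: [1, 2], 1: [0, 2], 2: [0, 1],
                 3: [4, 5], 4: [3, 5], 5: [3, 4]}
star3 = {0: [1, 2, 3], 1: [0], 2: [0], 3: [0]}  # a tree with three leaves

print(count_tree_homomorphisms(star3, cycle6, 0))         # 48
print(count_tree_homomorphisms(star3, two_triangles, 0))  # 48
```

Both counts agree, as the equivalence predicts. Checking a single tree is only a sanity check, since the characterization quantifies over all trees.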

==References==
{{Reflist}}

[[Category:Graph algorithms]]