{{Short description|Graphical representation of music}}
thumb|400px|right|(a) Musical score of a C-major scale. (b) Chromagram obtained from the score. (c) Audio recording of the C-major scale played on a piano. (d) Chromagram obtained from the audio recording.
In Western music, the term '''''chroma feature''''' or '''''chromagram''''' closely relates to the twelve different pitch classes. Chroma-based features, which are also referred to as "pitch class profiles", are a powerful tool for analyzing music whose pitches can be meaningfully categorized (often into twelve categories) and whose tuning approximates the equal-tempered scale. A key property of chroma features is that they capture harmonic and melodic characteristics of music while being robust to changes in timbre and instrumentation.
==Definition==
The underlying observation is that humans perceive two musical pitches as similar in color if they differ by an octave. Based on this observation, a pitch can be separated into two components, which are referred to as ''tone height'' and ''chroma''.<ref name=Shepard64_pitch_ASA> {{cite journal |last=Shepard |first=Roger N. |title=Circularity in judgments of relative pitch |journal=Journal of the Acoustical Society of America |volume=36 |issue=12 |date=1964 |pages=2346–2353|doi=10.1121/1.1919362 |bibcode=1964ASAJ...36.2346S }} </ref> Assuming the equal-tempered scale, one considers twelve chroma values represented by the set
:{C, C{{music|#}}, D, D{{music|#}}, E, F, F{{music|#}}, G, G{{music|#}}, A, A{{music|#}}, B}
that consists of the twelve pitch spelling attributes as used in Western music notation. Note that in the equal-tempered scale, different pitch spellings such as C{{music|#}} and D{{music|b}} refer to the same chroma. Enumerating the chroma values, one can identify the set of chroma values with the set of integers {1,2,...,12}, where 1 refers to chroma C, 2 to C{{music|#}}, and so on. A pitch class is defined as the set of all pitches that share the same chroma. For example, using scientific pitch notation, the pitch class corresponding to the chroma C is the set
:{..., C<sub>−2</sub>, C<sub>−1</sub>, C<sub>0</sub>, C<sub>1</sub>, C<sub>2</sub>, C<sub>3</sub> ...}
consisting of all pitches separated by an integer number of octaves. Given a music representation (e.g., a musical score or an audio recording), the main idea of chroma features is to aggregate, for a given local time window (e.g., specified in beats or in seconds), all information that relates to a given chroma into a single coefficient. Shifting the time window across the music representation results in a sequence of chroma features, each expressing how the representation's pitch content within the time window is spread over the twelve chroma bands. The resulting time-chroma representation is also referred to as a chromagram. The figure above shows chromagrams for a C-major scale, once obtained from a musical score and once from an audio recording. Because of the close relation between the terms chroma and pitch class, chroma features are also referred to as pitch class profiles.
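The definitions above can be illustrated with a minimal sketch for a score-like representation. It assumes MIDI note numbers (60 = C<sub>4</sub>), which the text does not prescribe; the function names <code>chroma_index</code>, <code>tone_height</code>, and <code>score_chromagram</code>, as well as the note format (onset, duration, pitch) measured in beats, are hypothetical choices made for illustration only.

```python
# Chroma names in the order used above; list position 0 holds chroma number 1 (C).
CHROMA_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def chroma_index(midi_pitch):
    """Chroma number in 1..12 (1 = C), assuming MIDI numbering with 60 = C4."""
    return midi_pitch % 12 + 1


def tone_height(midi_pitch):
    """Octave number in scientific pitch notation, assuming MIDI 60 = C4."""
    return midi_pitch // 12 - 1


def score_chromagram(notes, window, num_frames):
    """Aggregate note durations per chroma for consecutive time windows.

    notes: list of (onset, duration, midi_pitch) tuples, times in beats.
    Returns num_frames lists of 12 coefficients (overlap duration per chroma).
    """
    frames = [[0.0] * 12 for _ in range(num_frames)]
    for onset, duration, pitch in notes:
        for n in range(num_frames):
            start, end = n * window, (n + 1) * window
            # Length of the note's overlap with the current window.
            overlap = min(end, onset + duration) - max(start, onset)
            if overlap > 0:
                frames[n][chroma_index(pitch) - 1] += overlap
    return frames


# C-major scale, one note per beat, from C4 (60) up to C5 (72).
scale = [(float(i), 1.0, p) for i, p in enumerate([60, 62, 64, 65, 67, 69, 71, 72])]
chromagram = score_chromagram(scale, window=1.0, num_frames=8)
active = [CHROMA_NAMES[frame.index(max(frame))] for frame in chromagram]
```

In this sketch the first and last notes of the scale (C<sub>4</sub> and C<sub>5</sub>) differ in tone height but fall into the same chroma band, so the first and last frames of the chromagram are identical.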
==Applications==
Because they identify pitches that differ by an octave, chroma features show a high degree of robustness to variations in timbre and closely correlate to the musical aspect of harmony. This is the reason why chroma features are a well-established tool for processing and analyzing music data.<ref name=Mueller15_FundamentalsMusicProcessig_SPRINGER> {{cite book | last = Müller | first = Meinard | title = Fundamentals of Music Processing | url = http://www.music-processing.de | publisher = Springer | year = 2015 | doi = 10.1007/978-3-319-21945-5 | isbn = 978-3-319-21944-8| s2cid = 8691186 }} </ref> For example, nearly every chord recognition procedure relies on some kind of chroma representation.<ref name=ChoB14_Chord_IEEE-TASLP> {{cite journal |last1=Cho |first1=Taemin |last2=Bello |first2=Juan Pablo |title=On the Relative Importance of Individual Components of Chord Recognition Systems |journal=IEEE/ACM Transactions on Audio, Speech, and Language Processing |volume=22 |issue=2 |year=2014 |pages=477–492|doi=10.1109/TASLP.2013.2295926 |bibcode=2014ITASL..22..477C |s2cid=16434636 }}</ref><ref name=MauchD10_SimultaneousEstimation_TASLP> {{cite journal |last1=Mauch |first1=Matthias |last2=Dixon |first2=Simon |title=Simultaneous estimation of chords and musical context from audio |journal=IEEE Transactions on Audio, Speech, and Language Processing |volume=18 |issue=6 |year=2010 |pages=1280–1289|doi=10.1109/TASL.2009.2032947 |bibcode=2010ITASL..18.1280M |citeseerx=10.1.1.414.7800 |s2cid=15866073 }} </ref><ref name=Fujishima99_ChordRecognition_ICMC> {{cite journal |last=Fujishima |first=Takuya |title=Realtime Chord Recognition of Musical Sound: a System Using Common Lisp Music |journal=Proceedings of the International Computer Music Conference |year=1999 |pages=464–467}} </ref><ref name=JiangGKM11_Chord_AES> {{cite journal |last1=Jiang |first1=Nanzhu |last2=Grosche |first2=Peter |last3=Konz |first3=Verena |last4=Müller |first4=Meinard |title=Analyzing Chroma Feature 
Types for Automated Chord Recognition |url = https://www.audiolabs-erlangen.de/content/05-fau/professor/00-mueller/03-publications/2011_JiangGroscheKonzMueller_ChordRecognitionEvaluation_AES42-Ilmenau.pdf |journal=Proceedings of the AES Conference on Semantic Audio |year=2011}} </ref> Also, chroma features have become the de facto standard for tasks such as music alignment and synchronization<ref name=HuDT03_audiomatching_WASPAA> {{cite journal |last1=Hu |first1=Ning |last2=Dannenberg |first2=Roger B. |last3=Tzanetakis |first3=George |title=Polyphonic Audio Matching and Alignment for Music Retrieval |journal=Proceedings of the IEEE Workshop on Applications of Signal Processing to Audio and Acoustics |year=2003}} </ref><ref name=EwertMG09_HighResAudioSync_ICASSP> {{cite book |last1= Ewert |first1=Sebastian |last2=Müller |first2=Meinard |last3=Grosche |first3=Peter |title=2009 IEEE International Conference on Acoustics, Speech and Signal Processing |chapter=High resolution audio synchronization using chroma onset features |chapter-url = https://www.audiolabs-erlangen.de/content/05-fau/professor/00-mueller/03-publications/2009_EwertMuellerGrosche_HighResAudioSync_ICASSP.pdf |year=2009 |pages=1869–1872|doi=10.1109/ICASSP.2009.4959972 |isbn=978-1-4244-2353-8 |s2cid=16952895 }} </ref> as well as audio structure analysis.<ref name=PaulusMK10_MusicStructure-STAR_ISMIR> {{cite journal |last1=Paulus |first1=Jouni |last2=Müller |first2=Meinard |last3=Klapuri |first3=Anssi |title=Audio-based Music Structure Analysis |url = https://ismir2010.ismir.net/proceedings/ismir2010-107.pdf |journal=Proceedings of the International Conference on Music Information Retrieval |year=2010 |pages=625–636}} </ref> Finally, chroma features have turned out to be a powerful mid-level feature representation in content-based audio retrieval such as cover song identification,<ref name=EllisP07_CoverSong_ICASSP> {{cite journal |last1=Ellis |first1=Daniel P.W. 
|last2=Poliner |first2=Graham |title=Identifying 'Cover Songs' with Chroma Features and Dynamic Programming Beat Tracking |journal=Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing |year=2007}}</ref><ref name=SerraGHS08_CoverSong_IEEE-TASLP> {{cite journal |last1=Serrà |first1=Joan |last2=Gómez |first2=Emilia |last3=Herrera |first3=Perfecto |last4=Serra |first4=Xavier |title=Chroma Binary Similarity and Local Alignment Applied to Cover Song Identification |journal=IEEE Transactions on Audio, Speech, and Language Processing |volume=16 |issue=6 |year=2008 |pages=1138–1151|doi=10.1109/TASL.2008.924595 |bibcode=2008ITASL..16.1138S |hdl=10230/16277 |s2cid=10078274 |hdl-access=free }} </ref> audio matching<ref name=MuellerKC05_ChromaFeatures_ISMIR> {{cite journal |last1=Müller |first1=Meinard |last2=Kurth |first2=Frank |last3=Clausen |first3=Michael |title=Audio Matching via Chroma-Based Statistical Features |url = https://ismir2005.ismir.net/proceedings/1019.pdf |journal=Proceedings of the International Conference on Music Information Retrieval |year=2005 |pages=288–295}} </ref><ref name=KurthM08_IndexBasedAudioMatching_TASLP> {{cite journal |last1=Kurth |first1=Frank |last2=Müller |first2=Meinard |title=Efficient Index-Based Audio Matching |journal=IEEE Transactions on Audio, Speech, and Language Processing |volume=16 |issue=2 |year=2008 |pages=382–395|doi=10.1109/TASL.2007.911552 |bibcode=2008ITASL..16..382K |s2cid=206601781 }} </ref><ref name="Mueller15_Chapter3FMP_SPRINGER2">{{cite book|url=http://www.music-processing.de|title=Music Synchronization. 
In Fundamentals of Music Processing, chapter 3, pages 115-166|last=Müller|first=Meinard|publisher=Springer|year=2015|isbn=978-3-319-21944-8}}</ref><ref name="KurthM08_IndexBasedAudioMatching_TASLP3">{{cite journal|last1=Kurth|first1=Frank|last2=Müller|first2=Meinard|year=2008|title=Efficient Index-Based Audio Matching|journal=IEEE Transactions on Audio, Speech, and Language Processing|volume=16|issue=2|pages=382–395|doi=10.1109/TASL.2007.911552|bibcode=2008ITASL..16..382K |s2cid=206601781 }}</ref> or audio hashing.<ref name="YY10">{{cite book |last1=Yu |first1=Yi |title=Proceedings of the international conference on Multimedia - MM '10 |last2=Crucianu |first2=Michel |last3=Oria |first3=Vincent |last4=Damiani |first4=Ernesto |chapter=Combining multi-probe histogram and order-statistics based LSH for scalable audio content retrieval |publisher=Proceedings of the 18th International Conference on Multimedia 2010 |pages=381–390 |ref=YY10|doi=10.1145/1873951.1874004 |year=2010 |isbn=9781605589336 |s2cid=9033525 }}</ref><ref name="YY09">{{cite book |last1=Yu |first1=Yi |title=Proceedings of the seventeen ACM international conference on Multimedia - MM '09 |last2=Crucianu |first2=Michel |last3=Oria |first3=Vincent |last4=Chen |first4=Lei |chapter=Local summarization and multi-level LSH for retrieving multi-variant audio tracks |publisher=Proceedings of the 17th International Conference on Multimedia 2009 |pages=341–350 |ref=YY09|doi=10.1145/1631272.1631320 |year=2009 |isbn=9781605586083 |s2cid=816862 }}</ref>
==Computation of audio chromagrams==
There are many ways of converting an audio recording into a chromagram. For example, the conversion may be performed either by using short-time Fourier transforms in combination with binning strategies<ref name=BartschW05_chroma_IEEEMULTIMEDIA> {{cite journal |last1=Bartsch |first1=Mark A. |last2=Wakefield |first2=Gregory H. |title=Audio thumbnailing of popular music using chroma-based representations |journal=IEEE Transactions on Multimedia |volume=7 |number=1 |year=2005 |pages=96–104|doi=10.1109/TMM.2004.840597 |bibcode=2005ITMm....7...96B |citeseerx=10.1.1.379.3293 |s2cid=12559221 }} </ref><ref name=Gomez06_PhD> {{cite journal |last=Gómez |first=Emilia |title=Tonal Description of Music Audio Signals |journal=PhD Thesis, UPF Barcelona, Spain |year=2006}} </ref><ref name=Mueller15_Chapter3FMP_SPRINGER> {{cite book | last = Müller | first = Meinard | title = Music Synchronization. In Fundamentals of Music Processing, chapter 3, pages 115-166 | url = http://www.music-processing.de | publisher = Springer | year = 2015 | isbn = 978-3-319-21944-8 }} </ref> or by employing suitable multirate filter banks.<ref name=MuellerKC05_ChromaFeatures_ISMIR/> Furthermore, the properties of chroma features can be significantly changed by introducing suitable pre- and post-processing steps that modify spectral, temporal, and dynamical aspects. 
This leads to a large number of chroma variants, which may behave quite differently in the context of a specific music analysis scenario.<ref name=MuellerE11_ChromaToolbox_ISMIR> {{cite journal |last1=Müller |first1=Meinard |last2=Ewert |first2=Sebastian |title=Chroma Toolbox: MATLAB Implementations For Extracting Variants of Chroma-Based Audio Features |url = https://www.audiolabs-erlangen.de/content/05-fau/professor/00-mueller/03-publications/2011_MuellerEwert_ChromaToolbox_ISMIR.pdf |journal=Proceedings of the International Society for Music Information Retrieval Conference |year=2011 |pages=215–220}}</ref>
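The Fourier-based approach with binning can be sketched as follows. This is a simplified illustration, not the method of any of the cited works: it assumes a tuning reference of A<sub>4</sub> = 440 Hz, a Hann window, a naive discrete Fourier transform restricted to a hypothetical 100–2000 Hz band, nearest-semitone binning of squared magnitudes, and a sum-to-one normalization as one possible post-processing step. All function names and parameter defaults are illustrative assumptions. List position 0 corresponds to chroma C (chroma number 1 in the enumeration above).

```python
import cmath
import math


def frequency_to_chroma(freq_hz, a4_hz=440.0):
    """Map a frequency to a list position 0..11 (0 = C).

    Assumes equal temperament with A4 = 440 Hz at MIDI pitch 69.
    """
    midi_pitch = round(69 + 12 * math.log2(freq_hz / a4_hz))
    return midi_pitch % 12


def frame_to_chroma(frame, sample_rate, fmin=100.0, fmax=2000.0):
    """Chroma vector of one frame: pool DFT bin energies by chroma."""
    n = len(frame)
    # Hann window to reduce spectral leakage.
    windowed = [x * (0.5 - 0.5 * math.cos(2 * math.pi * i / n))
                for i, x in enumerate(frame)]
    chroma = [0.0] * 12
    k_lo = math.ceil(fmin * n / sample_rate)
    k_hi = math.floor(fmax * n / sample_rate)
    for k in range(k_lo, k_hi + 1):
        # Naive DFT coefficient for bin k (center frequency k * sample_rate / n).
        coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(windowed))
        chroma[frequency_to_chroma(k * sample_rate / n)] += abs(coeff) ** 2
    # Post-processing (assumed here): normalize so the coefficients sum to one.
    total = sum(chroma)
    return [c / total for c in chroma] if total > 0 else chroma


def audio_chromagram(signal, sample_rate, frame_len=1024, hop=512):
    """Slide a window over the signal and compute one chroma vector per frame."""
    return [frame_to_chroma(signal[s:s + frame_len], sample_rate)
            for s in range(0, len(signal) - frame_len + 1, hop)]


# Synthetic test signal: a 440 Hz sine (the pitch A4) sampled at 8 kHz.
sample_rate = 8000
tone = [math.sin(2 * math.pi * 440.0 * t / sample_rate) for t in range(2000)]
chroma_frames = audio_chromagram(tone, sample_rate)
strongest = [frame.index(max(frame)) for frame in chroma_frames]
```

For the synthetic tone, the strongest band in every frame is list position 9, which corresponds to chroma A. A practical implementation would typically use a fast Fourier transform and handle tuning deviations and overtones, which this sketch ignores.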
==See also==
*Time-frequency analysis
*Time-frequency analysis for music signal
*Pitch (music)
*Musical theory
==References==
{{Reflist|30em}}
== External links ==
*[https://www.audiolabs-erlangen.de/resources/MIR/chromatoolbox Chroma Toolbox] Free MATLAB implementations of various types of pitch-based and chroma-based audio features
*[http://mtg.upf.edu/technologies/hpcp Harmonic Pitch Class Profile plugin]
Category:Music information retrieval
Category:Music technology
Category:Musicology
Category:Time–frequency analysis