Viseme

{{Short description|Any of several speech sounds that look the same, for example when lip reading}}
{{more footnotes|date=January 2023}}
{{IPA notice}}
[Image: Vowel lip shapes in a 1919 lip reading manual]

A '''viseme''' is any of several speech sounds that look the same, for example when lip reading.<ref>{{cite journal |last1=Fisher |first1=Cletus G. |title=Confusions Among Visually Perceived Consonants |journal=Journal of Speech and Hearing Research |date=1 December 1968 |volume=11 |issue=4 |pages=796–804 |doi=10.1044/jshr.1104.796 |pmid=5719234 |url=https://pubs.asha.org/doi/10.1044/jshr.1104.796|url-access=subscription }}</ref>

Visemes and phonemes do not share a one-to-one correspondence. Often several phonemes correspond to a single viseme because they look the same on the face when produced, such as {{IPA|/k, ɡ, ŋ/}}, {{IPA|/t, d, n, l/}}, and {{IPA|/p, b, m/}}. Thus words such as ''pet'', ''bell'', and ''men'' are difficult for lip-readers to distinguish, as all look alike. On one account, visemes offer (phonetic) information about place of articulation, while manner of articulation requires auditory input.<ref name="Summerfield92">{{cite journal |last1=Summerfield |first1=Quentin |title=Lipreading and audio-visual speech perception |journal= Philosophical Transactions of the Royal Society of London. Series B: Biological Sciences|date=29 January 1992 |volume=335 |issue=1273 |pages=71–78 |doi=10.1098/rstb.1992.0009 |url=https://royalsocietypublishing.org/doi/10.1098/rstb.1992.0009 |eissn=1471-2970 |issn=0962-8436 |pmid=1348140|url-access=subscription }}</ref>
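
The many-to-one relationship can be illustrated with a small lookup table. The following Python sketch is illustrative only: it uses just the three consonant groups named in the text, the class labels (<code>VELAR</code>, <code>ALVEOLAR</code>, <code>BILABIAL</code>) and the function name <code>to_visemes</code> are hypothetical, the broad transcriptions of the example words are assumed, and any phoneme outside the three groups (such as the vowel) is simply treated as its own visual class.

```python
# Consonant groups taken from the text; the labels are hypothetical names
# chosen for this sketch, not a standard viseme inventory.
VISEME_CLASSES = {
    "VELAR": ["k", "ɡ", "ŋ"],
    "ALVEOLAR": ["t", "d", "n", "l"],
    "BILABIAL": ["p", "b", "m"],
}

# Invert the table: each phoneme maps to exactly one viseme (many-to-one).
PHONEME_TO_VISEME = {
    phoneme: label
    for label, phonemes in VISEME_CLASSES.items()
    for phoneme in phonemes
}


def to_visemes(phonemes):
    """Map a phoneme sequence to viseme labels.

    Assumption: a phoneme not listed in VISEME_CLASSES (e.g. a vowel)
    is kept unchanged and treated as its own visual class.
    """
    return [PHONEME_TO_VISEME.get(p, p) for p in phonemes]


# Assumed broad transcriptions of the example words from the text.
WORDS = {
    "pet": ["p", "ɛ", "t"],
    "bell": ["b", "ɛ", "l"],
    "men": ["m", "ɛ", "n"],
}

viseme_sequences = {word: to_visemes(ph) for word, ph in WORDS.items()}

for word, sequence in viseme_sequences.items():
    print(word, sequence)
```

Under these assumptions all three words reduce to the same viseme sequence, which is why they cannot be told apart from the visual signal alone.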

However, in natural speech the visual "signature" of a given gesture may differ in timing and duration, and these differences cannot be captured by simply concatenating still images of each of the mouth patterns in sequence.<ref name="CalvertCampbell03">{{cite journal |last1=Calvert |first1=Gemma A. |last2=Campbell |first2=Ruth |title=Reading Speech from Still and Moving Faces: The Neural Substrates of Visible Speech |journal=Journal of Cognitive Neuroscience |date=1 January 2003 |volume=15 |issue=1 |pages=57–70 |doi=10.1162/089892903321107828 |pmid=12590843|s2cid=14153329 }}</ref> Conversely, some sounds that are hard to distinguish acoustically are clearly distinguished by the face. For example, in spoken English {{IPA|/l/}} and {{IPA|/r/}} can often sound quite similar (especially in clusters, such as ''grass'' vs. ''glass''), yet the visual information can disambiguate them. Some linguists have argued that speech is best understood as bimodal (aural and visual), and that comprehension can be compromised if one of these two domains is absent.<ref>{{cite journal |last1=McGurk |first1=Harry |last2=MacDonald |first2=John |author1-link=Harry McGurk |title=Hearing lips and seeing voices |journal=Nature |date=23 December 1976 |volume=264 |issue=5588 |pages=746–748 |doi=10.1038/264746a0 |pmid=1012311 |bibcode=1976Natur.264..746M |url=https://doi.org/10.1038/264746a0|url-access=subscription }}</ref>

Confusions between visemes can be humorous, as in the phrase "elephant juice", which when lip-read appears identical to "I love you".

Applications for the study of visemes include speech processing, speech recognition, and computer facial animation.

==See also==
* {{annotated link|Seme (semantics)|Seme}}
* {{annotated link|Chroneme}}

==References==
{{Reflist}}

==Further reading==
* {{cite journal |last1=Chen |first1=Tsuhan |last2=Rao |first2=R. R. |date=31 May 1998 |title=Audio-visual integration in multimodal communication |journal=Proceedings of the IEEE |volume=86 |issue=5 |publisher=IEEE |pages=837–852 |doi=10.1109/5.664274 |bibcode=1998IEEEP..86..837T |eissn=1558-2256 |issn=0018-9219 |url=http://scholarbank.nus.edu.sg/handle/10635/146400 }}
* {{cite journal |last1=Chen |first1=Tsuhan |date=31 January 2001 |title=Audiovisual speech processing |journal=IEEE Signal Processing Magazine |volume=18 |issue=1 |publisher=IEEE |pages=9–21 |doi=10.1109/79.911195 |bibcode=2001ISPM...18....9C |eissn=1558-0792 |issn=1053-5888}}
* {{cite conference |last1=Lucey |first1=Patrick |last2=Martin |first2=Terrence |last3=Sridharan |first3=Sridha |date=8–10 December 2004 |title=Confusability of Phonemes Grouped According to their Viseme Classes in Noisy Environments |url=http://www.assta.org/sst/2004/proceedings/papers/sst2004-377.pdf |conference=10th Australian International Conference on Speech Science & Technology |publication-place=Sydney |publisher=Macquarie University |pages=265–270 |url-status=dead |archive-url=https://web.archive.org/web/20170705110851/http://www.assta.org/sst/2004/proceedings/papers/sst2004-377.pdf |archive-format= |archive-date=5 July 2017}}

[[Category:Facial expressions]]
[[Category:Linguistic units]]
[[Category:Phonology]]