Subvocal recognition

{{Short description|Converting subvocalization to a digital output}}
[[Image:subvocal speech recognition.jpg|thumb|300px|Electrodes used in subvocal speech recognition research at NASA's Ames Research Lab]]

'''Subvocal recognition''' ('''SVR''') is the process of detecting [[subvocalization]] and converting the detected results to a digital output, aural or text-based.<ref>{{cite book|last1=Shirley|first1=John|title=New Taboos|publisher=PM Press|isbn=978-1-60486-871-5|url=https://books.google.com/books?id=phwsUNupNREC&pg=PT70|access-date=14 April 2017|language=en|date=2013-05-01}}</ref> A '''silent speech interface''' is a device that allows [[speech communication]] without using the sound made when people vocalize their [[speech sound]]s. A computer identifies the [[phoneme]]s that an individual pronounces from non-auditory sources of information about their [[speech production|speech movement]]s. These phonemes are then used to recreate the [[speech]] using [[speech synthesis]].<ref>Denby B, Schultz T, Honda K, Hueber T, Gilbert J.M., Brumberg J.S. (2010). Silent speech interfaces. Speech Communication 52: 270–287. {{doi|10.1016/j.specom.2009.08.002}}</ref>

==Input methods==
Silent speech interface systems have been created using [[ultrasound]] and optical camera input of [[tongue]] and [[lip]] movements.<ref name="Hueber"/> Electromagnetic devices are another technique for tracking tongue and lip movements.<ref>Wang, J., Samal, A., & Green, J. R. (2014). [https://digitalcommons.unl.edu/cgi/viewcontent.cgi?article=1231&context=cseconfwork Preliminary test of a real-time, interactive silent speech interface based on electromagnetic articulograph], the 5th ACL/ISCA Workshop on Speech and Language Processing for Assistive Technologies, Baltimore, MD, 38–45.</ref>

The detection of speech movements by [[electromyography]] of speech articulator muscles and the [[larynx]] is another technique.<ref>Jorgensen C, Dusan S. (2010). Speech interfaces based upon surface electromyography. Speech Communication, 52: 354–366. {{doi|10.1016/j.specom.2009.11.003}}</ref><ref>Schultz T, Wand M. (2010). Modeling Coarticulation in EMG-based Continuous Speech Recognition. Speech Communication, 52: 341–353. {{doi|10.1016/j.specom.2009.12.002}}</ref> Another source of information is the [[vocal tract]] resonance signals transmitted through [[bone conduction]], known as non-audible murmurs.<ref>Hirahara T, Otani M, Shimizu S, Toda T, Nakamura K, Nakajima Y, Shikano K. (2010). Silent-speech enhancement using body-conducted vocal-tract resonance signals. Speech Communication, 52: 301–313. {{doi|10.1016/j.specom.2009.12.001}}</ref>

Silent speech interfaces have also been created as [[brain–computer interface]]s using brain activity in the [[motor cortex]] obtained from [[Chronic electrode implants|intracortical microelectrode]]s.<ref>Brumberg J.S., Nieto-Castanon A, Kennedy P.R., Guenther F.H. (2010). Brain–computer interfaces for speech communication. Speech Communication 52: 367–379. {{doi|10.1016/j.specom.2010.01.001}}</ref>
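As a rough illustration of the recognition step shared by these input methods, the sketch below maps short windows of a multi-channel sensor signal to phoneme labels. It is a toy example and not the method of any system cited here. It assumes root-mean-square amplitude per channel as the feature and a nearest-centroid classifier as the recognizer. The two-channel recordings, the labels and the function names <code>rms_features</code>, <code>train_centroids</code> and <code>classify</code> are all hypothetical.

```python
import math


def rms_features(window):
    """Return the root-mean-square amplitude of each channel.

    `window` is a list of channels; each channel is a list of float samples.
    """
    return [math.sqrt(sum(s * s for s in ch) / len(ch)) for ch in window]


def train_centroids(labelled_windows):
    """Average the feature vectors of all windows sharing a phoneme label."""
    sums, counts = {}, {}
    for label, window in labelled_windows:
        feats = rms_features(window)
        acc = sums.setdefault(label, [0.0] * len(feats))
        for i, value in enumerate(feats):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [v / counts[label] for v in acc] for label, acc in sums.items()}


def classify(window, centroids):
    """Return the label whose centroid is closest to the window's features."""
    feats = rms_features(window)
    return min(centroids, key=lambda label: math.dist(feats, centroids[label]))


# Hypothetical two-channel training windows (invented values, not real data).
TRAINING = [
    ("a", [[0.9, -1.0, 1.1, -0.9], [0.1, -0.1, 0.1, -0.1]]),
    ("a", [[1.0, -1.1, 0.9, -1.0], [0.2, -0.1, 0.1, -0.2]]),
    ("m", [[0.1, -0.1, 0.2, -0.1], [1.0, -0.9, 1.1, -1.0]]),
    ("m", [[0.2, -0.1, 0.1, -0.2], [0.9, -1.0, 1.0, -1.1]]),
]

centroids = train_centroids(TRAINING)

# An unlabelled window with most of its energy on the second channel.
unknown = [[0.15, -0.1, 0.1, -0.15], [1.05, -0.95, 1.0, -1.0]]
print(classify(unknown, centroids))
```

In a full interface, the sequence of recognized labels would then be passed to a speech synthesizer or a text output stage. Published systems use far richer features and statistical or neural models than this sketch.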

==Uses==
Such devices are created as aids for people who cannot produce the [[phonation]] needed for audible speech, such as after [[laryngectomy|laryngectomies]].<ref name="Deng">Deng Y., Patel R., Heaton J. T., Colby G., Gilmore L. D., Cabrera J., Roy S. H., De Luca C.J., Meltzner G. S. (2009). [https://web.archive.org/web/20190930010329/https://pdfs.semanticscholar.org/e39c/7a828a7b5523fa0d4d550d760bd38c23e907.pdf Disordered speech recognition using acoustic and sEMG signals]. In INTERSPEECH-2009, 644–647.</ref> Another use is for communication when speech is masked by [[background noise]] or distorted by [[self-contained breathing apparatus]]. A further practical use is where a need exists for silent communication, such as when privacy is required in a public place, or when hands-free silent data transmission is needed during a [[military operation|military]] or security operation.<ref name="Hueber">Hueber T, Benaroya E-L, Chollet G, Denby B, Dreyfus G, Stone M. (2010). Development of a silent speech interface driven by ultrasound and optical images of the tongue and lips. Speech Communication, 52: 288–300. {{doi|10.1016/j.specom.2009.11.004}}</ref><ref name="Deng2">Deng Y., Colby G., Heaton J. T., and Meltzner G. S. (2012). Signal Processing Advances for the MUTE sEMG-Based Silent Speech Recognition System. Military Communication Conference, MILCOM 2012.</ref>

In 2002, the Japanese company [[NTT DoCoMo]] announced it had created a silent [[mobile phone]] using [[electromyography]] and imaging of lip movement. The company stated that "the spur to developing such a phone was ridding public places of noise," adding that, "the technology is also expected to help people who have permanently lost their voice."<ref>Fitzpatrick M. (2002). [https://www.newscientist.com/article/dn2122-lipreading-cellphone-silences-loudmouths.html Lip-reading cellphone silences loudmouths]. New Scientist.</ref> The feasibility of using silent speech interfaces for practical communication has since been shown.<ref>Wand M, Schultz T. (2011). [https://scitepress.org/papers/2011/31697/31697.pdf Session-independent EMG-based Speech Recognition]. Proceedings of the 4th International Conference on Bio-inspired Systems and Signal Processing.</ref>

In 2019, [[Arnav Kapur]], a researcher at the [[Massachusetts Institute of Technology]], conducted a study known as AlterEgo. Its implementation of the silent speech interface lets a user communicate with external devices through signals picked up from the speech muscles. By leveraging neural signals associated with speech and language, the AlterEgo system deciphers the user's intended words and translates them into text or commands without the need for audible speech.<ref>{{Cite web |title=Project Overview ‹ AlterEgo |url=https://www.media.mit.edu/projects/alterego/overview/ |access-date=2024-05-20 |website=MIT Media Lab}}</ref>

==Research and patents==
With a grant from the U.S. Army, research into [[synthetic telepathy]] using subvocalization is taking place at the University of California, Irvine under lead scientist Mike D'Zmura.<ref>{{cite web | url=https://www.nbcnews.com/id/wbna27162401 | title=Army developing 'synthetic telepathy' | website=[[NBC News]] | date=13 October 2008 }}</ref>

[[NASA]]'s [[Ames Research Laboratory]] in [[Mountain View, California|Mountain View]], California, is conducting subvocalization research under the supervision of Charles Jorgensen.{{citation needed|date=April 2015}}

The Brain Computer Interface R&D program at [[Wadsworth Center]] under the [[New York State Department of Health]] has confirmed the ability to decipher consonants and vowels from imagined speech, which allows for brain-based communication using imagined speech.<ref>{{cite journal |doi=10.1088/1741-2560/8/4/046028 |pmid=21750369 |pmc=3772685 |title=Decoding vowels and consonants in spoken and imagined words using electrocorticographic signals in humans |journal=[[Journal of Neural Engineering]] |volume=8 |issue=4 |article-number=046028 |year=2011 |last1=Pei |first1=Xiaomei |last2=Barbour |first2=Dennis L |last3=Leuthardt |first3=Eric C |last4=Schalk |first4=Gerwin |bibcode=2011JNEng...8d6028P |author-link3=Eric Leuthardt}}</ref> That work, however, relied on electrocorticographic recordings of brain activity rather than subvocalization techniques.

US patents on silent communication technologies include US Patent 6587729 "Apparatus for audibly communicating speech using the radio frequency hearing effect",<ref>{{US patent|6587729|Apparatus for audibly communicating speech using the radio frequency hearing effect}}</ref> US Patent 5159703 "Silent subliminal presentation system",<ref>{{US patent|5159703|Silent subliminal presentation system}}</ref> US Patent 6011991 "Communication system and method including brain wave analysis and/or use of brain activity",<ref>{{US patent|6011991|Communication system and method including brain wave analysis and/or use of brain activity}}</ref> and US Patent 3951134 "Apparatus and method for remotely monitoring and altering brain waves".<ref>{{US patent|3951134|Apparatus and method for remotely monitoring and altering brain waves}}</ref> The latter two rely on brain wave analysis.

==In fiction==
* The decoding of silent speech using a computer played an important role in [[Arthur C. Clarke]]'s story and [[Stanley Kubrick]]'s associated film ''[[2001: A Space Odyssey (film)|2001: A Space Odyssey]]''. In this, [[HAL 9000]], a computer controlling the spaceship [[Discovery One]], bound for Jupiter, uses [[lip reading]] of their conversations to discover a plot by the mission astronauts [[David Bowman (Space Odyssey)|Dave Bowman]] and [[Frank Poole]] to deactivate it.<ref>Clarke, Arthur C. (1972). The Lost Worlds of 2001. London: Sidgwick and Jackson. {{ISBN|0-283-97903-8}}.</ref>
* In [[Orson Scott Card]]'s series (including ''[[Ender's Game]]''), the artificial intelligence can be spoken to while the protagonist wears a movement sensor in his jaw, enabling him to converse with the AI without making noise. He also wears an ear implant.
* In ''[[Speaker for the Dead]]'' and subsequent novels, author [[Orson Scott Card]] described an ear implant, called a "jewel", that allows subvocal communication with computer systems.
* Author [[Robert J. Sawyer]] made use of subvocal recognition to allow silent commands to the cybernetic 'companion implants' used by the advanced [[Neanderthal]] characters in his ''[[Neanderthal Parallax]]'' trilogy of science fiction novels.
* In ''[[Earth (Brin novel)|Earth]]'', [[David Brin]] depicts this technology and its uses as ordinary gear in the near future.
* In ''[[Down and Out in the Magic Kingdom]]'', [[Cory Doctorow]] has cellphone technology become silent through a cochlear implant and miking the throat to pick up subvocalization.
* [[William Gibson]]'s [[Sprawl Trilogy|''Sprawl'' Trilogy]] frequently uses sub-vocalization systems in various devices.
* In [[Kage Baker]]'s [[Kage Baker#Novels set in the Company universe|''Company'' novels]], the immortal [[cyborg]]s communicate subvocally.
* In the [[Hugo Award]]-winning ''[[Hyperion Cantos]]'' by [[Dan Simmons]], the characters often use subvocalization to communicate.
* In the [[Culture series|''Culture'' novels]] by [[Iain M. Banks]], more highly advanced species often communicate subvocally through their technology.
* In ''[[Deus Ex: Human Revolution]]'' (2011), the protagonist is [[Human enhancement|augmented]] with a subvocalization implant for sending covert communications (and a corresponding [[cochlear implant]] for receiving covert communications).
* In the tabletop RPG and video game series ''[[Shadowrun]]'', player characters can communicate via subvocal microphones in some instances.
* In ''[[Paranoia (role-playing game)|Paranoia]]'', all citizens can speak to the computer via their "cerebral cortech" implants.
* Alastair Reynolds's ''Revelation Space'' trilogy frequently uses sub-vocalization systems in various devices.

==See also==
* [[Automated Lip Reading]]
* [[Applications of artificial intelligence]]
* [[Electrolarynx]]
* [[List of emerging technologies]]
* [[Outline of artificial intelligence]]
* [[Speech recognition]]
* [[Silent speech interface]]
* [[Throat microphone]]
* [[Synthetic telepathy]]

==References==
{{reflist}}

==Further reading==
* {{cite news | first=John| last=Bluck| page=1 | title=NASA Press Release | date=March 17, 2004 | publisher=NASA| url=http://www.nasa.gov/centers/ames/news/releases/2004/04_18AR.html | archive-url=https://www3.nasa.gov/home/hqnews/2004/mar/HQ_04093_subvocal_speech.html | archive-date=January 1, 2024 }}
* {{cite news | first=David| last=Armstrong| page=1 | title=The Silent Speaker | date=April 10, 2006 | work=Forbes| url=https://www.forbes.com/free_forbes/2006/0410/084.html | archive-url=https://web.archive.org/web/20060414150054/http://www.forbes.com/free_forbes/2006/0410/084.html | archive-date=April 14, 2006 }}
* {{cite news | first=Tom| last=Simonite| page=1 | title=Thinking of words can guide your wheelchair | date=September 6, 2007 | publisher=New Scientist | url=https://www.newscientist.com/article/dn12602-thinking-of-words-can-guide-your-wheelchair-.html }}

== External links ==
* [http://www.nasa.gov/centers/ames NASA Ames Center]

[[Category:Computational linguistics]]
[[Category:Fictional technology]]
[[Category:Human communication]]
[[Category:Memory]]
[[Category:Reading (process)]]
[[Category:Speech recognition]]
[[Category:Vocal skills]]