Speech recognition in Linux

There is currently no open-source equivalent of proprietary speech recognition software (e.g. Nuance's Dragon NaturallySpeaking or Windows Speech Recognition) for Linux. However, there are several incomplete open-source projects and solutions that can provide some elements of speech recognition on Linux. It is also possible to use Windows speech recognition software under Linux.

Native Linux speech recognition

History

In the late 1990s, a Linux version of ViaVoice (created by IBM) was made available to users for no charge. However, the free SDK was later removed by the developer in 2002.

Current development status

Recently, there has been a push to develop a high-quality native Linux speech recognition engine. As a result, numerous projects dedicated to creating Linux speech recognition solutions (equivalent to current Windows solutions) have been established. One major hurdle is compiling a speech corpus from which acoustic models can be produced. In response, VoxForge was set up; it aims to collect transcribed speech, released under the GPL, for use with free and open-source speech recognition engines.

Ubuntu is currently gathering ideas for implementing speech recognition [https://wiki.ubuntu.com/SpeechRecognition SpeechRecognition - Ubuntu Wiki].

Solutions

The following is a list of current projects dedicated to implementing speech recognition in Linux, as well as major (though mostly incomplete) native solutions that are available as of March 2008:

*VoxForge
*Julius
*CMU Sphinx
*HTK (copyrighted by Microsoft, though source code is available for personal use)
* [http://xvoice.sourceforge.net/ Xvoice] (requires ViaVoice to function)
* [http://freespeech.sourceforge.net/ Open Mind Speech]
* [http://live.gnome.org/GnomeVoiceControl GnomeVoiceControl]
* [http://simon-listens.org/ Simon] (This project aims at helping blind people; requires Julius)

It is possible, though complicated, for advanced developers to create Linux speech recognition software by using existing packages derived from open-source projects.

Voice control and keyboard shortcuts

Speech recognition usually refers to software that attempts to distinguish thousands of words in a human language. Voice control may refer to software used for sending operational commands to a computer or appliance. Voice control typically requires a much smaller vocabulary and thus is much easier to implement.
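To illustrate why a small vocabulary is easier to handle, the following is a minimal Python sketch that matches whatever text a recognizer produced against a short, fixed command list. It assumes some recognizer has already turned audio into text; the command list, the function name and the similarity cutoff are hypothetical and chosen only for illustration.

```python
import difflib

# Hypothetical command vocabulary; a real voice-control tool would define its own.
COMMANDS = ["open browser", "close window", "next tab", "previous tab", "scroll down"]


def match_command(heard, vocabulary=COMMANDS, cutoff=0.6):
    """Return the vocabulary entry closest to the recognized text, or None.

    `heard` is the (possibly imperfect) text produced by a recognizer.
    `cutoff` is an assumed similarity threshold between 0 and 1.
    """
    # Normalize case and whitespace before comparing.
    normalized = " ".join(heard.lower().split())
    matches = difflib.get_close_matches(normalized, vocabulary, n=1, cutoff=cutoff)
    return matches[0] if matches else None
```

With only a handful of candidates, a slightly misrecognized phrase such as "close windo" can still be resolved to "close window", while free-form dictation does not resemble any entry and is rejected.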

Simple software combined with keyboard shortcuts has the earliest potential for practical, accurate voice control in Linux. Keyboard shortcuts can be used to control many Linux programs. GNOME and KDE have extensive and easily reconfigurable keyboard shortcuts for most tasks. Mozilla Firefox has an add-on called [https://addons.mozilla.org/en-US/firefox/addon/879 MouselessBrowsing], which allows links and input boxes to be quickly selected from the keyboard.
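The idea of driving existing keyboard shortcuts by voice can be sketched in Python as a lookup table from spoken phrases to key combinations. This sketch is not taken from any of the projects listed above. It assumes the xdotool utility is installed and is used to simulate the key press; the phrases and the shortcuts they map to are hypothetical, since actual bindings depend on the desktop configuration.

```python
import shutil
import subprocess

# Hypothetical phrase-to-shortcut table, written in xdotool's key syntax.
# Real bindings depend on the desktop environment and the application.
SHORTCUTS = {
    "new tab": "ctrl+t",
    "close tab": "ctrl+w",
    "switch window": "alt+Tab",
    "show desktop": "super+d",
}


def build_key_command(phrase):
    """Return the xdotool argument list for a known phrase, or None."""
    keys = SHORTCUTS.get(" ".join(phrase.lower().split()))
    if keys is None:
        return None
    return ["xdotool", "key", keys]


def send_shortcut(phrase):
    """Simulate the shortcut for `phrase`; return True only on success."""
    command = build_key_command(phrase)
    if command is None or shutil.which("xdotool") is None:
        return False  # unknown phrase, or xdotool is not installed
    return subprocess.run(command).returncode == 0
```

Keeping the table separate from the code that sends the keys means the vocabulary can be changed without touching the rest of the program.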

Running Windows speech recognition software with Linux

Using a compatibility layer

It is possible to use programs such as Dragon NaturallySpeaking 9 in Linux by utilizing Wine, though some problems will arise [http://appdb.winehq.org/objectManager.php?sClass=version&iId=5402 Dragon NaturallySpeaking 9 - Wine Application Database].

Using virtualized Windows

Using no-cost virtualization software, it is possible to run Windows and NaturallySpeaking under Linux [http://scratchpad.wikia.com/wiki/Speech_recognition_on_free_operating_systems#Running_Windows.2FDNS_in_a_virtual_machine Running Windows/DNS in a virtual machine - Lumeniki]. VMware Server and VirtualBox support copy and paste to and from a virtual machine, making dictated text easy to transfer between the guest and the Linux host. Note that problems (such as sound input errors [http://scratchpad.wikia.com/wiki/Speech_recognition_on_free_operating_systems#Sound_input_problems Sound input problems with DNS in a virtual machine - Lumeniki]) may occur.

WinDictator

[http://foss.eepatents.com/trac/WinDictator/wiki WinDictator] is able to send keystrokes from Windows dictation software (running on a real or virtual Windows machine) to Linux, but installing it may require advanced skills [http://scratchpad.wikia.com/wiki/Speech_recognition_on_free_operating_systems#WinDictator WinDictator - Lumeniki]. This would offer more functionality than using plain virtualization (or a compatibility layer) if it allows keyboard shortcuts.

Combining native software with Windows software

If native Linux voice control software could be used simultaneously with Windows speech recognition software (running under Linux), much of the functionality offered by Windows speech recognition software would be available to Linux.
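One way such a combination might decide which component handles an utterance is sketched below in Python. The sketch assumes both systems deliver recognized text and that commands are introduced by a spoken keyword. The keyword "computer", the function name and the command set are hypothetical, and this is not how any existing tool is known to work.

```python
def route_utterance(text, commands, prefix="computer"):
    """Classify recognized text as a voice command or as dictation.

    Returns ("command", name) when the text starts with the assumed
    keyword `prefix` followed by a known command; otherwise returns
    ("dictation", text) so it can be passed on as ordinary typed text.
    """
    words = text.strip().split()
    if words and words[0].lower().rstrip(",") == prefix:
        candidate = " ".join(word.lower() for word in words[1:])
        if candidate in commands:
            return ("command", candidate)
    return ("dictation", text.strip())
```

For example, with `commands = {"close window"}`, the utterance "Computer, close window" is routed to the voice control side, while "computer science is fun" falls through to dictation because what follows the keyword is not a known command.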

See also

*Speech recognition
*List of speech recognition software

External links

* [http://linux-sound.org/speech.html Speech Synthesis & Analysis Software]
* [http://raphaelnunes.wordpress.com/2007/06/16/gnome-voice-control-demonstration/ Gnome Voice Control (an incomplete speech recognition solution for GNOME) - Demonstration]
* [http://tldp.org/HOWTO/Speech-Recognition-HOWTO/software.html Speech Recognition Software - list of speech recognition projects and solutions in Linux]


Wikimedia Foundation. 2010.
