CereProc

Developer(s): CereProc Ltd. (UK)
Initial release: 2006
Written in: C/Python
Operating system: Cross-platform
Available in: English / German / French / Castilian Spanish / Italian / Catalan / Romanian
Type: Text-to-speech
License: Commercial
Website: www.cereproc.com

CereProc is a speech synthesis company based in Edinburgh, Scotland, founded in 2005. The company specialises in creating natural and expressive-sounding text to speech voices, synthesis voices with regional accents, and in voice cloning.

Voice building technology

CereProc creates voices using two different voice-building technologies: unit selection synthesis and HMM-based speech synthesis (HTS).

CereProc's unit selection voices are built from large databases of recorded speech. During database creation, each recorded utterance is segmented into some or all of the following: individual phones, syllables, morphemes, words, phrases, and sentences. The segmentation is performed by a specially modified speech recognizer.[1] An index of the units in the speech database is then built from the segmentation and from acoustic and positional parameters such as fundamental frequency (pitch), duration, position in the syllable, and neighbouring phones. At runtime, the target utterance is assembled by finding the best chain of candidate units in the database (unit selection). Unit selection generally provides the greatest naturalness because it applies digital signal processing (DSP) to the recorded speech only at concatenation points, and DSP often makes recorded speech sound less natural.
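The search for the "best chain of candidate units" is commonly framed as a shortest-path problem over a lattice of candidates. The following is a minimal sketch of that idea, not CereProc's implementation: the `Unit` fields, the two cost functions, and their weights are all illustrative assumptions. Each candidate pays a target cost (how far it is from the requested pitch and duration) and a join cost (how badly it concatenates with its predecessor), and dynamic programming finds the cheapest chain.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Unit:
    """A hypothetical database entry: one recorded phone with two acoustic features."""
    phone: str
    pitch_hz: float
    duration_ms: float


def target_cost(unit, want_pitch, want_dur):
    # Illustrative cost: distance between the unit and the requested features.
    return abs(unit.pitch_hz - want_pitch) / 100.0 + abs(unit.duration_ms - want_dur) / 100.0


def join_cost(left, right):
    # Illustrative cost: pitch mismatch at the concatenation point.
    return abs(left.pitch_hz - right.pitch_hz) / 100.0


def select_units(targets, database):
    """Return (best chain of units, total cost) for a list of (phone, pitch, duration) targets."""
    # One list of candidate units per target position.
    lattice = [[u for u in database if u.phone == phone] for phone, _, _ in targets]
    if any(not candidates for candidates in lattice):
        raise ValueError("no candidate unit for at least one target phone")

    # best[i][j] = (cheapest cost of any chain ending in lattice[i][j], backpointer)
    best = [[(target_cost(u, targets[0][1], targets[0][2]), None) for u in lattice[0]]]
    for i in range(1, len(targets)):
        row = []
        for u in lattice[i]:
            cost, back = min(
                (best[i - 1][k][0] + join_cost(prev, u), k)
                for k, prev in enumerate(lattice[i - 1])
            )
            row.append((cost + target_cost(u, targets[i][1], targets[i][2]), back))
        best.append(row)

    # Trace back from the cheapest final candidate.
    j = min(range(len(best[-1])), key=lambda k: best[-1][k][0])
    total = best[-1][j][0]
    chain = []
    for i in range(len(targets) - 1, -1, -1):
        chain.append(lattice[i][j])
        j = best[i][j][1]
    chain.reverse()
    return chain, total


# Toy database: two recordings of each phone at different pitches.
database = [
    Unit("h", 120.0, 80.0), Unit("h", 200.0, 80.0),
    Unit("a", 125.0, 100.0), Unit("a", 210.0, 100.0),
]
chain, total = select_units([("h", 120.0, 80.0), ("a", 120.0, 100.0)], database)
print([(u.phone, u.pitch_hz) for u in chain], round(total, 3))
```

In a production system the candidate lists would come from the index described above, and the costs would cover many more features; the dynamic-programming structure is the part this sketch is meant to show.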

CereProc's HTS voices produce speech synthesis based on hidden Markov models (HMMs). In this system, the frequency spectrum (vocal tract), fundamental frequency (vocal source), and duration (prosody) of speech are modelled simultaneously by HMMs. Speech waveforms are generated from the HMMs themselves based on the maximum likelihood criterion.[2] Critically, HTS voices can be built from significantly less recorded speech than unit selection voices and have a much smaller footprint when installed.
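As a rough illustration of the maximum likelihood criterion, the standard HTS formulation (which may differ in detail from CereProc's own system) generates a static parameter sequence c for a given state sequence q. The symbols here are notational assumptions: W is the matrix that appends dynamic (delta) features to c, and μ_q and Σ_q are the stacked state means and covariances.

```latex
\hat{c} = \arg\max_{c} \; \mathcal{N}\!\left(Wc \,;\, \mu_q, \Sigma_q\right)
\quad\Longrightarrow\quad
W^{\top} \Sigma_q^{-1} W \, \hat{c} = W^{\top} \Sigma_q^{-1} \mu_q
```

Solving this linear system yields a smooth parameter trajectory, which is then passed to a vocoder to produce the waveform.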

Voices and languages

CereProc has fifteen generally available voices that speak five languages in a number of different regional accents:

American English: Katherine, Adam
British English: Sarah, William
Scottish English: Heather, Kirsty, Stuart
West Midlands English: Sue
French: Suzanne
Castilian Spanish: Sara
Italian: Laura
German: Gudrun, Alex
Austrian German: Leopold
French-accented English: Nicole

In addition, the company has developed a number of celebrity voices that are not generally available to the public. These include George W. Bush, Barack Obama and Arnold Schwarzenegger.[3]

Voice cloning

In 2009, film critic Roger Ebert employed CereProc to create a synthetic version of his voice. Ebert had lost the power of speech following surgery to treat thyroid cancer. CereProc mined tapes and DVD commentaries featuring Ebert's voice to create a text-to-speech voice that sounded more like his own.[4] Ebert used the voice in his March 2, 2010 appearance on The Oprah Winfrey Show.

CereProc voice cloning technology is currently being used in the UK by people with motor neurone disease (MND) to create synthesis voices before they lose the power of speech. This process was featured in a BBC Radio 4 documentary, Giving the Critic Back His Voice, broadcast in August 2010.[5]

System compatibility

CereProc voices can be deployed on different operating systems and on different types of devices. CereProc desktop voices are compatible with Microsoft Windows and Apple Mac OS X. They install as system voices and can be used by other speech-enabled applications. CereProc's client/server system, cServer, aimed principally at the corporate IVR market, runs on Windows and Linux. CereProc Mobile voices can be deployed on Android and Apple iOS.

References

  1. ^ Alan W. Black, Perfect synthesis for all of the people all of the time. IEEE TTS Workshop 2002.
  2. ^ The HMM-based Speech Synthesis System, http://hts.sp.nitech.ac.jp/
  3. ^ CereProc Voices
  4. ^ "Roger Ebert: The Essential Man". Esquire, February 16, 2010. Accessed September 21, 2011.
  5. ^ "Giving the Critic Back His Voice". BBC Radio Scotland Programmes. Retrieved October 26, 2011.

Wikimedia Foundation. 2010.
