The Crystal TTVS engine is a real-time audio-visual multilingual (Mandarin, Cantonese and English) speech synthesizer with a 3D expressive avatar.
The avatar model is parameterized according to the MPEG-4 facial animation standard, which offers a compact set of facial animation parameters (FAPs) and feature points (FPs). These are used to realize 20 visemes and 7 facial expressions. A set of TTVS engines (Mandarin, Cantonese and English) converts the input phoneme sequence, together with its timing information, into visemes and then into a FAP sequence. The 3D avatar animation is then rendered from the FAP sequence by the Xface open-source toolkit.
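The phoneme-to-viseme-to-FAP conversion described above can be sketched roughly as follows. This is a minimal illustration, not the engine's actual code: the types `TimedPhoneme`, `TimedViseme` and `FapFrame`, both function names, the lookup table, the single `jawOpen` value per frame, and the frame rate are all assumptions made for the example. A real engine would drive many FAPs per frame and smooth the transitions between visemes.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical types for illustration only; not the engine's data structures.
struct TimedPhoneme {
    std::string phoneme;  // phoneme label
    double startSec;      // start time in seconds
    double endSec;        // end time in seconds
};

struct TimedViseme {
    int visemeId;         // 0 stands for the neutral viseme in this sketch
    double startSec;
    double endSec;
};

// One animation frame carrying a single illustrative FAP-like value.
struct FapFrame {
    double timeSec;
    double jawOpen;
};

// Step 1: map each timed phoneme to a viseme through a lookup table.
// Phonemes missing from the table fall back to the neutral viseme (0).
std::vector<TimedViseme> phonemesToVisemes(
        const std::vector<TimedPhoneme>& phonemes,
        const std::map<std::string, int>& table) {
    std::vector<TimedViseme> out;
    out.reserve(phonemes.size());
    for (const auto& p : phonemes) {
        const auto it = table.find(p.phoneme);
        const int id = (it != table.end()) ? it->second : 0;
        out.push_back({id, p.startSec, p.endSec});
    }
    return out;
}

// Step 2: sample the viseme track at a fixed frame rate and emit, for each
// frame, the target value of whichever viseme is active at that time.
// This is a step function; no interpolation or coarticulation is applied.
std::vector<FapFrame> visemesToFapFrames(
        const std::vector<TimedViseme>& visemes,
        const std::vector<double>& jawTargetByViseme,
        double fps) {
    assert(fps > 0.0);
    std::vector<FapFrame> frames;
    if (visemes.empty()) return frames;

    const double endTime = visemes.back().endSec;
    std::size_t cur = 0;
    for (std::size_t i = 0;; ++i) {
        const double t = static_cast<double>(i) / fps;
        if (t >= endTime) break;
        // Advance to the viseme whose interval contains time t.
        while (cur + 1 < visemes.size() && t >= visemes[cur].endSec) ++cur;
        const int id = visemes[cur].visemeId;
        const bool known =
            id >= 0 && static_cast<std::size_t>(id) < jawTargetByViseme.size();
        frames.push_back({t, known ? jawTargetByViseme[id] : 0.0});
    }
    return frames;
}
```

Calling `phonemesToVisemes` and then `visemesToFapFrames` yields a list of frames that a renderer could consume one by one. Keeping the two steps separate mirrors the description above, where the viseme sequence is an intermediate result between phonemes and FAPs.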
Please cite the following papers when referring to this project:
Zhiyong WU, Shen ZHANG, Lianhong CAI, Helen MENG, "Real-time Synthesis of Chinese Visual Speech and Facial Expressions using MPEG-4 FAP Features in a Three-dimensional Avatar," in International Conference on Spoken Language Processing (Interspeech2006, ICSLP), pp. 1802-1805. Pittsburgh, USA, 17-21 September 2006.
Shen ZHANG, Zhiyong WU, Helen M. MENG, Lianhong CAI, "Facial Expression Synthesis Using PAD Emotional Parameters for a Chinese Expressive Avatar," in International Conference on Affective Computing and Intelligent Interaction (ACII2007), pp. 24-35. Lisbon, Portugal, 12-14 September 2007.
Shen ZHANG, Zhiyong WU, Helen M. MENG, Lianhong CAI, "Head Movement Synthesis based on Semantic and Prosodic Features for a Chinese Expressive Avatar," in International Conference on Acoustics, Speech and Signal Processing (ICASSP2007), pp. 837-840. Hawaii, USA, 15-20 April 2007.
The engine supports TTVS in (but is not limited to) the following languages: Mandarin Chinese, Cantonese, and English. To add another language, implement your own TTVS engine by overriding CSTHead::FapTTVS (/TTVS/FapTTVS.h and /TTVS/FapTTVS.cpp), in the same way as CSTHead::FapMandarin, CSTHead::FapCantonese, and CSTHead::FapEnglish do.
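The general shape of such an extension might look like the sketch below. It is only an illustration of the override pattern: `FapTTVSBase` is a hypothetical stand-in for CSTHead::FapTTVS, and the virtual function `phonemeToViseme`, the `convert` method, the class `FapMyLanguage`, and the phoneme table are all invented for the example. The actual virtual functions to override are the ones declared in /TTVS/FapTTVS.h.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

namespace sketch {

// The avatar realizes 20 visemes, so valid indices here are 0..19.
const int kNumVisemes = 20;

// Hypothetical stand-in for CSTHead::FapTTVS. The real base class may
// expose different virtual functions; check /TTVS/FapTTVS.h.
class FapTTVSBase {
public:
    virtual ~FapTTVSBase() = default;

    // Language-specific hook: map one phoneme label to a viseme index.
    virtual int phonemeToViseme(const std::string& phoneme) const = 0;

    // Shared, language-independent logic built on top of the hook.
    std::vector<int> convert(const std::vector<std::string>& phonemes) const {
        std::vector<int> visemes;
        visemes.reserve(phonemes.size());
        for (const auto& p : phonemes) {
            const int id = phonemeToViseme(p);
            assert(id >= 0 && id < kNumVisemes);
            visemes.push_back(id);
        }
        return visemes;
    }
};

// Hypothetical engine for a new language. Only the language-specific
// mapping is supplied; everything else is inherited from the base class.
class FapMyLanguage : public FapTTVSBase {
public:
    int phonemeToViseme(const std::string& phoneme) const override {
        // Made-up table for illustration; 0 is the neutral viseme.
        static const std::map<std::string, int> table = {
            {"a", 1}, {"o", 2}, {"m", 3}};
        const auto it = table.find(phoneme);
        return it != table.end() ? it->second : 0;
    }
};

}  // namespace sketch
```

With this layout, code that holds a reference to the base class can call `convert` without knowing which language engine is behind it, which is presumably why the existing Mandarin, Cantonese and English engines share a common base.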
It is also possible to run the TTVS engine on different platforms, as the following figures illustrate.
Six basic expressions of the 3D avatar:
Head movement on the 3D avatar:
Compile TinyXML
Compile Xface
Compile TTVS