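      On January 17, 2014, at the invitation of Professor Lei Xie, Professor Kai Yu of the Department of Computer Science at Shanghai Jiao Tong University paid an academic visit to the School of Computer Science at Northwestern Polytechnical University and the Shaanxi Provincial Key Laboratory of Speech and Image Processing. At 10:30 a.m., Professor Yu gave a talk titled “Intelligent Speech Technology: Concept and State-of-the-art” in Lecture Hall 105 of the School of Computer Science. The talk was chaired by Professor Lei Xie of the School of Computer Science and attended by more than 40 faculty members and students. Professor Yu graduated from the University of Cambridge, UK, has worked in speech research for many years, has taken part in several major international research projects on spoken interaction, and holds a specially appointed professorship at Shanghai Jiao Tong University. In the talk, he surveyed intelligent speech technology, including speech recognition, speech synthesis, dialogue systems, and audio analysis. Afterwards, Professor Yu held a discussion with the faculty and students; the students asked many questions and benefited greatly.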


      Speaker biography: Kai Yu is a research professor in the Computer Science and Technology Department of Shanghai Jiao Tong University. He obtained a B.Eng. in automation in 1999 and an M.Sc. in pattern recognition and intelligent systems in 2002, both from Tsinghua University, China. He then joined the Machine Intelligence Lab at the Cambridge University Engineering Department and received his Ph.D. in large vocabulary continuous speech recognition in 2006. He worked as a senior research associate at Cambridge University until he joined Shanghai Jiao Tong University (SJTU). His research interests cover a wide range of areas in speech and language processing, including speech recognition, statistical speech synthesis, spoken language understanding and end-to-end dialogue systems. He has been actively involved in the design and implementation of large-scale speech recognition systems and cognitive end-to-end spoken dialogue systems, including the Cambridge LVCSR systems, which defined the state of the art in speech recognition, and next-generation end-to-end spoken dialogue systems. He is a senior member of the IEEE and a member of the ISCA and the IET. He has served as an area chair for speech recognition/processing at INTERSPEECH 2009, EUSIPCO 2011 and INTERSPEECH 2014. He was selected for the “1000 Overseas Talent Plan (Young Talent)” by the Chinese central government in 2012. He was also selected for the Programme for Professor of Special Appointment (Eastern Scholar) at Shanghai Institutions of Higher Learning.

      Abstract: Speech is one of the most efficient and informative means of human communication. The recent boom in mobile internet has raised great interest in speech-based human-machine interfaces in both academia and industry. This talk will give an introduction to speech technology at SJTU: speech recognition, speech synthesis, dialogue systems and rich audio analysis. Basic theories and algorithms will be reviewed informally, and example applications will be presented together with their underlying theories. The implementation of state-of-the-art large-scale systems for speech recognition, speech synthesis and spoken dialogue will also be introduced to emphasise the engineering difficulty of intelligent speech technology. Hopefully, the audience will come away with a big picture of the complexity, the relevant theory and techniques, and the usefulness of the applications. After the general overview, the discriminative mapping transform, an interesting framework for discriminative unsupervised adaptation and adaptive training, will be introduced. It is a novel way to combine the advantages of robust maximum likelihood estimation and powerful discriminative parameter estimation.




