
Comprehensive List of Researchers "Information Knowledge"

Department of Media Science

TAKEDA, Kazuya
Speech and Image Science Group
Dr. of Engineering
Research Field
Speech / Text and human behavior signal processing

Current Research

Informatics of Sound
(1) Spatial Audio Signal Processing
Spatial impression is one of the most important kinds of information carried by audio signals, and both transmitting and generating 3D sound fields have long been fundamental problems in audio signal processing. The goal of our group's Selective Listening Point (SLP) audio system project is to realize an audio system that can generate the sound field at any given location using sounds captured by distant microphones. The current approach, which combines blind separation of acoustic signals under real conditions with control of head-related transfer functions (HRTFs), can generate natural directional sounds in an anechoic space.
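The HRTF stage of such a system can be illustrated with a minimal sketch. It assumes that each separated source is rendered for the two ears by convolving it with a pair of head-related impulse responses (HRIRs, the time-domain form of HRTFs). The function names and the toy impulse responses below are hypothetical and are not taken from the SLP project; a real system would use measured HRIRs and would run blind source separation first.

```python
def convolve(signal, kernel):
    """Direct-form linear convolution of two sequences of floats."""
    out = [0.0] * (len(signal) + len(kernel) - 1)
    for i, s in enumerate(signal):
        for j, k in enumerate(kernel):
            out[i + j] += s * k
    return out


def render_binaural(source, hrir_left, hrir_right):
    """Filter one separated source with a left/right HRIR pair."""
    return convolve(source, hrir_left), convolve(source, hrir_right)


# Toy impulse responses (hypothetical, not measured): a source on the
# listener's left reaches the left ear first and louder than the right ear.
HRIR_LEFT = [1.0, 0.0, 0.0]
HRIR_RIGHT = [0.0, 0.0, 0.5]

source = [1.0, -1.0, 0.5]
left, right = render_binaural(source, HRIR_LEFT, HRIR_RIGHT)
```

In this toy case the right-ear signal is a delayed, attenuated copy of the left-ear signal, which is the interaural time and level difference that gives a listener the impression of direction.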
(2) Speech Signal Processing
As speech is the most natural way for humans to communicate, spoken language interfaces are considered the best modality for a wide range of information systems. In a vehicular environment in particular, where hands and/or eyes are not free to operate information devices, speech recognition technology is indispensable. Because current speech recognition technology relies heavily on statistical approaches, such as hidden Markov models (HMMs) of speech and N-gram models of word sequences, a mismatch between training and testing conditions causes serious degradation of system performance. Noise reduction is therefore essential for speech recognition to succeed in real environments.
Our group is studying speech enhancement technologies for in-car applications based on multiple regression over the signals of spatially distributed microphones. We have found that the method is very effective at low SNRs (e.g., -10 dB) and in highly non-stationary environments. By extending the statistical modeling of the noise contamination process, we also found an effective noise reduction method for moderate SNR conditions. The effectiveness of these methods has been confirmed through speech recognition experiments using a standard corpus. Beyond the signal processing aspects of speech processing, we are investigating language modeling, field tests of spoken dialogue systems, and discrimination between spoken and sung speech.
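The multiple-regression idea can be sketched as an ordinary least-squares fit. The sketch assumes that the regression is applied to per-frame feature values (for example, log-spectral values) from each distributed microphone, with a clean reference signal available for training; the text does not specify the feature domain, so this is an assumption. All function names and the synthetic numbers are hypothetical.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def fit_regression(mic_frames, target):
    """Least-squares weights (bias first) mapping mic values to the target."""
    rows = [[1.0] + list(f) for f in mic_frames]
    d = len(rows[0])
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(d)] for i in range(d)]
    atb = [sum(r[i] * t for r, t in zip(rows, target)) for i in range(d)]
    return solve_linear(ata, atb)


def predict(weights, frame):
    """Estimate the clean value for one frame of microphone values."""
    return weights[0] + sum(w * v for w, v in zip(weights[1:], frame))


# Synthetic training data (hypothetical): two microphones, five frames,
# with the clean target an exact linear function of the mic values.
frames = [(1.0, 2.0), (2.0, 1.0), (3.0, 5.0), (4.0, 3.0), (0.0, 1.0)]
clean = [0.5 + 0.7 * a + 0.2 * b for a, b in frames]
weights = fit_regression(frames, clean)
```

Here `weights` recovers the bias and the two microphone weights used to generate the data; with real recordings the fit would only approximate the clean signal.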
(3) Human Behavior Signal Processing
Recent progress in sensor and communication technologies is making long-term human sensing with wearable devices possible. As a result, the targets of signal processing now cover a broad range of human-observation signals, highlighting the growing importance of human-behavior signal processing (HBSP).
As a pioneering group in the field of HBSP, we have been working on modeling driving behavior. We found that the statistical phase space of driving, i.e., the joint probability distribution of the distance to the preceding car (headway) and the speed of the car, generally represents driving characteristics, and that drivers' individuality can be extracted through a Gaussian mixture model (GMM) of this phase space. We also found that cepstral analysis is an effective method for deconvolving the driving action into human dynamics and the command sequence. By showing that approximately 80% of drivers can be correctly identified using a cepstral feature with dynamic parameters, we confirmed the effectiveness of signal modeling of human behavior.
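GMM-based driver identification can be sketched as a maximum-likelihood decision. The sketch assumes one diagonal-covariance GMM per driver over (headway distance, speed) samples and picks the driver whose model best explains the observations. The model parameters, units, and names are hypothetical; training the models (for example, with the EM algorithm) is omitted.

```python
import math


def log_gaussian_2d(x, mean, var):
    """Log density of a 2-D Gaussian with diagonal covariance."""
    total = 0.0
    for xi, mi, vi in zip(x, mean, var):
        total += -0.5 * (math.log(2.0 * math.pi * vi) + (xi - mi) ** 2 / vi)
    return total


def gmm_log_likelihood(points, gmm):
    """Sum of per-sample log likelihoods under a mixture of Gaussians."""
    ll = 0.0
    for p in points:
        terms = [math.log(w) + log_gaussian_2d(p, m, v) for w, m, v in gmm]
        peak = max(terms)  # log-sum-exp for numerical stability
        ll += peak + math.log(sum(math.exp(t - peak) for t in terms))
    return ll


def identify_driver(points, models):
    """Return the name of the model with the highest log likelihood."""
    return max(models, key=lambda name: gmm_log_likelihood(points, models[name]))


# Hypothetical per-driver models: (weight, mean, variance) per component,
# over (headway distance in m, speed in km/h). Units are assumed.
MODELS = {
    "driver_a": [(0.6, (20.0, 40.0), (25.0, 36.0)),
                 (0.4, (35.0, 80.0), (49.0, 64.0))],
    "driver_b": [(0.5, (50.0, 40.0), (64.0, 36.0)),
                 (0.5, (80.0, 80.0), (100.0, 64.0))],
}

observed = [(22.0, 42.0), (33.0, 78.0), (18.0, 38.0)]
best = identify_driver(observed, MODELS)
```

The observed samples sit close to the shorter headway distances of the first model, so that model receives the higher likelihood.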
Currently, we are applying HBSP to the generation of driving behavior.
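The cepstral deconvolution mentioned above rests on a simple property: convolution in time becomes addition in the cepstral domain. The sketch below checks this numerically with a naive DFT. It assumes circular convolution and signals whose spectra have no zeros; the "command" and "dynamics" sequences are hypothetical stand-ins, not driving data.

```python
import cmath
import math


def dft(x):
    """Naive discrete Fourier transform of a real sequence."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]


def real_cepstrum(x, floor=1e-12):
    """Inverse DFT of the log magnitude spectrum of x."""
    n = len(x)
    log_mag = [math.log(max(abs(c), floor)) for c in dft(x)]
    return [sum(log_mag[k] * cmath.exp(2j * cmath.pi * k * t / n)
                for k in range(n)).real / n for t in range(n)]


def circular_convolve(x, h):
    """Circular convolution of two equal-length sequences."""
    n = len(x)
    return [sum(x[m] * h[(t - m) % n] for m in range(n)) for t in range(n)]


# Hypothetical signals: a sparse "command sequence" and a short decaying
# "dynamics" impulse response, both zero-padded to the same length.
command = [1.0, 0.0, 0.0, 0.0, 0.6, 0.0, 0.0, 0.0]
dynamics = [1.0, 0.5, 0.25, 0.0, 0.0, 0.0, 0.0, 0.0]
observed = circular_convolve(command, dynamics)

c_obs = real_cepstrum(observed)
c_sum = [a + b for a, b in zip(real_cepstrum(command), real_cepstrum(dynamics))]
```

Because `c_obs` matches `c_sum` up to rounding error, the two components can in principle be separated by keeping different ranges of cepstral coefficients, provided they occupy different ranges.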
Figure: Informatics of sound



  • Kazuya Takeda received his B.E.E., M.E.E., and Doctor of Engineering degrees from Nagoya University in 1983, 1985, and 1994, respectively.
  • In 1986, he joined ATR.
  • In 1989, he moved to KDD R&D Laboratories and participated in a project to construct a voice-activated telephone extension system.
  • Since 1995, he has been working at Nagoya University.

Academic Societies

  • ASJ
  • IPSJ
  • IEEE


  1. Kazuya Takeda, Hakan Erdogan, John H. L. Hansen, Huseyin Abut (Eds.), In-Vehicle Corpus and Signal Processing for Driver Behavior, Springer-Verlag (2008)
  2. Lucas Malta, Chiyomi Miyajima, Kazuya Takeda, A Study of Driver Behavior Under Potential Threats in Vehicle Traffic, IEEE Trans. on ITS, vol.10, no.2, pp.201-210 (2009.6)
  3. Kenta Niwa, Takanori Nishino, Kazuya Takeda, Selective Listening Point Audio based on Blind Signal Separation and Stereophonic Technology, IEICE Trans. on Information and Systems, vol.E92-D, no.3, pp.469-476 (2009.3)