CLEAR - Centre for Law Enforcement Audio Research

Personnel: Mike Brookes, Patrick Naylor, Nikolay Gaubitch, Dushyant Sharma, Mark Huckvale (UCL) and Gaston Hilkhuysen (UCL)
For more information see: http://www.clear-labs.com/
 
Glottal-Synchronous Speech Processing

Personnel: Mark Thomas, Jon Gudnason, Patrick Naylor
Glottal-synchronous speech processing is a field that exploits the quasi-periodicity of voiced speech to improve speech processing applications. The project is divided into three work packages: (1) development of algorithms for periodicity measurement by detecting glottal closure and opening instants from both speech and intrusive data; (2) implementation of glottal-synchronous methods in existing applications such as dereverberation, time-scale modification and compression; (3) development of data-driven models of the glottal waveform for use in state-of-the-art applications such as artificial bandwidth extension and sub-kilobit compression algorithms.
 
Detection of Unconvertible Voice-mail Messages for Spinvox

Personnel: Patrick Naylor, Jon Gudnason
Spinvox provides a voice-mail to text-message conversion service for telephone users. The speech quality of such messages is highly variable. This project aims to automatically detect messages that are of such low quality that they cannot be converted to text. The solution is based on developing minimum statistics noise estimation for voice-mail speech and a measure of goodness for speech. These measures are classified using Gaussian Mixture Models, and their temporal dependency is modelled using Hidden Markov Models.
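As a rough illustration of the minimum statistics idea mentioned above, the following Python sketch tracks the minimum of a recursively smoothed power spectrum over a sliding window of frames. It is a simplified version written for this page, not the project's implementation: the function name, the fixed smoothing constant, the window length and the constant bias factor are all assumptions, whereas the full method uses time-varying smoothing and bias compensation.

```python
import numpy as np

def min_stats_noise(power_spec, alpha=0.85, window=50, bias=1.5):
    """Simplified minimum statistics noise estimate (illustrative only).

    power_spec: array of shape (frames, bins) holding |STFT|^2 values.
    alpha, window and bias are assumed values, not taken from the project.
    """
    power_spec = np.asarray(power_spec, dtype=float)
    n_frames = power_spec.shape[0]

    # First-order recursive smoothing of the periodogram along time.
    smoothed = np.empty_like(power_spec)
    smoothed[0] = power_spec[0]
    for t in range(1, n_frames):
        smoothed[t] = alpha * smoothed[t - 1] + (1.0 - alpha) * power_spec[t]

    # The noise floor is taken as the minimum over the last `window` frames,
    # scaled by a constant factor to compensate for the bias of the minimum.
    noise = np.empty_like(smoothed)
    for t in range(n_frames):
        start = max(0, t - window + 1)
        noise[t] = bias * smoothed[start:t + 1].min(axis=0)
    return noise
```

Because speech is absent in some frames of almost any window, the tracked minimum follows the noise floor even while speech is present, which is why no explicit voice activity detector is needed.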
 
Enhancement of Reverberant Speech for Telecommunications Applications

Personnel: Patrick Naylor, Nikolay Gaubitch, Andy Khong, Md. Kamrul Hasan
This project deals with the problem of room reverberation, which occurs when a speaker is situated at a distance from the microphones. The reverberation degrades the perceptual quality of the observed speech signals. We study the possibility of attenuating the effects of reverberation using two principal approaches: i) LPC residual enhancement and ii) approximate blind acoustic channel identification and inversion.
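For readers unfamiliar with the LPC residual, the Python sketch below shows how it can be obtained from a speech frame using the autocorrelation method. It only computes the residual on which an enhancement method would operate; it does not implement the enhancement itself. The function names are hypothetical and chosen for this illustration.

```python
import numpy as np

def lpc_coefficients(frame, order):
    """Predictor coefficients a[k] such that x[n] ~ sum_k a[k] * x[n-1-k]."""
    frame = np.asarray(frame, dtype=float)
    n = len(frame)
    # Autocorrelation values r[0], ..., r[order].
    r = np.correlate(frame, frame, mode="full")[n - 1:n + order]
    # Solve the Toeplitz normal equations R a = [r[1], ..., r[order]].
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    return np.linalg.solve(R, r[1:order + 1])

def lpc_residual(frame, a):
    """Prediction error e[n] = x[n] - sum_k a[k] * x[n-1-k]."""
    frame = np.asarray(frame, dtype=float)
    inverse_filter = np.concatenate(([1.0], -np.asarray(a, dtype=float)))
    return np.convolve(frame, inverse_filter)[:len(frame)]
```

For clean voiced speech the residual is dominated by peaks near the glottal closure instants, while reverberation spreads energy between those peaks; this difference is what residual-domain methods try to exploit.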
 
A Gradient-Based Approach to Unsupervised Signal Separation using Signal Properties

Personnel: Uttachai Manmontri, Patrick A. Naylor
This project addresses the problem of unsupervised signal separation. An approach based on some fundamental properties of speech signals is considered in two different frameworks: blind signal separation (BSS), where all source signals are separated simultaneously, and blind signal extraction (BSE), where only a subset of source signals or the signal of interest is extracted. The project includes the development of gradient-based algorithms and the analysis of their convergence and performance.
 
Closed phase detection in voiced speech and voice source cepstrum coefficients

Personnel: Mike Brookes, Patrick Naylor, Jon Gudnason
This project concentrates on the analysis of voiced speech, identification of glottal closure instants, and closed-phase processing for speaker recognition. Glottal closure instants are used to identify the closed-phase portion of voiced speech. Voice source cepstrum coefficients are extracted by combining multi-cycle closed-phase analysis and mel-frequency cepstra.
 
Blind Channel Identification in the Presence of Common Zeros

Personnel: Patrick Naylor, Xiang (Shawn) Lin
Blind Channel Identification (BCI) is one of the most important techniques for speech dereverberation. The objective is to recover the source signal from the observed output signals by estimating the channels. Adaptive BCI for speech dereverberation has been extensively investigated, but most existing techniques rely on the multichannel identifiability conditions, which state that the channels must be co-prime, i.e., they must not share any common zeros. The presence of common zeros substantially degrades the performance of BCI algorithms. However, this problem has not yet been addressed, mostly because of the computational complexity of factoring high-order polynomials such as acoustic channels. The objectives of this research are to: (i) formulate and investigate problems associated with the identification of multiple channels with common zeros in SIMO acoustic systems; (ii) develop feasible and practical algorithms to avoid the performance degradation caused by common zeros; (iii) extend and apply the work to MIMO systems. Several approaches have been formulated to overcome the common zeros problem in blind system identification: (i) a channel decomposition scheme, (ii) a selective-tap updating scheme, (iii) a z-plane mapping scheme, (iv) a super-resolution scheme, and (v) a hybrid method.
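The co-primeness condition can be illustrated on short toy channels with the Python sketch below. It uses the classical fact that the Sylvester matrix of two polynomials loses rank by exactly the number of zeros they share. This is only a small-scale illustration with hypothetical function names and an assumed tolerance; it is not one of the schemes listed above, and real acoustic channels have thousands of taps and only near-common zeros, for which such an exact rank test is not practical.

```python
import numpy as np

def sylvester_matrix(h1, h2):
    """Sylvester matrix of two channels given as coefficient vectors.

    Both leading coefficients are assumed to be non-zero.
    """
    h1 = np.asarray(h1, dtype=float)
    h2 = np.asarray(h2, dtype=float)
    m, n = len(h1) - 1, len(h2) - 1          # polynomial degrees
    S = np.zeros((m + n, m + n))
    for i in range(n):                        # n shifted copies of h1
        S[i, i:i + m + 1] = h1
    for i in range(m):                        # m shifted copies of h2
        S[n + i, i:i + n + 1] = h2
    return S

def num_common_zeros(h1, h2, tol=1e-8):
    """Rank deficiency of the Sylvester matrix = number of shared zeros."""
    S = sylvester_matrix(h1, h2)
    return S.shape[0] - np.linalg.matrix_rank(S, tol=tol)

# Toy example: both channels contain the factor (1 - 0.5 z^-1).
common = [1.0, -0.5]
g1 = [1.0, 0.3, 0.2]
g2 = [1.0, -0.9, 0.4]
h1 = np.convolve(common, g1)
h2 = np.convolve(common, g2)
print(num_common_zeros(h1, h2), num_common_zeros(g1, g2))
```

When the count is non-zero, the two channels are not co-prime and a blind method can at best identify them up to the shared factor.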
 
Adaptive algorithms employing tap selection for single channel and stereophonic acoustic echo cancellation

Personnel: Andy Khong, Patrick Naylor
A class of selective-tap algorithms is developed and their steady-state misalignment is analyzed for acoustic echo cancellation. These algorithms update a subset of filter coefficients at each sample iteration and were originally developed for complexity reduction. A novel approach to reducing interchannel coherence based on tap selection for stereophonic acoustic echo cancellation is introduced. The proposed exclusive-maximum (XM) tap-selection algorithm jointly reduces interchannel coherence and the degradation in convergence performance due to tap selection. This is achieved by selecting exclusive filter coefficients corresponding to the maximum tap input energies of the two channels.
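The tap-selection idea can be sketched in Python as follows. The first function picks the taps with the largest input magnitudes for a single channel; the second splits the tap indices of two channels into disjoint halves so that the total selected input magnitude is as large as possible. Both are illustrative assumptions about how such a selection might be realized, with hypothetical function names, and the normalization used in the update step is likewise an assumption rather than the published algorithm.

```python
import numpy as np

def mmax_select(x_taps, m):
    """Boolean mask marking the m taps with the largest |x|."""
    x_taps = np.asarray(x_taps, dtype=float)
    mask = np.zeros(len(x_taps), dtype=bool)
    mask[np.argsort(np.abs(x_taps))[-m:]] = True
    return mask

def exclusive_max_select(x1_taps, x2_taps):
    """Disjoint masks: each tap index is updated in one channel only."""
    x1_taps = np.asarray(x1_taps, dtype=float)
    x2_taps = np.asarray(x2_taps, dtype=float)
    length = len(x1_taps)
    # Sort indices by how strongly channel 1 dominates channel 2.
    order = np.argsort(np.abs(x1_taps) - np.abs(x2_taps))
    mask1 = np.zeros(length, dtype=bool)
    mask1[order[length // 2:]] = True   # channel 1 takes the upper half
    mask2 = ~mask1                      # channel 2 takes the remaining taps
    return mask1, mask2

def selective_nlms_update(w, x_taps, error, mask, mu=0.5, eps=1e-8):
    """NLMS-style update applied only to the selected taps."""
    w = np.asarray(w, dtype=float)
    x_taps = np.asarray(x_taps, dtype=float)
    return w + mu * error * (x_taps * mask) / (x_taps @ x_taps + eps)
```

Because the two masks never overlap, the two adaptive filters are driven by different portions of the tap-input vectors at each iteration, which is the intuition behind reducing the interchannel coherence.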
 
De-reverberation: Acoustic state modelling and translation

Personnel: Jimi Wen, Patrick Naylor
This project aims to develop a novel standard measure of perceived reverberation and investigates de-reverberation speech enhancement methods based on acoustic modelling. We aim to develop de-reverberation algorithms that translate audio or speech from one acoustic state to another, and that are evaluated and improved using the perceptual measures developed.
 
Multisensor Acoustic Data Fusion for Enhanced Voice-based Human-Machine Interfaces

Personnel: Rehan Ahmad, Patrick Naylor
Research is being carried out, as part of this project, on acoustic dereverberation systems. These involve identifying the reverberant channels so that they can be inverted and used to cancel the effect of reverberation. The research therefore focuses on multichannel blind channel identification (BCI) methods.
 
Subspace-based speech enhancement using eigen-temporal and eigen-spectral models

Personnel: Vinesh Bhunjun, Mike Brookes
This project investigates noise estimation in the eigenspectral domain for single-channel speech enhancement. It involves exploiting the differences in the variation of speech and noise in time and in their representation in the transform domain.
 
Distribution based classification for speech/speaker recognition using Gaussian Mixture Models

Personnel: Jon Gudnason, Mike Brookes
The difference between the distributions of training and test data is studied for speech and speaker recognition. The aim is to apply the appropriate model to the extracted coefficients in order to improve recognition quality and reduce search time.
 

Last updated: September 18, 2009