| Mike Brookes, Patrick Naylor, Nikolay Gaubitch, Dushyant Sharma, Mark Huckvale (UCL) and Gaston Hilkhuysen (UCL) |
| For more information see: http://www.clear-labs.com/
|
|
| |
| Mark Thomas, Jon Gudnason, Patrick Naylor |
| Glottal-synchronous speech processing exploits the quasi-periodicity of voiced speech to enhance speech processing applications. The project is divided into three work packages: (1) development of algorithms for periodicity measurement by detecting glottal closure and opening instants from both speech and intrusive data; (2) implementation of glottal-synchronous methods in existing applications such as dereverberation, time-scale modification and compression; (3) development of data-driven models of the glottal waveform for use in state-of-the-art applications such as artificial bandwidth extension and sub-kilobit compression algorithms. |
|
| |
| Patrick Naylor, Jon Gudnason |
|
Spinvox provides a voice-mail to text-message conversion service for
telephone users. The speech quality of such messages is highly variable.
This project aims to automatically detect messages whose quality is so
low that they cannot be converted to text. The solution is based on
developing minimum statistics noise estimation for voice-mail speech
and a measure of goodness for speech. These measures are classified
using Gaussian Mixture Models, and their temporal dependency is
modelled using Hidden Markov Models. |
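The idea behind minimum statistics noise estimation can be sketched as follows. This is a simplified illustration, not the project's implementation: it tracks a single frequency bin, and the function name `track_noise_floor`, the smoothing factor `alpha` and the `window` length are assumptions made for the example. Practical estimators also apply bias compensation and time-varying smoothing, which are omitted here.

```python
from collections import deque


def track_noise_floor(frame_powers, alpha=0.85, window=5):
    """Estimate the noise power per frame for one frequency bin.

    frame_powers: noisy-speech power in successive frames (one bin).
    alpha: recursive smoothing factor (assumed value, 0 <= alpha < 1).
    window: number of recent frames searched for the minimum (assumed).
    """
    recent = deque(maxlen=window)  # sliding window of smoothed powers
    smoothed = None
    estimates = []
    for power in frame_powers:
        # First-order recursive smoothing of the per-frame power.
        if smoothed is None:
            smoothed = power
        else:
            smoothed = alpha * smoothed + (1 - alpha) * power
        recent.append(smoothed)
        # Speech is intermittent, so the minimum of the smoothed power
        # over the window is taken as the noise estimate.
        estimates.append(min(recent))
    return estimates
```

Calling `track_noise_floor(powers)` on the per-frame powers of one bin returns one noise estimate per frame. The estimate follows short bursts of speech energy only if they last longer than the search window.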
|
| |
| Patrick Naylor, Nikolay Gaubitch, Andy Khong, Md. Kamrul Hasan |
| This project deals with the room reverberation that occurs when the talker is situated at a distance from the microphones. Reverberation degrades the perceptual quality of the observed speech signals. We study the possibility of attenuating its effects using two principal approaches: (i) LPC residual enhancement and (ii) approximate blind acoustic channel identification and inversion. |
|
| |
| Uttachai Manmontri, Patrick A. Naylor |
|
This project addresses the problem of unsupervised signal separation.
An approach that uses some fundamental properties of speech signals is
considered in two different frameworks: blind signal separation (BSS),
where all source signals are separated simultaneously, and blind signal
extraction (BSE), where only a subset of the source signals or the
signal of interest is extracted. The project includes the development of
gradient-based algorithms and the analysis of their convergence and performance. |
|
| |
| Mike Brookes, Patrick Naylor, Jon Gudnason |
|
This project concentrates on the analysis of voiced speech, the identification of glottal closure instants, and closed-phase processing for speaker recognition. Glottal closure instants are used to identify the closed-phase portions of voiced speech. Voice source cepstrum coefficients are extracted by combining multi-cycle closed-phase analysis and mel-frequency cepstra. |
|
| |
| Patrick Naylor, Xiang (Shawn) Lin |
|
Blind Channel Identification (BCI) is one of the most important techniques for speech dereverberation.
The objective is to recover the source signal from observations of the
output signals by estimating the channels. Adaptive BCI for speech dereverberation
has been extensively investigated, but most existing techniques rely on the multichannel
identifiability conditions, which state that the channels must be co-prime,
i.e., they must not share any common zeros. The presence of common zeros significantly
degrades the performance of BCI algorithms. This problem remains largely unaddressed,
mostly because of the computational complexity of factoring high-order polynomials such as acoustic channels.
The objectives of this research are to: (i) formulate and investigate problems associated with the
identification of multiple channels with common zeros in SIMO acoustic systems;
(ii) develop feasible and practical algorithms that avoid the performance degradation caused by common zeros;
(iii) extend and apply the work to MIMO systems. Several approaches have been formulated
to overcome the common-zeros problem in blind system identification:
(i) channel decomposition scheme, (ii) selective-tap updating scheme,
(iii) Z-plane mapping scheme, (iv) super-resolution scheme, (v) hybrid method. |
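The co-prime condition can be illustrated with a small exact check. The sketch below is not one of the schemes listed above: it treats two short channel impulse responses as polynomials and computes their greatest common divisor with the Euclidean algorithm, so a divisor of degree one or more indicates a shared zero. The function names `poly_gcd` and `share_common_zero` are hypothetical. Measured acoustic channels have thousands of noisy taps and typically exhibit near-common rather than exactly common zeros, so an exact test like this is only illustrative.

```python
from fractions import Fraction


def _strip(p):
    """Drop leading zero coefficients; the zero polynomial becomes []."""
    i = 0
    while i < len(p) and p[i] == 0:
        i += 1
    return p[i:]


def _remainder(a, b):
    """Remainder of polynomial long division a / b (b must be non-zero)."""
    a = list(a)
    while len(a) >= len(b):
        factor = a[0] / b[0]
        for i in range(len(b)):
            a[i] -= factor * b[i]
        a.pop(0)  # the leading coefficient is now exactly zero
    return _strip(a)


def poly_gcd(a, b):
    """Monic GCD of two polynomials, highest-order coefficient first.

    An impulse response [h0, h1, ..., hN] can be passed directly: its
    non-trivial zeros are the roots of h0*z^N + h1*z^(N-1) + ... + hN.
    """
    a = _strip([Fraction(c) for c in a])
    b = _strip([Fraction(c) for c in b])
    while b:
        a, b = b, _remainder(a, b)
    return [c / a[0] for c in a]


def share_common_zero(h1, h2):
    """True if the two channels are not co-prime (GCD degree >= 1)."""
    return len(poly_gcd(h1, h2)) > 1
```

For example, the channels `[1, -5, 6]` and `[1, -1, -2]` both have a zero at z = 2, so `share_common_zero` reports that they are not co-prime. Exact rational arithmetic is used so that the result does not depend on a rounding tolerance.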
|
| |
| Andy Khong, Patrick Naylor |
| A class of selective-tap algorithms is developed and their steady-state misalignment is analyzed for acoustic echo cancellation. These algorithms update a subset of the filter coefficients at each sample iteration and were originally developed for complexity reduction. A novel approach that reduces interchannel coherence through tap selection is introduced for stereophonic acoustic echo cancellation. The proposed exclusive-maximum (XM) tap-selection algorithm jointly reduces interchannel coherence and the degradation in convergence performance due to tap selection. This is achieved by selecting exclusive filter coefficients corresponding to the maximum tap input energies of the two channels. |
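The exclusive selection described above can be sketched as follows. This is an illustration under stated assumptions rather than the published algorithm: each channel is assumed to update exactly half of its taps, tap magnitudes stand in for tap input energies, and the function name `xm_tap_selection` is hypothetical. Ranking the taps by the difference of the two magnitudes gives an exclusive split that maximizes the total selected magnitude.

```python
def xm_tap_selection(x1, x2):
    """Return exclusive tap-index sets for the two channels.

    x1, x2: tap-input vectors of equal, even length L.
    Each channel is assigned L/2 tap indices and no index is shared.
    """
    if len(x1) != len(x2) or len(x1) % 2:
        raise ValueError("tap-input vectors must share an even length")
    half = len(x1) // 2
    # Magnitude difference per tap: large values favour channel 1.
    diff = [abs(a) - abs(b) for a, b in zip(x1, x2)]
    order = sorted(range(len(diff)), key=lambda i: diff[i], reverse=True)
    taps1 = sorted(order[:half])  # taps updated in channel 1
    taps2 = sorted(order[half:])  # remaining taps, updated in channel 2
    return taps1, taps2
```

The adaptive filter for each channel would then update only the coefficients at the returned indices at that sample iteration. Because the two index sets never overlap, the two channels are driven by different tap positions, which is the mechanism the text credits with reducing interchannel coherence.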
|
| |
| Jimi Wen, Patrick Naylor |
| This project aims to develop a novel standard measure of perceived reverberation, and investigates de-reverberation speech enhancement methods based on acoustic modelling. We aim to develop de-reverberation algorithms that translate audio or speech from one acoustic state to another, and that are evaluated and improved using the perceptual measures developed. |
|
| |
| Rehan Ahmad, Patrick Naylor |
| This project carries out research on acoustic dereverberation systems. These involve identifying the reverberant channels so that they can be inverted and used to cancel the effect of reverberation. The research therefore focuses on multichannel blind channel identification (BCI) methods. |
|
| Vinesh Bhunjun, Mike Brookes |
| This project investigates noise estimation in the eigenspectral domain for single-channel speech enhancement. It involves exploiting the differences in the variation of speech and noise in time and in their representation in the transform domain. |
|
| |
| Jon Gudnason, Mike Brookes |
| The difference between the distributions of training and test data is studied for speech and speaker recognition. The aim is to apply the appropriate model to the extracted coefficients and so improve recognition quality and search time. |
|
| |
|
| Last updated:
September 18, 2009
|
|