Mark Thomas's Website

Projects

SCENIC: Acoustic signal processing applications, particularly those based on array processing, are often highly sensitive to the complex propagation phenomena of the environment in which they operate. These phenomena, collectively termed reverberation, are often seen as a liability that should be removed; however, nature has taught us that reverberation can be used as an asset that enables complex navigational tasks. The Self-Configurable ENvironment-aware Intelligent aCoustic sensing (SCENIC) project aims to turn the acoustic response from a liability into an asset through accurate modelling of the environment and a thorough understanding of acoustic propagation.

Data-Driven Voice Source Modelling and Estimation: The human voice consists of a source signal, created by vibration of the vocal folds at the glottis, and spectral filtering, effected by the vocal tract. A number of mathematically- and physically-motivated models of the source signal exist; however, they require a large number of control parameters and cannot reproduce the whole gamut of source signals. This project proposes a class of data-driven models with a view to application in coding, artificial bandwidth extension and synthesis. Audio samples can be found here.

YAGA: The DYPSA algorithm detects glottal closing instants (GCIs) from speech signals, employing linear predictive coding, a group delay function and dynamic programming to provide accurate GCI detection. The Yet Another GCI/GOI Algorithm (YAGA) extends the approach by applying robust voice source estimation, multiscale wavelet products and additional dynamic programming cost functions to detect both glottal closing instants and glottal opening instants (GOIs). There is no universally accepted definition of the GOI; this project seeks new methods to define the GOI more precisely, particularly in the context of closed-phase analysis.

SIGMA: The Electroglottograph (EGG) measures the electrical conductance of the glottis. The EGG signal is free from filtering by the vocal tract and is a reliable source from which glottal closing instants (GCIs) and opening instants (GOIs) may be detected. The SIGMA (Singularity In EGG by Multiscale Analysis) algorithm relies on multiscale wavelet products and Gaussian Mixture Modelling (GMM) to accurately detect GCIs and GOIs from EGG signals, providing a reference against which speech-based detection algorithms may be evaluated. A MATLAB implementation can be found in Mike Brookes's Voicebox.
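
The idea behind a multiscale product can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the SIGMA implementation: it swaps in simple Haar-like differences of local means for the wavelet transform, omits the Gaussian Mixture Modelling stage entirely, and the function names and the synthetic test signal are hypothetical. It shows why multiplying detail signals across scales emphasises an abrupt change, such as the one an EGG signal exhibits at glottal closure, while small ripple is suppressed.

```python
def haar_detail(x, scale):
    """Difference between the mean of the next `scale` samples and the
    mean of the previous `scale` samples (a Haar-like detail signal)."""
    n = len(x)
    d = [0.0] * n  # samples too close to the edges are left at zero
    for i in range(scale, n - scale + 1):
        after = sum(x[i:i + scale]) / scale
        before = sum(x[i - scale:i]) / scale
        d[i] = after - before
    return d


def multiscale_product(x, scales=(1, 2, 4)):
    """Pointwise product of the detail signals at several scales."""
    p = [1.0] * len(x)
    for s in scales:
        d = haar_detail(x, s)
        p = [a * b for a, b in zip(p, d)]
    return p


# Synthetic signal (an assumption for illustration): a unit step at
# sample 50 plus a small deterministic ripple standing in for noise.
signal = [(1.0 if i >= 50 else 0.0) + 0.01 * ((i * 7) % 5 - 2)
          for i in range(100)]
product = multiscale_product(signal)
peak = max(range(len(product)), key=lambda i: abs(product[i]))
print(peak)
```

A step produces a large detail value at every scale, so the product stays large at the discontinuity, whereas the ripple only produces a large value at some scales and is attenuated by the multiplication.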

Multichannel DYPSA: The DYPSA algorithm detects glottal closing instants (GCIs) from speech signals. Although it is robust on clean speech, even a small amount of reverberation can severely impair results. Multichannel DYPSA is a multichannel extension that uses interchannel correlation of 'candidate' GCIs to provide more accurate detection in reverberant environments.

SMERSH: The Spatiotemporal Method for Enhancement of Reverberant SpeecH (SMERSH) is a dereverberation algorithm intended for use on speech signals. By exploiting the pseudoperiodic nature of voiced speech, the spatial diversity of multichannel recordings and adaptive dereverberation filters, it suppresses reverberation components while reinforcing the wanted speech signal. Audio samples can be found here.

PSOLA: The Pitch-Synchronous OverLap-Add algorithm (PSOLA) is a method for concatenating cycles of voiced speech to vary pitch and timescale independently of one another. This project extends the technique by making it suitable for both voiced and unvoiced speech through the use of a novel voiced/unvoiced/silence detector and phoneme-specific time scaling. Audio samples can be found here.
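
The basic overlap-add step can be sketched as follows. This is a minimal sketch of plain time-domain PSOLA pitch modification for voiced speech, assuming the pitch marks are already known; it does not include the voiced/unvoiced/silence detector or the phoneme-specific time scaling described above, and the function names are hypothetical.

```python
import math


def hann(length):
    """Symmetric Hann window whose end points are zero."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / (length - 1))
            for i in range(length)]


def psola_pitch_shift(signal, marks, pitch_factor):
    """Scale the pitch by `pitch_factor` while keeping the duration.

    `marks` are sample indices of pitch marks (one per cycle), assumed
    to be given, for example by a GCI detector.
    """
    out = [0.0] * len(signal)
    periods = [b - a for a, b in zip(marks, marks[1:])]
    periods.append(periods[-1])  # reuse the last period for the final mark
    t = float(marks[0])          # position of the current synthesis mark
    while t <= marks[-1]:
        # Pick the analysis mark closest in time to the synthesis mark.
        i = min(range(len(marks)), key=lambda j: abs(marks[j] - t))
        p = periods[i]
        window = hann(2 * p + 1)  # two-period window centred on the mark
        centre = int(round(t))
        for k in range(-p, p + 1):
            src, dst = marks[i] + k, centre + k
            if 0 <= src < len(signal) and 0 <= dst < len(out):
                out[dst] += window[k + p] * signal[src]
        t += p / pitch_factor     # closer marks give a higher pitch
    return out


# Example (synthetic data): a pulse train with a period of 20 samples.
marks = list(range(20, 181, 20))
pulses = [1.0 if i in marks else 0.0 for i in range(200)]
shifted = psola_pitch_shift(pulses, marks, 2.0)
```

With a pitch factor of 1 and evenly spaced marks, the overlapping Hann windows sum to one between the first and last marks, so the input is reproduced there. With a factor of 2, cycles are repeated at half the original spacing, which doubles the pitch without changing the duration.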

MLS: Maximum-Length Sequences (MLS) are pseudorandom signals that can be used to excite a system, the output of which can be quickly inverted to determine the system's impulse response. They are spectrally white and have an autocorrelation function that is close to a unit impulse. This method is preferable to impulse testing of real-world systems because the signal energy is spread out over time, providing greater SNR and reducing nonlinear effects. This package includes a short tutorial, some MATLAB examples and the source code. My Final Year MEng Project entitled A Novel Loudspeaker Equaliser made extensive use of MLS for loudspeaker measurements.
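
The measurement principle can be sketched in Python as follows. This is not the code from the package: it is a minimal illustration that assumes the system is linear, is excited by a periodically repeated sequence and is measured in steady state, so that its output is a circular convolution. The function names, the 7-bit register and the four-tap test system are hypothetical, and a direct circular cross-correlation is used here in place of a fast transform.

```python
def mls(n_bits=7, taps=(7, 6)):
    """One period (2**n_bits - 1 samples) of an MLS with values +1/-1,
    generated by a linear feedback shift register. The default taps
    correspond to a primitive polynomial of degree 7."""
    state = [1] * n_bits
    seq = []
    for _ in range(2 ** n_bits - 1):
        seq.append(-1.0 if state[-1] else 1.0)
        feedback = 0
        for t in taps:
            feedback ^= state[t - 1]
        state = [feedback] + state[:-1]
    return seq


def circular_convolve(x, h):
    """Steady-state response of FIR system h to a periodic input x."""
    n = len(x)
    return [sum(h[m] * x[(i - m) % n] for m in range(len(h)))
            for i in range(n)]


def circular_correlate(y, x):
    """r[k] = sum over i of y[i] * x[i - k], indices taken modulo len(x)."""
    n = len(x)
    return [sum(y[i] * x[(i - k) % n] for i in range(n)) for k in range(n)]


def estimate_impulse_response(y, x):
    """Recover h from the response y to the MLS x.

    The MLS autocorrelation is N at lag 0 and -1 elsewhere, so
    r[k] = (N + 1) * h[k] - sum(h), and sum(r) equals sum(h).
    """
    n = len(x)
    r = circular_correlate(y, x)
    dc = sum(r)
    return [(rk + dc) / (n + 1) for rk in r]


x = mls()
h_true = [1.0, 0.5, -0.25, 0.125]   # hypothetical system under test
y = circular_convolve(x, h_true)
h_est = estimate_impulse_response(y, x)
print([round(v, 6) for v in h_est[:4]])
```

Because the autocorrelation is not exactly a unit impulse (it is -1 rather than 0 away from lag zero), the raw cross-correlation carries a small constant offset; the sketch removes it using the sum of the correlation values.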

iECG Portable ECG Datalogger: The 3rd Year Group Project for the Electrical and Electronic Engineering BEng/MEng course was to design a portable ECG datalogger, named the iECG. The patient wears a device that continuously records an ECG of up to 12 channels onto solid-state memory. The patient uploads the data remotely at regular intervals, via the web, to a central server. The physician uses a software tool to review the data, which is analysed offline for evidence of common heart complaints. See: Inception Report, Design Report, Commercial Report, Photograph 1, 2, 3, 4.