PEVD for Speech Enhancement

Examples of Speech Enhancement using Polynomial Eigenvalue Decomposition (PEVD)

Researchers: Vincent W. Neo, Christine Evers, Patrick A. Naylor

These listening examples are supplementary materials for [1].

The clean speech and noise signals are taken from the TIMIT corpus [2] and the NOISEX-92 database [3], respectively. The algorithms compared are:

  1.  log-MMSE [4],
  2.  the multi-channel Wiener filter (MWF) [5] as used in [6], which relies on the relative transfer function (RTF) estimator in [7] and the noise estimator in [8],
  3.  Oracle-MWF (O-MWF) [5], which uses the clean speech signal (ground truth), and
  4.  PEVD, which uses the sequential matrix diagonalisation (SMD) algorithm [9].
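As a rough illustration of the kind of input a PEVD algorithm such as SMD [9] works on: these algorithms diagonalise a para-Hermitian polynomial matrix, which for multichannel recordings is typically the space-time covariance matrix R(tau) = E{x(n) x^H(n - tau)}. The sketch below is not the implementation used in [1]. The function name `space_time_covariance`, the biased sample estimator, and the synthetic two-channel signal are assumptions made for illustration only.

```python
import numpy as np


def space_time_covariance(x, max_lag):
    """Estimate R[tau] = E{x(n) x^H(n - tau)} for tau = -max_lag..max_lag.

    x has shape (channels, samples). The result has shape
    (2 * max_lag + 1, channels, channels); index max_lag + tau holds lag tau.
    A biased estimator (division by the total number of samples) is used.
    """
    num_channels, num_samples = x.shape
    R = np.zeros((2 * max_lag + 1, num_channels, num_channels), dtype=complex)
    for tau in range(-max_lag, max_lag + 1):
        if tau >= 0:
            # pair x(n) with x(n - tau) for n = tau .. N-1
            a, b = x[:, tau:], x[:, :num_samples - tau]
        else:
            # pair x(n) with x(n + |tau|) for n = 0 .. N-1-|tau|
            a, b = x[:, :num_samples + tau], x[:, -tau:]
        R[max_lag + tau] = a @ b.conj().T / num_samples
    return R


# Synthetic stand-in for a two-microphone recording (illustrative only):
# channel 2 is a delayed, attenuated copy of channel 1 plus sensor noise.
rng = np.random.default_rng(0)
source = rng.standard_normal(4000)
x = np.vstack([
    source,
    0.8 * np.roll(source, 3) + 0.1 * rng.standard_normal(4000),
])

max_lag = 5
R = space_time_covariance(x, max_lag)

# Para-Hermitian check: R(-tau) should equal R(tau)^H for every lag.
is_para_hermitian = all(
    np.allclose(R[max_lag - t], R[max_lag + t].conj().T)
    for t in range(max_lag + 1)
)
print("para-Hermitian:", is_para_hermitian)
```

The lag dimension is what makes the matrix polynomial: a PEVD aims to diagonalise all lags jointly, rather than only the zero-lag slice `R[max_lag]` that a conventional EVD would handle.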

References

[1] V. W. Neo, C. Evers, and P. A. Naylor, “Speech enhancement using polynomial eigenvalue decomposition,” in Proc. IEEE Workshop on Applications of Signal Process. to Audio and Acoust. (WASPAA), Oct. 2019.

[2] J. S. Garofolo, L. F. Lamel, W. M. Fisher, J. G. Fiscus, D. S. Pallet, N. L. Dahlgren, and V. Zue, “TIMIT acoustic-phonetic continuous speech corpus,” Linguistic Data Consortium (LDC), Philadelphia, Corpus, 1993.

[3] A. Varga and H. J. M. Steeneken, “Assessment for automatic speech recognition II: NOISEX-92: A database and an experiment to study the effect of additive noise on speech recognition systems,” Speech Commun., vol. 12, no. 3, pp. 247–251, July 1993.

[4] Y. Ephraim and D. Malah, “Speech enhancement using a minimum mean-square error log-spectral amplitude estimator,” IEEE Trans. Acoust., Speech, Signal Process., vol. 33, no. 2, pp. 443–445, 1985.

[5] S. Doclo and M. Moonen, “GSVD-based optimal filtering for single and multimicrophone speech enhancement,” IEEE Trans. Signal Process., vol. 50, no. 9, pp. 2230–2244, Sept. 2002.

[6] W. Xue, A. H. Moore, M. Brookes, and P. A. Naylor, “Modulation-domain multichannel Kalman filtering for speech enhancement,” IEEE/ACM Trans. Audio, Speech, Lang. Process., vol. 26, no. 10, pp. 1833–1847, Oct. 2018.

[7] R. Varzandeh, M. Taseska, and E. Habets, “An iterative multichannel subspace-based covariance subtraction method for relative transfer function estimation,” in Proc. Hands-free Speech Communications and Microphone Arrays (HSCMA), 2017.

[8] M. Souden, J. Chen, J. Benesty, and S. Affes, “An integrated solution for online multichannel noise tracking and reduction,” IEEE Trans. Audio, Speech, Lang. Process., vol. 19, no. 7, Sept. 2011.

[9] S. Redif, S. Weiss, and J. G. McWhirter, “Sequential matrix diagonalisation algorithms for polynomial EVD of para-Hermitian matrices,” IEEE Trans. Signal Process., vol. 63, no. 1, pp. 81–89, Jan. 2015.

 

Related Works on PEVD Algorithms

[10] J. G. McWhirter, P. D. Baxter, T. Cooper, S. Redif, and J. Foster, “An EVD algorithm for para-Hermitian polynomial matrices,” IEEE Trans. Signal Process., vol. 55, no. 5, pp. 2158–2169, May 2007.

[11] V. W. Neo and P. A. Naylor, “Second order sequential best rotation algorithm with Householder transformation for polynomial matrix eigenvalue decomposition,” in Proc. IEEE Int. Conf. on Acoust., Speech and Signal Process. (ICASSP), 2019.

[12] S. Redif, S. Weiss, and J. G. McWhirter, “An approximate polynomial matrix eigenvalue decomposition algorithm for para-Hermitian matrices,” in Proc. Int. Symp. on Signal Process. and Inform. Technology (ISSPIT), 2011, pp. 421–425.