Anti-spoofing of Voice Print Systems (11 January 2019 – 10 January 2021)

Robust speaker recognition has been widely adopted as a means of biometrics, yet users increasingly question whether it is secure in practical applications. From an engineering point of view, robustness and security in speaker recognition can be contradictory requirements. To increase robustness, we expect a system to tolerate speaker variations (e.g. aging, stress, vocal effort) and acoustic variations (e.g. channel and noise); synthetic voice, however, is also a type of variation. To ensure security, we expect a system to reject voice inputs with unwanted variations, such as replayed, synthetic, and impersonated speech. In this project, we study the theory and practice of anti-spoofing countermeasures to safeguard speaker recognition systems. The project was first funded in 2014 and contributed to the formulation of ASVspoof 2015 and subsequent international evaluations.

We focus on generalized countermeasures that remain effective when the attacks are unknown, as is the case in practical applications. In this regard, we have explored several novel countermeasures based on long-range information derived from the long-term constant-Q transform, subband knowledge, instantaneous phase, and octave spectra. Further, we have also explored deep feature representation frameworks that are effective for detecting spoofing attacks. The countermeasures we submitted to the ASVspoof 2019 challenge proved effective against both logical and physical access attacks.
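To illustrate the constant-Q analysis behind these countermeasures (not the project's actual implementation), the toy sketch below computes constant-Q cepstral-style features with numpy and scipy. The function name `cqcc_like` and all parameter defaults are illustrative assumptions; published CQCC pipelines additionally resample the constant-Q spectrum uniformly before the cepstral transform, which is omitted here for brevity.

```python
import numpy as np
from scipy.fft import dct

def cqcc_like(signal, sr=16000, fmin=15.0, bins_per_octave=12,
              n_octaves=9, frame_len=512, hop=256, n_coeffs=20):
    """Toy constant-Q cepstral feature extractor (illustrative only).

    Geometrically spaced center frequencies give a constant quality
    factor Q = f / bandwidth, so low frequencies get fine frequency
    resolution and high frequencies fine time resolution -- the property
    that constant-Q countermeasures exploit for spoofing detection.
    """
    n_bins = bins_per_octave * n_octaves
    freqs = fmin * 2.0 ** (np.arange(n_bins) / bins_per_octave)
    freqs = freqs[freqs < sr / 2]                 # keep bins below Nyquist
    t = np.arange(frame_len) / sr
    # One windowed complex exponential per constant-Q bin.
    kernel = np.hanning(frame_len) * np.exp(-2j * np.pi * freqs[:, None] * t)
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = signal[start:start + frame_len]
        power = np.abs(kernel @ frame) ** 2       # constant-Q power spectrum
        logspec = np.log(power + 1e-10)           # log compression
        frames.append(dct(logspec, norm='ortho')[:n_coeffs])  # cepstra
    return np.array(frames)                       # (n_frames, n_coeffs)

# Usage: features from one second of a synthetic 440 Hz tone.
sr = 16000
y = np.sin(2 * np.pi * 440 * np.arange(sr) / sr)
feats = cqcc_like(y, sr=sr)
print(feats.shape)  # one feature vector per analysis frame
```

In a detection system, such per-frame features would feed a classifier (e.g. a GMM or a neural network) trained to separate bona fide from spoofed speech.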

PUBLICATIONS

Journal Articles

  • Jichen Yang, Rohan Kumar Das and Haizhou Li, “Significance of Subband Features for Synthetic Speech Detection”, IEEE Transactions on Information Forensics and Security, 15, 2020, pp. 2160-2170. [link]

Conference Articles

  • Zhenzong Wu, Rohan Kumar Das, Jichen Yang and Haizhou Li, “Light Convolutional Neural Network with Feature Genuinization for Detection of Synthetic Speech Attacks”, in Proc. INTERSPEECH, Shanghai, China, October 2020, pp. 1101-1105. [link]
  • Rohan Kumar Das, Jichen Yang and Haizhou Li, “Assessing the Scope of Generalized Countermeasures for Anti-spoofing”, in Proc. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP) 2020, Barcelona, Spain, May 2020. [link]
  • Rohan Kumar Das, Jichen Yang and Haizhou Li, “Long Range Acoustic and Deep Features Perspective on ASVspoof 2019”, in Proc. IEEE Automatic Speech Recognition and Understanding (ASRU) Workshop 2019, Sentosa Island, Singapore, December 2019. [link]
  • Rohan Kumar Das, Jichen Yang and Haizhou Li, “Long Range Acoustic Features for Spoofed Speech Detection”, in Proc. INTERSPEECH, Graz, Austria, September 2019, pp. 1058-1062. [link]
  • Yitong Liu, Rohan Kumar Das and Haizhou Li, “Multi-band Spectral Entropy Information for Detection of Replay Attacks”, in Proc. Asia-Pacific Signal and Information Processing Association (APSIPA) Annual Summit and Conference (ASC) 2019, Lanzhou, China, November 2019. [link]
  • Rohan Kumar Das and Haizhou Li, “Instantaneous Phase and Excitation Source Features for Detection of Replay Attacks,” in Proc. Asia-Pacific Signal and Information Processing Association (APSIPA) Annual Summit and Conference (ASC) 2018, Honolulu, Hawaii, USA, November 2018, pp. 1030-1037. [link]
  • Jichen Yang, Rohan Kumar Das and Haizhou Li, “Extended Constant-Q Cepstral Coefficients for Detection of Spoofing Attacks,” in Proc. Asia-Pacific Signal and Information Processing Association (APSIPA) Annual Summit and Conference (ASC) 2018, Honolulu, Hawaii, USA, November 2018, pp. 1024-1029. [link]