VLSI Design for SVM-Based Speaker Verification System
Sreemathy T G
Lecturer in Electronics, Govt Polytechnic College, Kunnamkulam
Abstract
This brief presents the chip implementation of a support vector machine (SVM)-based speaker verification system. The proposed chip comprises a speaker feature extraction (SFE) module, an SVM module, and a decision module. The SFE module performs autocorrelation analysis, linear predictive coefficient (LPC) extraction, and LPC-to-cepstrum conversion. The SVM module includes a Gaussian kernel unit and a scaling unit. The Gaussian kernel unit evaluates the kernel value of a test vector and a support vector. Four Gaussian kernel processing elements (GK-PEs) are designed to process four support vectors simultaneously. Each GK-PE is pipelined and performs the 2-norm and exponential operations. An enhanced CORDIC architecture is proposed to calculate the exponential value. In addition to the Gaussian kernel unit, the SVM module contains a scaling unit, which performs the scaling multiplications and the remaining operations of the SVM decision value evaluation. Finally, the decision module accumulates the frame scores generated by all of the test frames and compares the accumulated score with a threshold to determine whether the test utterance was spoken by the claimed speaker. The chip is characterized by its high speed and its ability to handle a large number of support vectors in the SVM. The prototype is a semicustom chip fabricated using Taiwan Semiconductor Manufacturing Company 90-nm CMOS technology on a die with a size of approximately 7.9 mm × 7.9 mm.
Keywords: VLSI Design; SVM; CORDIC; Gaussian kernel; CMOS
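The data flow described in the abstract (Gaussian kernel evaluation, scaling, frame-score accumulation, threshold comparison) can be illustrated with a small floating-point software model. This is only a sketch under stated assumptions, not the chip's architecture: the function names, the `gamma` kernel width, the use of the raw decision value as the frame score, and the toy numbers are all hypothetical. The exponential is computed with `math.exp` rather than a CORDIC unit, and the four parallel GK-PEs are not modeled.

```python
import math


def gaussian_kernel(x, sv, gamma):
    """K(x, sv) = exp(-gamma * ||x - sv||^2): a squared 2-norm followed by an exponential."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, sv))
    return math.exp(-gamma * sq_dist)


def frame_score(x, support_vectors, weights, bias, gamma):
    """SVM decision value for one test frame.

    Each kernel value is multiplied by a scaling factor w_i (assumed here to be
    alpha_i * y_i from training), the products are summed, and the bias is added.
    """
    acc = sum(w * gaussian_kernel(x, sv, gamma)
              for sv, w in zip(support_vectors, weights))
    return acc + bias


def verify(frames, support_vectors, weights, bias, gamma, threshold):
    """Accumulate frame scores over the utterance and compare with a threshold."""
    total = sum(frame_score(f, support_vectors, weights, bias, gamma)
                for f in frames)
    return total > threshold, total


# Toy example with made-up values (two support vectors, two test frames).
support_vectors = [[0.0, 0.0], [1.0, 1.0]]
weights = [1.0, -1.0]          # hypothetical alpha_i * y_i
frames = [[0.0, 0.0], [0.1, 0.0]]
accepted, total = verify(frames, support_vectors, weights,
                         bias=0.0, gamma=0.5, threshold=0.0)
print(accepted, total)
```

In this toy setup both frames lie close to the positively weighted support vector, so the accumulated score is positive and the claim is accepted. In a hardware version, the inner loop over support vectors is the part that the abstract assigns to the parallel GK-PEs.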