A REVIEW ON MULTIMODAL SPEAKER RECOGNITION
DOI:
https://doi.org/10.22159/ajpcr.2017.v10s1.19761
Keywords:
Speaker recognition, multimodal speaker recognition, throat microphone, bone microphone, VQ, GMM
Abstract
This paper presents a review of multimodal speaker recognition (SR). Speaker recognition has been studied for many decades and continues to attract the interest of many researchers. It consists of two phases: system training and system testing. The robustness of an SR system depends on the training and testing environments as well as on the quality of the speech. Air-conducted (AC) speech is the standard source from which a speaker is recognized by extracting features. The performance of the SR system depends on the AC speech. To further improve the robustness and accuracy of the SR system, various other sources (modalities), such as a throat microphone, a bone-conduction microphone, an array of microphones, non-audible murmur, and non-auditory information such as video, are used as a complement to the standard AC microphone. This paper is purely a review of SR and these complementary modalities.
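The two phases and the idea of complementing AC speech with a second sensor can be illustrated with a minimal sketch. It assumes a VQ codebook per speaker and per modality, and a weighted sum of average distortions as the fusion rule; the abstract does not specify a fusion method, so that choice is an assumption. The function names (`train_codebook`, `distortion`, `identify`, `synth`), the equal weighting, and the synthetic 2-D vectors standing in for real features such as MFCCs are all hypothetical.

```python
import random

def sq_dist(a, b):
    # Squared Euclidean distance between two feature vectors.
    return sum((x - y) ** 2 for x, y in zip(a, b))

def train_codebook(features, k, iters=20, seed=0):
    # Training phase: build a VQ codebook with plain k-means (Lloyd iterations).
    rng = random.Random(seed)
    codebook = rng.sample(features, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for f in features:
            nearest = min(range(k), key=lambda j: sq_dist(f, codebook[j]))
            clusters[nearest].append(f)
        for j, members in enumerate(clusters):
            if members:  # an empty cluster keeps its previous codeword
                codebook[j] = [sum(col) / len(members) for col in zip(*members)]
    return codebook

def distortion(features, codebook):
    # Average squared distance from each vector to its nearest codeword.
    return sum(min(sq_dist(f, c) for c in codebook) for f in features) / len(features)

def identify(test_feats, models, weight_ac=0.5):
    # Testing phase: fuse the per-modality distortions (assumed weighted sum)
    # and pick the enrolled speaker with the lowest fused score.
    scores = {}
    for speaker, books in models.items():
        d_ac = distortion(test_feats["ac"], books["ac"])
        d_tm = distortion(test_feats["throat"], books["throat"])
        scores[speaker] = weight_ac * d_ac + (1 - weight_ac) * d_tm
    return min(scores, key=scores.get), scores

def synth(center, n, rng, spread=0.3):
    # Synthetic stand-in for extracted features: Gaussian cloud around a center.
    return [[rng.gauss(c, spread) for c in center] for _ in range(n)]

rng = random.Random(1)
centers = {
    "A": {"ac": [0.0, 0.0], "throat": [1.0, 1.0]},
    "B": {"ac": [3.0, 3.0], "throat": [4.0, 0.0]},
}
# Enrol each speaker: one codebook per modality.
models = {
    spk: {mod: train_codebook(synth(c, 50, rng), k=4) for mod, c in mods.items()}
    for spk, mods in centers.items()
}
# Test utterance drawn from speaker A in both modalities.
test = {mod: synth(c, 20, rng) for mod, c in centers["A"].items()}
best, scores = identify(test, models)
print(best, scores)
```

In this sketch `weight_ac` controls how much the AC channel is trusted; lowering it would shift the decision toward the throat-microphone channel, which is the kind of trade-off a noisy AC recording might motivate.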
The publication is licensed under CC BY and is open access. Copyright remains with the author, who is allowed to retain publishing rights without restrictions.