Modernizing "xkl", a legacy software for detailed acoustic analysis of speech
As a researcher in speech analysis, I am currently working to modernize xkl, the software developed by Dennis Klatt at MIT in the 1980s. xkl estimates the temporal and spectral properties of speech segments in fine detail, which is essential for studying speech production and perception. Because it does not support modern computing platforms, however, its use has been limited over the past decades. My objective is to update xkl so that it is compatible with contemporary computing systems and programming languages, making it available to a new generation of researchers. I also plan to apply advanced signal processing techniques and machine learning algorithms to improve the accuracy and precision of spectral estimation in speech segments. The intended result is a user-friendly and robust version of xkl that can be widely used in the research community and support new work in speech analysis.
Statistical Analysis of Internal Landmarks in LaMIT and Automatic Landmark Identification
The goal of this research project is two-fold: first, to conduct a statistical analysis of internal landmarks within the LaMIT framework, and second, to develop a system for automatic landmark identification. The project applies the Lexical Access model conceived by Stevens [Stevens, K. N. (2002). “Toward a model for lexical access based on acoustic landmarks and distinctive features,” J. Acoust. Soc. Am., 111(4):1872–1891] to the Italian language and enhances it with automatic detection capabilities. The statistical analysis aims to provide insights into the universal and language-specific features of acoustic landmarks. The automatic identification system is designed to imitate the process by which a listener derives the words intended by a speaker; this involves building a speech recognizer based on acoustic landmarks and distinctive features. Modeling this process requires hypothesizing how lexical items are stored in memory. Stevens’ model postulates that lexical items are stored in memory as hierarchically organized distinctive features. The model highlights the importance of abrupt acoustic events, named landmarks, in perception: landmark detection is primary in human perception and corresponds to the first phase of recognition, after which the listener further processes the temporal region around each landmark. By applying the Lexical Access model to Italian, the project is expected to uncover universal, language-independent aspects of landmarks and to characterize the features specific to the Italian language.
Additionally, developing an Italian speech recognizer based on the detection of landmarks and acoustic features will establish a significant reference system for the speech community in Europe and beyond, offering access to data and algorithms to support further research and applications in speech perception. The project also aims to determine whether language-independent mechanisms exist in the process described above and to distinguish them from language-dependent ones, under the hypothesis that cues to landmarks may be language-independent. Evidence supporting universal strategies of speech perception would have a significant impact on the field. The acoustical analyses will use the xkl software tool developed by Dennis Klatt at MIT.
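To make the notion of a landmark as an abrupt acoustic event more concrete, the following is a minimal sketch in Python. It is not the project's detector and not part of Stevens' model: it simply flags frames where the short-time energy of a signal jumps by more than a threshold. The function names, the 10 ms frame length, the 9 dB threshold, and the synthetic test signal are all assumptions chosen for illustration. A practical detector would likely work on several frequency bands and on real recordings.

```python
import math

def frame_energy_db(signal, frame_len, hop):
    """Short-time energy in dB for each frame of a list of samples."""
    energies = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = signal[start:start + frame_len]
        power = sum(x * x for x in frame) / frame_len
        # Small floor avoids log10(0) on digital silence.
        energies.append(10.0 * math.log10(power + 1e-10))
    return energies

def detect_abrupt_changes(energies_db, threshold_db=9.0, min_gap=5):
    """Return (frame_index, sign) pairs where energy changes abruptly.

    threshold_db and min_gap are illustrative values, not tuned ones.
    """
    landmarks = []
    last = -min_gap
    for i in range(1, len(energies_db)):
        delta = energies_db[i] - energies_db[i - 1]
        if abs(delta) >= threshold_db and i - last >= min_gap:
            landmarks.append((i, "+" if delta > 0 else "-"))
            last = i
    return landmarks

# Synthetic example (assumed): quiet, loud, quiet segments of a 200 Hz tone.
SAMPLE_RATE = 16000
FRAME = 160  # 10 ms frames, no overlap

def tone(amplitude, n_samples, offset):
    return [amplitude * math.sin(2 * math.pi * 200 * (offset + n) / SAMPLE_RATE)
            for n in range(n_samples)]

signal = tone(0.001, 3200, 0) + tone(0.5, 3200, 3200) + tone(0.001, 3200, 6400)
energies = frame_energy_db(signal, FRAME, FRAME)
landmarks = detect_abrupt_changes(energies)
print(landmarks)
```

On this synthetic signal the sketch reports one rising change at the onset of the loud segment and one falling change at its offset. Real speech would require smoothing and band-specific analysis before such changes could be interpreted as landmarks.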
Accent Identification using Modern Machine Learning and Deep Learning Techniques
The goal of this research project is to develop a system for the automatic identification of foreign accents in speech signals using deep learning techniques. The system aims to improve the robustness and accuracy of Foreign Accent Identification (FAID) by addressing challenges in multi-class modeling and feature selection. Non-native accents are characterized by distinct pronunciation, prosody, and voice characteristics, which makes them difficult to model, particularly in the multi-class setting. Traditional models struggle to achieve high performance and to handle the computational challenges posed by multi-dimensional and unbalanced datasets.