Neuromorphic Speech Processing
General objective: Description of the functional circuits in the brain for speech perception and production. Foundations for advanced voice quality analysis in neurodegenerative disease monitoring.
Detailed description: Design of voice processing tools for the detection and monitoring of organic and neurological pathology in speech: It is well known that many neurological diseases leave a fingerprint in voice and speech production. The dramatic impact of these pathologies on quality of life is a growing concern. Many techniques have been designed for the detection, diagnosis and monitoring of neurological disease, but most of them are costly or difficult to extend to primary care services. Because some neurological diseases can be traced at the level of voice production, detection could be based on a simple voice test. The availability of advanced tools and methodologies to monitor the organic pathology of voice would facilitate the adoption of these tests.
Design of modeling tools for speech perception in higher auditory pathways for phonetic unit representation spaces and phonemic parsing: Current efforts to improve well-established technologies that imitate human abilities, such as speech perception, seek inspiration in capabilities of the natural system that are not yet well understood. A typical case is speech recognition, where bridging the semantic gap from spectral time-frequency representations to the symbolic translation into phonemes and words, and on to the construction of morpho-syntactic and semantic structures, involves many phenomena that are still not well understood.
Based on these premises, the NEUVOX Laboratory has developed several voice analysis tools for organic and neurological disease monitoring (biomet®Phon) and for speech discourse (biomet®Neur), which are currently used in the following fields:
- Organic and neurologic disease monitoring from voice tests
- Biometrical description of the speaker with forensic capability
- Acoustic quality analysis of the singing voice
- Phonation and speech rehabilitation
Several applications have been developed to monitor voice quality in medical, forensic, singing and emotional studies involving phonation. The following picture shows the graphical user interface for medical use:
Graphical User Interface for Advanced Voice Quality Analysis used in Organic and Neurologic Pathology Detection and Monitoring from Phonation.
Advanced Speech Processing Applications for Medical Voice Quality Analysis, such as BioMet®Phon (refer to the above explanations).
The application allows browsing a patient database or producing on-the-fly voice recordings and analyses in a fast, simple three-mouse-click procedure. It estimates up to 68 parameters of interest, including distortion (jitter, shimmer, noise-to-harmonic ratio, mucosal wave ratio), cepstral, spectral and biomechanical parameters (dynamic mass and stiffness of the vocal fold body and cover), open, return and close quotients, contact and permanent gap defects, and tremor in voice. Graphical screens show the temporal profile of the glottal source (the pressure wave developing at the vocal folds, see the right-hand side plots). These parameters are contrasted against a normative database and allow the production of longitudinal studies such as the one shown in the figure below. Specific reports are produced for clinical use.
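As an illustration of two of the distortion parameters named above, the following minimal sketch computes jitter and shimmer with a common textbook definition: the mean absolute difference between consecutive glottal cycles, divided by the mean value. This is not necessarily the estimator used by the application; the function name `relative_perturbation` and the sample values are hypothetical.

```python
from statistics import mean


def relative_perturbation(values):
    """Mean absolute cycle-to-cycle difference divided by the mean value.

    Applied to cycle periods this gives a (local) jitter estimate;
    applied to cycle peak amplitudes it gives a (local) shimmer estimate.
    """
    if len(values) < 2:
        raise ValueError("need at least two cycles")
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    return mean(diffs) / mean(values)


# Hypothetical glottal cycle periods (seconds) and peak amplitudes
# (arbitrary units); these are illustrative, not measured data.
periods = [0.0050, 0.0051, 0.0049, 0.0050, 0.0052]
amplitudes = [1.00, 0.98, 1.02, 0.99, 1.01]

jitter = relative_perturbation(periods)
shimmer = relative_perturbation(amplitudes)
print(f"jitter: {100 * jitter:.2f} %  shimmer: {100 * shimmer:.2f} %")
```

Both values are dimensionless ratios, so they can be compared directly across speakers with different fundamental frequencies or recording levels.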
Specific study case of pre-treatment compared with three post-treatment inspections.
It corresponds to a 65-year-old female patient who suffered from post-Thyroidectomy Vocal Fold Recurrent Paralysis (pTVFRP). The treatment consisted of infiltrating fat from the patient into the vocal fold. The patient’s voice was examined once before the intervention (pre: March) and three times after it (post1: May; post2: September; post3: November), all in 2011. Every parameter presented in the study indicates pathological behavior when it departs from unity (the normative median of the parameter). It may be seen that certain parameters associated with vocal fold pathological behavior, for example Cover Stiffness (43), strongly correct after treatment the deviations observed in the pre examination.
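The normalization described above, where unity corresponds to the normative median, can be sketched as follows. The parameter names, normative medians and measurements are made-up placeholders (they are not the patient's data), and the helper names `normalize` and `deviation_from_unity` are hypothetical.

```python
def normalize(measured, normative_median):
    """Express each parameter as a ratio to its normative median (1.0 = normative)."""
    return {name: measured[name] / normative_median[name] for name in measured}


def deviation_from_unity(ratios):
    """Absolute distance of each normalized parameter from unity."""
    return {name: abs(ratio - 1.0) for name, ratio in ratios.items()}


# Placeholder values in arbitrary units, for illustration only.
normative_median = {"cover_stiffness": 1.5, "jitter": 0.010}
pre = {"cover_stiffness": 3.0, "jitter": 0.020}
post3 = {"cover_stiffness": 1.8, "jitter": 0.011}

for label, exam in (("pre", pre), ("post3", post3)):
    deviations = deviation_from_unity(normalize(exam, normative_median))
    print(label, {name: round(d, 2) for name, d in deviations.items()})
```

A parameter whose deviation shrinks from one examination to the next is moving back toward the normative median, which is how the longitudinal comparison in the study is read.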
Another field of application is the monitoring of neurological disease from phonation and speech, as in the following template, where measurements on the vowel triangle are used to assess the progression of Alzheimer’s disease and amyotrophic lateral sclerosis (ALS).
Specific longitudinal study case of ALS showing the indices of vowel triangle shrinking and asymmetric swing of the patient as disease progresses.
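The source does not state how the vowel triangle indices are computed. One simple way to quantify shrinking, sketched here under that caveat, is to take the area of the triangle spanned by the corner vowels /a/, /i/ and /u/ in the plane of the first two formants (F1, F2) and compare it with a baseline session. The formant values and the function name `triangle_area` are hypothetical.

```python
def triangle_area(p1, p2, p3):
    """Area of the triangle with vertices p1, p2, p3 (shoelace formula)."""
    (x1, y1), (x2, y2), (x3, y3) = p1, p2, p3
    return 0.5 * abs(x1 * (y2 - y3) + x2 * (y3 - y1) + x3 * (y1 - y2))


# Hypothetical (F1, F2) formant pairs in Hz for the corner vowels,
# for a baseline session and a later session; illustrative values only.
baseline = {"a": (750.0, 1300.0), "i": (300.0, 2300.0), "u": (320.0, 800.0)}
follow_up = {"a": (650.0, 1350.0), "i": (350.0, 2000.0), "u": (360.0, 950.0)}

area_baseline = triangle_area(baseline["a"], baseline["i"], baseline["u"])
area_follow_up = triangle_area(follow_up["a"], follow_up["i"], follow_up["u"])

# A ratio below 1.0 means the triangle has shrunk relative to the baseline.
shrink_ratio = area_follow_up / area_baseline
print(f"vowel triangle area ratio: {shrink_ratio:.2f}")
```

Using a ratio to the patient's own baseline keeps the index independent of the speaker's absolute formant range.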
- Neuromorphic Speech Processing Laboratory
- Cognitive and Computational Neuroscience
- (“Robert W. Newcomb” Oral Communication Lab. FI-UPM)
Contact: Pedro Gómez Vilda