Virtual Assistants and Dysarthria

Active: 2018-2019

The use of voice-based virtual assistants (e.g., Siri or Google Assistant) is growing. Their spread was made possible by the increasing capabilities of natural language processing, and it generally has a positive impact on device accessibility, e.g., for people with disabilities. However, people with dysarthria or other speech impairments may be unable to use these virtual assistants proficiently, whether on their smartphones or in their smart homes. Is this true? To what extent can people with dysarthria be understood by these virtual assistants and get consistent answers from them?

We answered these questions by analyzing how the most common virtual assistants behave when faced with dysarthric speech in two different languages, English and Italian, adopting a separate methodology for each language.

"Hey Siri, do you understand me?": English Edition

For the English language, we investigated to what extent people with dysarthria can use and be understood by the three most common virtual assistants: Siri, Google Assistant, and Amazon Alexa. Starting from a set of suitable sentences in the TORGO database of dysarthric articulation, we analyzed and discussed the differences between these assistants. Preliminary results show that the three virtual assistants have comparable performance, with a recognition accuracy in the range of 50-60%. We noticed, however, that the three assistants take different approaches to replying to the user. Siri always tries to answer a request, even if it does not recognize any word, whereas Amazon Alexa and Google Assistant provide a response only if they recognize at least some words.

More details can be found in this paper:

Ballati Fabio, Corno Fulvio, De Russis Luigi. 2018. "Hey Siri, do you understand me?": Virtual Assistants and Dysarthria. In Intelligent Environments 2018: Workshop Proceedings of the 14th International Conference on Intelligent Environments, pp. 557-566. DOI: http://doi.org/10.3233/978-1-61499-874-7-557

Assessing Virtual Assistant Capabilities with Italian Dysarthric Speech

For the Italian language, we investigated to what extent people with ALS-induced dysarthria can be understood by, and get consistent answers from, three widely used smartphone-based assistants: Siri, Google Assistant, and Cortana. We did not consider Alexa in this case, since it is more commonly used in a smart home scenario.
In particular, we focused on the recognition of Italian dysarthric speech, to study the behavior of the virtual assistants with this specific population, for which no relevant studies were available. We collected and recorded suitable speech samples from people with dysarthria in a dedicated center of the Molinette hospital in Turin, Italy. Starting from those recordings, we investigated and discussed the differences between the assistants in terms of speech recognition and consistency of their answers. The results highlight different performance among the virtual assistants. For speech recognition, Google Assistant is the most promising, with a word error rate of around 25% per sentence. As for consistency, Siri and Google Assistant provide coherent answers around 60% of the time.
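As a point of reference for the word error rate mentioned above, the following is a minimal sketch of the standard definition of that metric: the word-level edit distance (substitutions, deletions, and insertions) between the reference sentence and the recognized transcript, divided by the number of words in the reference. It is not taken from the paper, which may have normalized or scored the transcripts differently; the function name `word_error_rate` and the Italian sample sentences are hypothetical and used for illustration only.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance between the two sentences, divided by
    the number of words in the reference (standard WER definition).

    Assumption: words are lowercased and split on whitespace; no other
    normalization (punctuation, numerals) is applied.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j words of the hypothesis.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion of a reference word
                curr[j - 1] + 1,                  # insertion of a hypothesis word
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[len(hyp)] / len(ref)


# Hypothetical example: what the speaker said vs. what an assistant transcribed.
spoken = "accendi la luce in cucina"
transcribed = "accendi luce in cantina"
print(f"WER: {word_error_rate(spoken, transcribed):.0%}")
```

In this example the transcript drops one word and substitutes another, so two of the five reference words are in error. Averaging such per-sentence values over all recordings would give a figure comparable in kind to the one reported above, under the stated assumptions.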

Check out the following paper for additional details:

Ballati Fabio, Corno Fulvio, De Russis Luigi. 2018. Assessing Virtual Assistant Capabilities with Italian Dysarthric Speech. In Proceedings of the 20th International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS '18), ISBN: 978-1-4503-5650-3. DOI: http://doi.org/10.1145/3234695.3236354

Publications

Ballati Fabio, Corno Fulvio, De Russis Luigi. 2018. Assessing Virtual Assistant Capabilities with Italian Dysarthric Speech. In Proceedings of the 20th International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS '18), ISBN: 978-1-4503-5650-3. DOI: http://doi.org/10.1145/3234695.3236354
Ballati Fabio, Corno Fulvio, De Russis Luigi. 2018. "Hey Siri, do you understand me?": Virtual Assistants and Dysarthria. In Intelligent Environments 2018: Workshop Proceedings of the 14th International Conference on Intelligent Environments, pp. 557-566. DOI: http://doi.org/10.3233/978-1-61499-874-7-557