Frederik Gladhorn

Qt Speech (Text to Speech) is here

Published Friday January 20th, 2017
Posted in Accessibility, Dev Loop, News, Qt, Releases

I’m happy that with Qt 5.8.0 we’ll have Qt Speech added as a new tech preview module. It took a while to get it in shape since the poor thing sometimes did not get the attention it deserved. We had trouble with some Android builds before that backend received proper care. Luckily there’s always the great Qt community to help out.
Screenshot: example application showing Qt Text to Speech
What’s in the package? Text to speech, and that’s about it. The module is rather small: it abstracts away the different platform backends to let you (or rather your apps) say smart things. In the screenshot you can see that speech-dispatcher on Linux does not care much about gender. That is platform dependent, and we do our best to give you access to the different voices and information about them.

Making the common things as simple as possible with a clear API is the prime goal. How simple?
By default, you just create an instance of QTextToSpeech, connect a signal or two, and call say(). Optionally, you can first select an engine (some platforms have several) and set a locale and voice.

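// Create the speech object using the platform's default engine.
m_speech = new QTextToSpeech(this);
// Get notified whenever the engine's state changes.
connect(m_speech, &QTextToSpeech::stateChanged,
        this, &Window::stateChanged);
// Speak the text asynchronously.
m_speech->say("Hello World!");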

And that’s about it. Give it a spin. It’s a tech preview, so if there’s feedback that the API is incomplete or we got it all wrong, let us know so we can fix it ASAP. Here’s the documentation.

I’d like to thank everyone who contributed: Maurice and Richard at The Qt Company, and especially our community contributors who were there from day one. Thanks to Jeremy Whiting for general improvements and encouragement along the way (I bet he almost gave up on this project). Samuel Nevala and Michael Dippold implemented most of the Android backend, finally getting things into shape. Thanks!

You may wonder about the future development of Qt Speech. There are some small issues in the Text to Speech part (saving voices and the current state in general should be easier, for example). When it comes to speech recognition, I realized that it is a big project which deserves proper focus. There are many exciting things that can and should be done, but currently I feel we need proper research into all the different options: from a simple command/picker mode to dictation, and from offline options and native backends to the various cloud APIs. I’d rather end up with a good API that can wrap all of these in a way that makes sense (so probably a bunch of classes, but I don’t have a clear picture in my mind yet) than rush it. I hope we’ll get there eventually, because this area is certainly important and becoming more and more relevant, but I assume it will take some time until we have completed the offering.

14 comments

Matheus Catarino says:

Excellent news, as it opens up the possibility of new, fully open source voice synthesizers. This is another triumphant advance for Qt.

As a person with a physical disability, I am pleased by the importance given to accessibility in this great project.

Congratulations!

Frederik Gladhorn says:

Thanks 😀

Nikita Skovoroda says:

Will there be a built-in QML API?

Frederik Gladhorn says:

Hi Nikita,
I think it would be relatively easy to expose the classes to QML. It’s certainly something we discussed, but we decided to first get things in shape with C++.
So yes, I could imagine adding a QML API, but currently it is not the highest priority for me. I’m very much looking forward to hearing whether people use the module. If many people ask for QML APIs, we’ll make sure to add them.
Cheers,
Frederik

Ivan Azarnyi says:

I agree with Nikita.
I’d like to see this module as a QML widget, because it would be more useful on mobile platforms.
But desktop support is more important in my opinion. This module allows developers to create more apps for people with disabilities and add support to existing applications.
Thank you for this feature.
Waiting for QML version.

Frederik Gladhorn says:

Hi Ivan, thanks for the feedback 🙂 I’m by no means opposed to adding the QML bits and I think it’s very feasible. I do think it’s good to get the module out and see if the C++ side makes sense and then we can go for the next round of improvements/additions.

Luciano says:

Which backend is this using on Linux?

Frederik Gladhorn says:

@Luciano: currently it’s using speech-dispatcher on Linux. That comes with a few problems, but since speech-dispatcher development has been going forward a bit, I’m hopeful that it will continue to work. It would be quite easy to add other backends (espeak would be an option, we have some code for flite).

Robin Lobel says:

Is it technically possible (in a later revision) to output the speech as audio buffer data rather than playing it directly on the speakers?

Frederik Gladhorn says:

@Robin: I was pondering the grabbing of the audio buffer as well. Currently we chose to go the easy route that will simply work. The problem is that the backends on the platforms are so different that this may become a maintenance nightmare. I don’t think it would be good to support it only on some platforms. What would be your use case?

Robin Lobel says:

No specific use case for now, just thinking ahead. It might be useful to be able to access the raw audio data, optionally apply some post-processing to it, and output it through your own audio playback engine. For instance, an audio application may already have a running audio loop.

Maurice Kalinowski says:

This also depends on whether you have access to the raw buffer. It’s been a while since I worked on the WinRT backend, but there you did not have a chance to acquire it. So it would need to be a platform-specific setting.

Volker H. says:

Great work, congratulations!
Two questions:
We are wondering if Qt Speech can be used to build a commercial application?
Are you providing phoneme sequences to drive animation?
Kind regards,
Volker

Frederik Gladhorn says:

@Volker yes, you are free to use Qt Speech in a commercial application. Please be aware that we don’t promise API compatibility until it’s out of tech preview, so if you later update your Qt version, you may have to do some porting.
Because we wrap the platform APIs, we don’t provide any updates during the output at the moment. It could become an optional feature on the platforms/synthesizers that support this type of feature.
Currently Qt Speech doesn’t try to be the final answer to all TTS needs but just allows the very common use cases. It’s great to hear the feedback and requests from everyone, thanks!
