Anyone know of any *working* Linux speech recognition software (dictation, not mouse control)???

Something that doesn't require being online and sending all my voice to the cloud maybe???

The situation looks dismal:

unix.stackexchange.com/questio

Best solution so far looks like VMware + a Windows license + a Dragon license, but it's impossible to fathom doing that.

My hands are in pain, all the time. Yes I'm seeing a doctor about it. But I need solutions *now* damn it.

@wohali This solution may not be any more palatable, but I found the voice control and dictation on the Mac (using a good microphone setup) was livable when I was having hand problems.

@gedvondur Thanks. I'm aware of Mac and Windows based solutions. Those don't help me on my laptop (without a VM, and possible license violation in the case of macOS.)

@gedvondur no, it's a good suggestion. an MSFT employee encouraged me to do a VM with Win10 in it.

I need to come up with something, soon. My hands are in a lot of pain.

@wohali @gedvondur on that note I just installed dictation on my Mac and it's not all that good

@wohali @gedvondur it works better to have a nice clean microphone up close but I'll attend to him and too muchit works better to have a nice clean microphone up close but I'll attend to him and too much and mastodon keeps stumbling text for some reason and mastodon keeps stumbling text for some reason

@fluffy @wohali Well, it's not as good as Dragon, no. Nothing is.

I've also found the quality of your microphone makes a great deal of difference. And of course the environmental noise does too.

@gedvondur @wohali I was using a professional microphone in a quiet room connected to my studio audio interface. If that microphone can't handle it, nothing can

@fluffy @wohali I also found that you have to adapt to it. Compose the sentence in your head, then say it, no stumbling, no going back. It takes some time. But dictation on the Mac is better than nothing.

@gedvondur @wohali that long rambly-looking spew is literally me saying something carefully, slowly, and deliberately with no mess-ups, but for whatever reason it inserted three different versions with different levels of correction and this was even in Safari. how can I expect anything else to work right?

and of course this is useless for writing code

@usul Still requires an online connection to work. Their offline version is stalled out with big problems :(

It's the most promising, I agree, though the amount of HW you need to run it locally is astounding...

@wohali @usul there's no online version, and I don't know about the stalled offline issues... Care to be more explicit?

@usul @376b78fc7223 sorry, I've been sick. I think I confused deepspeech with mycroft's packaging.

Is it true that it can only do ~5s of audio at a time? Is there any useful control app for this or do I have to roll my own?

@wohali @usul Can you be more specific about what you mean by "any useful control app"? Regarding the 5s, it's an old limitation; it might not be the case anymore...

@wohali @usul What are you trying to achieve exactly? DeepSpeech provides an API available in C and other languages, as well as models (English only for now) and training tooling. It only targets speech to text.

@376b78fc7223 @usul See the top of this thread. I have RSI issues and need a dictation app for regular use, offline, on my laptop.

I don't want to write a program to make this happen; I haven't the energy.

DeepSpeech may simply be too low level for my needs today.

@wohali @usul Sorry, I have no idea what RSI stands for. Yes, maybe DeepSpeech is too low level for you...

@376b78fc7223 @usul Repetitive Stress injury. Tendonitis. Carpal Tunnel Syndrome.

Problems using my hands.

@376b78fc7223 @usul Sorry :) not being aggressive. Just overexplaining :)

Thanks for the help, it's really appreciated!

@wohali @usul I guess getting something that works in your case requires writing code that plugs into the accessibility APIs of the desktop environment. If you know people who want to contribute that with DeepSpeech, don't hesitate... Also, our English model might not be perfect yet, especially with non-American accents.

@wohali I think Mycroft AI might have something for the speech recognition part.

(I just looked at their two great speech synthesisers)

@lanodan Still online, still mandatory connection to their service to use. :(

@wohali I thought Dragon had a Linux version a long time ago but maybe not

@feld yeah it's been a LONG time and it's not available for purchase anymore

@wohali Kaldi is not bad for many users, though it doesn't have the training depth of something like Dragon, so you need to train it quite a bit more.

Mozilla's DeepSpeech/pipsqueak is also looking promising: github.com/mozilla/DeepSpeech

@darrenpmeyer I'll look at Kaldi, it looks like I'll need to train it a lot.

DeepSpeech still requires an online connection, they haven't gotten an offline install working yet, and it's very computationally intensive. Have my eye on it :)

@wohali yeah, I get why they're online for now but I'm definitely watching pipsqueak… I hope Kaldi works for you!
