[clug] Speech-to-text on Ubuntu
Kathy Reid
kathy at kathyreid.id.au
Fri Jan 7 00:18:20 UTC 2022
OK, that's super useful.
Some guidance here:
- PyAudio is for recording and playing audio; it has no in-built STT
capabilities
- CMU Sphinx is no longer supported and is difficult to get running
- Kaldi is better supported, but again difficult to get running, and
requires a bunch of setup scripts - it's not "sudo apt install kaldi".
- If you are comfortable in Python, DeepSpeech has pre-trained models,
but is no longer supported.
- The new kid on the block is Coqui - pre-trained models available
- None of these provides an easy-to-use interface - you will need
some sort of pipeline for recording audio, segmenting it into 5-15
second chunks, and then running those chunks through an STT engine.
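
As a rough sketch of the segmenting step in that pipeline, assuming the
recording is already an uncompressed WAV file: the snippet below uses only
Python's standard-library wave module to cut a file into fixed-length chunks.
The names split_wav, transcribe_chunks and the transcribe callback are made up
for illustration; transcribe stands in for whichever STT engine you end up
using and is not any engine's real API. Cutting on a fixed length can split a
word in half, so a real pipeline would probably cut at pauses instead.

```python
import wave


def split_wav(src_path, out_prefix, chunk_seconds=10):
    """Split a WAV file into fixed-length chunks and return the chunk paths."""
    paths = []
    with wave.open(src_path, "rb") as src:
        params = src.getparams()
        frames_per_chunk = int(src.getframerate() * chunk_seconds)
        index = 0
        while True:
            frames = src.readframes(frames_per_chunk)
            if not frames:
                break  # end of the recording
            path = f"{out_prefix}{index:03d}.wav"
            with wave.open(path, "wb") as dst:
                # Copy channels, sample width and rate from the source;
                # the frame count in the header is corrected on close.
                dst.setparams(params)
                dst.writeframes(frames)
            paths.append(path)
            index += 1
    return paths


def transcribe_chunks(paths, transcribe):
    """Run a caller-supplied transcribe(path) -> str over each chunk."""
    return " ".join(transcribe(path) for path in paths)
```

Usage would be something like split_wav("dictation.wav", "chunk_") followed by
transcribe_chunks(paths, my_engine), where my_engine is your own wrapper
around the STT engine.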
More broadly, open source speech to text / automatic speech recognition
is challenging at the moment - many projects are abandoned, and none
provides a helpful interface where you can just drop in a recording
and get a transcript back. You can expect maybe 90-92% word accuracy
(that is, a word error rate of roughly 8-10%) from the above (they don't
handle Australian accents well), so expect to spend a fair bit of time
correcting the transcripts that are generated. Setting this up will
take some effort.
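
For anyone who wants to measure this on their own recordings: word error rate
is the word-level edit distance (substitutions, deletions and insertions)
between a hand-corrected reference transcript and the engine's output, divided
by the number of words in the reference, so lower is better. Here is a minimal
sketch; the function name word_error_rate is my own, and it only lowercases
and splits on whitespace, so punctuation is not normalised.

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")
    # Levenshtein distance over words, keeping only the previous row.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        cur = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            cur.append(min(prev[j] + 1,          # deletion
                           cur[j - 1] + 1,       # insertion
                           prev[j - 1] + cost))  # substitution or match
        prev = cur
    return prev[-1] / len(ref)
```

For example, if the reference is "when fencing use a forked straining yoke"
and the engine returns "when fencing use a fort training yoke", two of the
seven words are wrong, giving a word error rate of 2/7, or about 29%.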
Best, Kathy
On 7/1/22 11:02 am, jhock at iinet.net.au wrote:
> I want to say many things and I want what I say to be converted into text sentences. I will then edit it in a text editor or LibreOffice. For example:
>
> "When fencing, it's best to use a forked straining yoke to prevent small indentations on the wire that would be caused by a mechanical, gripping fence strainer."
>
> I'm less likely to do the typing because it's more work for me. Hence the speech to text enquiry. :-)
>
> On 7 January 2022 8:41:43 am AEDT, Kathy Reid via linux <linux at lists.samba.org> wrote:
>> What's the use case - as different packages have different strengths and
>> weaknesses?
>>
>> Best, Kathy
>>
>> On 6/1/22 5:17 pm, jhock--- via linux wrote:
>>> Does anyone know any speech-to-text software that I can install onto Ubuntu 20.04 using 'sudo apt install' or similar commands?
>>>
>>> I've seen the Python package python3-pyaudio. Is that the best one to try?
>>>