non-latin text is not aligned #1046

beviah · 2023-04-27T22:48:17Z

For example: Arabic, Russian, etc.

The text field in the response contains a valid non-Latin transcript, yet there are no alignments.

pzelasko · 2023-04-27T22:58:56Z

Can you provide more context?

beviah · 2023-04-27T23:12:42Z

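from lhotse import CutSet, RecordingSet, align_with_torchaudio, annotate_with_whisper

# fld is the directory containing the .wav recordings
recordings = RecordingSet.from_dir(fld, pattern="*.wav")
# Transcribe with Whisper, forcing Arabic as the language
cuts = annotate_with_whisper(recordings, device='cuda', language='ar')
# Force-align the transcripts using the default arguments
cuts_aligned = align_with_torchaudio(cuts)
for cut in cuts_aligned:
    for sup in cut.supervisions:
        print(sup.text, sup.alignment)  # text is present, alignment is empty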

Word alignments are always empty for text in non-Latin scripts, as if no text had been detected.

pzelasko · 2023-04-27T23:26:33Z

You’d need an ASR model that supports your target language. You can check if there’s one available in torchaudio: https://pytorch.org/audio/stable/pipelines.html

Otherwise you’d need to train or fine-tune your own. You can also try word alignments from faster-whisper (not merged yet, see #1017).
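To illustrate why an English-only model leaves nothing to align, here is a minimal sketch. It assumes the aligner discards every character that is not in the acoustic model's label set before aligning; that is an assumption about the mechanism, not a copy of the lhotse code. `LABELS` is the character set that torchaudio documents for its English `WAV2VEC2_ASR_BASE_960H` bundle, and `alignable_words` is a hypothetical helper written only for this example.

```python
# Labels of torchaudio's English WAV2VEC2_ASR_BASE_960H bundle:
# "-" is the CTC blank and "|" is the word separator.
LABELS = (
    "-", "|", "E", "T", "A", "O", "N", "I", "H", "S", "R", "D", "L", "U",
    "M", "W", "C", "F", "G", "Y", "P", "B", "V", "K", "'", "X", "J", "Q", "Z",
)


def alignable_words(text, labels=LABELS):
    """Hypothetical helper: return the words that survive after dropping
    every character the model has no label for."""
    allowed = set(labels) - {"-", "|"}  # keep only real characters
    words = []
    for word in text.upper().split():
        kept = "".join(ch for ch in word if ch in allowed)
        if kept:  # a word with no supported characters disappears entirely
            words.append(kept)
    return words


print(alignable_words("hello world"))    # Latin text survives
print(alignable_words("привет мир"))     # Cyrillic: nothing left to align
print(alignable_words("مرحبا بالعالم"))  # Arabic: nothing left to align
```

Under that assumption, Russian or Arabic transcripts end up with no characters the English model can align, which would match the empty alignments reported above. A model whose label set covers the target script avoids the problem.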

beviah · 2023-04-28T00:01:05Z

Awesome! It works, thanks!

beviah closed this as completed Apr 28, 2023
