wavlm

Here are 12 public repositories matching this topic...

yl4579 / StyleTTS2

StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models

text-to-speech deep-learning pytorch tts speech-synthesis gan speaker-adaptation adversarial-training diffusion-models wavlm latent-diffusion latent-diffusion-models

Updated Aug 10, 2024
Python

s3prl / s3prl

Self-Supervised Speech Pre-training and Representation Learning Toolkit

Updated Nov 16, 2024
Python

wenet-e2e / wespeaker

Research and Production Oriented Speaker Verification, Recognition and Diarization Toolkit

Updated Nov 14, 2024
Python

mjhydri / Singing-Vocal-Beat-Tracking

This repo contains the source code of the first deep learning-based singing voice beat tracking system. It leverages WavLM and DistilHuBERT pre-trained speech models to create vocal embeddings and trains linear multi-head self-attention layers on top of them to extract vocal beat activations. Then, it uses an HMM decoder to infer singing beats and t…

music music-information-retrieval beat-tracking self-supervised singing-voice hubert linear-transformer wavlm

Updated Sep 4, 2022
Python

lucadellalib / discrete-wavlm-codec

A neural speech codec based on discrete WavLM representations

clustering pytorch speech-synthesis codec k-means quantization self-supervised-learning hifi-gan wavlm token-extraction neural-speech-coding

Updated Aug 28, 2024
Python

alessandropec / data_driven_ai_voice_cloning

This repository contains the code for the main part of my master's thesis in Data Science & Engineering at Politecnico di Torino

machine-learning text-to-speech ai deep-learning speaker-verification zero-shot-learning speaker-embeddings voice-cloning tacotron2 fastspeech2 ecapa-tdnn wavlm generative-ai

Updated Mar 5, 2023
Python

Sarasadeghii / Sharif-WavLM

In this repository, the WavLM model is applied to high-quality and poor-quality data for a speaker verification task, and the PyCM library is used for evaluation.

confusion-matrix speaker-verification farsi-datasets wavlm pycm

Updated May 27, 2023
Jupyter Notebook

zhu00121 / Universal-representation-dynamics-of-deepfake-speech

This repo contains code used in the paper "Characterizing the temporal dynamics of universal speech representations for generalizable deepfake detection"

self-supervised deepfake-detection wav2vec2 wavlm modulation-transformation