Vol Building AGI – Telegram

Vol Building AGI

580 subscribers

116 photos

9 videos

12 files

199 links

Past topics: speech synthesis, transformers, LSTM, recurrence


Vol Building AGI

How to augment speech content (likely usable as recognition augmentations too)

35 views 09:01

Vol Building AGI

signal processing revealed

34 views 09:03

Vol Building AGI

VTLP (vocal tract length perturbation) code: https://github.com/biggytruck/SpeechSplit2/blob/0911c09732e0e935c7c0a7aaf23eb2923d9889d8/utils.py#L252-L276

From utils.py in biggytruck/SpeechSplit2, the official implementation of SpeechSplit2.
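
For reference, a minimal sketch of the piecewise-linear frequency warp usually described for VTLP. This is not the code from the linked file: the function name `vtlp_warp`, the 16 kHz sample rate and the `f_hi=4800` boundary are assumptions picked for illustration. Frequencies below a knee are scaled by `alpha`; the rest are mapped linearly so that the Nyquist frequency stays where it is.

```python
def vtlp_warp(freq_hz, alpha, sample_rate=16000, f_hi=4800.0):
    """Map one frequency (Hz) through a piecewise-linear VTLP-style warp.

    alpha is the warp factor (values near 1.0, e.g. 0.9 to 1.1, are typical).
    f_hi is an assumed boundary frequency and must be below Nyquist.
    """
    nyquist = sample_rate / 2.0
    knee_out = f_hi * min(alpha, 1.0)   # warped frequency at the knee
    knee_in = knee_out / alpha          # input frequency where the knee sits
    if freq_hz <= knee_in:
        # Lower band: plain scaling by alpha.
        return alpha * freq_hz
    # Upper band: straight line from the knee up to (nyquist, nyquist).
    slope = (nyquist - knee_out) / (nyquist - knee_in)
    return nyquist - slope * (nyquist - freq_hz)


if __name__ == "__main__":
    # Warp a few filterbank centre frequencies with two different factors.
    for alpha in (0.9, 1.1):
        warped = [round(vtlp_warp(f, alpha), 1) for f in (500.0, 2000.0, 6000.0)]
        print(alpha, warped)
```

In practice the warp is applied to filterbank centre frequencies (or to spectrogram bins) with a fresh `alpha` drawn per utterance.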

34 views 09:09

Vol Building AGI

New SOTA in TTS from Microsoft Research Asia: NaturalSpeech (not an ICASSP paper)

Trained on LJSpeech (24 hours, 13,100 utterances). Also uses 200M text sentences for phoneme encoder pretraining, plus a g2p model. 8 V100 GPUs, 3000 epochs.

https://speechresearch.github.io/naturalspeech/

33 views 11:24

Vol Building AGI

In the meantime, all Interspeech 2021 videos have been made available: https://www.superlectures.com/interspeech2021/tutorials

https://www.youtube.com/channel/UC2-z0HD4WpSbJONj73BgfwQ/videos

33 views, edited 14:23

Vol Building AGI

https://www.youtube.com/watch?v=-p_awLZWLeI

https://github.com/facebookresearch/vocoder-benchmark

VocBench from Facebook

Autoregressive vocoders: WaveNet, WaveRNN
GANs: Parallel WaveGAN, MelGAN
Diffusion: WaveGrad, DiffWave

All in one place, with a common input-output interface and a modern codebase.

Might be useful for VC (voice conversion) if it's easy to condition those vocoders on custom features.

36 views, edited 14:53

Vol Building AGI

Neural HMM: learns alignments fast

https://shivammehta007.github.io/Neural-HMM/

Promises to converge with 500 utterances. I couldn't get it to work with that amount of data, but I think it should with 2k utterances.
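
Why alignments come cheap here: with a left-to-right, no-skip HMM the likelihood sums over every monotonic alignment via the forward algorithm, so no attention has to be learned. Below is a toy sketch of that forward pass in the log domain. It assumes fixed lookup tables for the emission and transition log-probabilities (in Neural HMM these come from a network and depend on previous frames), and all names are made up for illustration.

```python
import math


def logsumexp2(a, b):
    """Numerically stable log(exp(a) + exp(b)), tolerating -inf inputs."""
    if a == -math.inf:
        return b
    if b == -math.inf:
        return a
    m = max(a, b)
    return m + math.log(math.exp(a - m) + math.exp(b - m))


def left_to_right_log_likelihood(log_emit, log_stay, log_move):
    """Forward algorithm for a left-to-right HMM without skips.

    log_emit[t][s]: log p(frame t | state s)
    log_stay[s], log_move[s]: log-probability of staying in / leaving state s
    The path starts in state 0 and must leave the last state after the
    final frame.
    """
    n_frames, n_states = len(log_emit), len(log_stay)
    alpha = [-math.inf] * n_states
    alpha[0] = log_emit[0][0]
    for t in range(1, n_frames):
        new_alpha = [-math.inf] * n_states
        for s in range(n_states):
            stay = alpha[s] + log_stay[s]
            move = alpha[s - 1] + log_move[s - 1] if s > 0 else -math.inf
            new_alpha[s] = logsumexp2(stay, move) + log_emit[t][s]
        alpha = new_alpha
    return alpha[-1] + log_move[-1]


if __name__ == "__main__":
    half = math.log(0.5)
    # 2 states, 3 frames, every emission has probability 1.
    emissions = [[0.0, 0.0] for _ in range(3)]
    print(left_to_right_log_likelihood(emissions, [half, half], [half, half]))
```

Training maximizes this quantity directly, which is a sum over alignments rather than a search for one.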

36 views 15:20

Vol Building AGI

https://github.com/mindslab-ai/assem-vc

Official code for Assem-VC (ICASSP 2022). GitHub now shows the repository as maum-ai/assem-vc.

36 views 18:31

Vol Building AGI

Prosody annotations for Switchboard: https://groups.inf.ed.ac.uk/switchboard/index.html

49 views 03:17

Vol Building AGI

Neural Text to Speech Synthesis Tutorial

https://github.com/tts-tutorial/icassp2022

Survey paper: https://arxiv.org/abs/2106.15561

35 views 04:22

Vol Building AGI

Convolutional Pitch Tracker (ICASSP 2018)

https://marl.github.io/crepe/

PyTorch port with lots of usage details: https://github.com/maxrmorrison/torchcrepe

31 views, edited 16:05