jishengpeng/WavTokenizer
SOTA discrete acoustic codec models with 40 tokens per second for audio language modeling
Language: Python
#acoustic #audio_representation #codec #dac #encodec #gpt4o #music_representation_learning #semantic #soundstream #speech_language_model #speech_representation #text_to_speech
Stars: 332 Issues: 6 Forks: 20
https://github.com/jishengpeng/WavTokenizer
ictnlp/LLaMA-Omni
LLaMA-Omni is a low-latency and high-quality end-to-end speech interaction model built upon Llama-3.1-8B-Instruct, aiming to achieve speech capabilities at the GPT-4o level.
Language: Python
#large_language_models #multimodal_large_language_models #speech_interaction #speech_language_model #speech_to_speech #speech_to_text
Stars: 274 Issues: 1 Forks: 16
https://github.com/ictnlp/LLaMA-Omni