Textless NLP: Generating expressive speech from raw audio
Generative Spoken Language Model (GSLM) is the first high-performance NLP model that breaks free of the dependence on text. It leverages recent breakthroughs in representation learning to work directly from raw audio signals, without any labels or text.
Link
#deeplearning #nlp #selfsupervised #facebook
@pythonicAi
Meta
We’re introducing GSLM, the first language model that completely breaks free of the dependence on text for training. This “textless NLP” approach learns to generate expressive speech using only raw audio recordings as input.