ComfyUI-DramaBox now supports LoRAs, and Voice-Clone-Studio-DramaBox can generate them.
Hey guys, a couple of days ago u/manmaynakhashi released DramaBox, a really cool TTS model based on LTX.
I made a **ComfyUI node** for it, and today I've added LoRA support.
Some of you might be familiar with my TTS tool, Voice-Clone-Studio.
I made a stripped-down version called **Voice-Clone-Studio-DramaBox**, specifically for DramaBox, both for using it as a TTS and for generating LoRAs.
I've stripped out most of the models, only keeping Qwen-TTS for its Voice Design option. This makes it a bit more focused and easier to install.
In it you will find a Prep Sample tab that allows you to generate complete datasets from one long audio clip: it cuts the clip down by phrases and auto-transcribes each piece.
https://preview.redd.it/gpqqywzkol1h1.png?width=1901&format=png&auto=webp&s=a7418431c0ba0ff1399fdd13585ee4b02cb119a3
I've had better success with 10 clips than with 80, using clips ranging from 5 to 10 seconds.
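To give a rough idea of what cutting a long clip into 5 to 10 second pieces by phrase can look like, here is a minimal Python sketch. It is not the code from Voice-Clone-Studio-DramaBox. It assumes you already have phrase timestamps from some transcriber, and the names `Phrase`, `Clip`, and `group_phrases` are made up for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Phrase:
    start: float  # seconds from the start of the long clip
    end: float
    text: str


@dataclass
class Clip:
    start: float
    end: float
    text: str


def group_phrases(phrases: List[Phrase],
                  min_len: float = 5.0,
                  max_len: float = 10.0) -> List[Clip]:
    """Greedily merge consecutive phrases into clips of min_len..max_len seconds.

    Hypothetical helper: groups that end up too short or too long are dropped.
    """
    clips: List[Clip] = []
    current: List[Phrase] = []

    def flush() -> None:
        # Keep the accumulated group only if its duration is in range.
        if current:
            duration = current[-1].end - current[0].start
            if min_len <= duration <= max_len:
                text = " ".join(p.text for p in current)
                clips.append(Clip(current[0].start, current[-1].end, text))
        current.clear()

    for phrase in phrases:
        # Close the group if adding this phrase would exceed max_len.
        if current and phrase.end - current[0].start > max_len:
            flush()
        current.append(phrase)
    flush()
    return clips


if __name__ == "__main__":
    demo = [
        Phrase(0.0, 3.0, "Hello there."),
        Phrase(3.2, 6.0, "This is a test."),
        Phrase(6.5, 11.0, "It keeps going for a while."),
    ]
    for clip in group_phrases(demo):
        print(f"{clip.start:.1f}-{clip.end:.1f}s: {clip.text}")
```

Each resulting clip carries its own transcript, so writing the audio slice and the text side by side gives you one dataset entry per clip.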
As DramaBox is VERY prone to hallucination, I'm not adding it to Voice-Clone-Studio. It serves a different use case. This is much more experimental 🤣
https://redd.it/1tfbjfo
@rStableDiffusion
LTX 2.3 is now supported in Comfyui-Mesh for splitting models across Ethernet or multi-GPU machines with the NVENC codec. Major VRAM fixes are included for the flux2/LTX model implementations in the node.
https://redd.it/1tfcj56
@rStableDiffusion