Data Science by ODS.ai 🦜

🎙🎶 Improved audio generative model from OpenAI

Wow! OpenAI just released Jukebox, a neural net and service that generates music from a genre, an artist name, and some lyrics that you supply. It can even generate singing, although it sounds like it was played from a damaged magnetic compact cassette.

Some of the samples sound like they come straight from hell: an agonizing Michael Jackson, for example, or a creepy Eminem or Celine Dion.

#OpenAI's approach is to use three levels of vector-quantized variational autoencoders (VQ-VAE-2 style) to learn discrete representations of audio, compressing it by 8x, 32x, and 128x, with a spectral loss that pushes the reconstructed spectrograms to match the originals. After that, sparse transformers conditioned on lyrics generate new code sequences at the most compressed level, upsample them to the less compressed levels, and the result is decoded into the song.
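
To get a feel for those compression factors, here is a minimal Python sketch. It counts how many discrete codes each level would produce for a clip and shows the nearest-codebook lookup at the heart of vector quantization. The 44.1 kHz sample rate, the tiny 2-D codebook, and the names `codes_per_clip` and `quantize` are illustrative assumptions, not taken from the Jukebox code.

```python
# Toy sketch, not the Jukebox implementation.
SAMPLE_RATE_HZ = 44_100  # assumption: CD-quality audio
COMPRESSION = {"bottom": 8, "middle": 32, "top": 128}  # factors quoted in the post


def codes_per_clip(seconds, factor, sample_rate=SAMPLE_RATE_HZ):
    """Number of discrete codes when every `factor` samples collapse into one code."""
    return int(seconds * sample_rate) // factor


def quantize(vector, codebook):
    """Return the index of the codebook entry closest to `vector` (squared L2 distance)."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return min(range(len(codebook)), key=lambda i: sq_dist(vector, codebook[i]))


if __name__ == "__main__":
    for level, factor in COMPRESSION.items():
        print(level, codes_per_clip(10.0, factor))
    # Hypothetical 3-entry codebook of 2-D vectors.
    print(quantize([0.9, 0.1], [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]))
```

The point of the hierarchy is visible in the counts: the most compressed level has 16 times fewer codes than the least compressed one for the same clip, which is what makes long-range generation with a transformer tractable.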

The net can even learn to generate solo parts within a track.

explore some creepy songs: https://jukebox.openai.com/
code: https://github.com/openai/jukebox/
paper: https://cdn.openai.com/papers/jukebox.pdf
blog: https://openai.com/blog/jukebox/

#openAI #music #sound #cool #fan #creepy #vae #audiolearning #soundlearning


Castle in the Sky

Dynamic Sky Replacement and Harmonization in Videos

Fascinating and ready to use in practice (a Colab notebook is included).
The authors propose a method for replacing the sky in a video that works well at high resolution. The results are very impressive. The method runs in real time and produces video almost free of glitches and artifacts. It can also generate effects such as lightning and glow on the target video.
The pipeline is quite complicated and contains several parts:
– A sky matting network that segments the sky in video frames
– A motion estimator for sky objects
– A skybox for blending, where the sky and the rest of the scene are relit and recolored.
The authors say that their work, in a nutshell, proposes a new framework for sky augmentation in outdoor videos. The solution is purely vision-based and can be applied to both online and offline scenarios.
But let's take a closer look.

The sky matting module is a ResNet-like encoder plus a decoder of several upsampling layers that solves pixel-wise sky segmentation, followed by a refinement stage with guided image filtering.
A motion estimator directly estimates the motion of the objects in the sky. The motion patterns are modeled by an affine matrix and optical flow.
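
To make the affine part concrete, here is a rough Python sketch of how a 2x3 affine matrix moves a point of the sky template from one frame to the next. The helpers `make_affine` and `apply_affine` are hypothetical names for illustration; the actual SkyAR code estimates the matrix from tracked sky features and is not reproduced here.

```python
import math


def make_affine(angle_rad, scale, tx, ty):
    """Build a 2x3 affine matrix: rotation + uniform scale, then translation."""
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    return [
        [scale * c, -scale * s, tx],
        [scale * s, scale * c, ty],
    ]


def apply_affine(matrix, point):
    """Map an (x, y) point through the 2x3 affine matrix."""
    x, y = point
    (a, b, tx), (c, d, ty) = matrix
    return (a * x + b * y + tx, c * x + d * y + ty)


if __name__ == "__main__":
    # Assumed per-frame camera motion: small pan to the right, no rotation.
    pan = make_affine(angle_rad=0.0, scale=1.0, tx=3.0, ty=-2.0)
    print(apply_affine(pan, (1.0, 1.0)))
```

Applying the same matrix to every pixel of the sky template keeps the new sky locked to the camera motion, so it does not appear to slide against the foreground.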
The sky image blending module is a decoder that models the output as a linear combination of the frame and the aligned sky template, weighted by the predicted sky matte.
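
The linear combination can be sketched in a few lines of plain Python. This is only the basic matting-style blend under the assumption that matte values lie in [0, 1] (1 means sky); the relighting and recoloring steps are left out, and `blend_sky` is a hypothetical name rather than a function from the SkyAR repository.

```python
def blend_sky(frame, sky, matte):
    """Per-pixel blend: matte * sky + (1 - matte) * frame.

    All three inputs are same-sized 2-D lists of floats (one grayscale channel).
    """
    return [
        [a * s + (1.0 - a) * f for f, s, a in zip(frame_row, sky_row, matte_row)]
        for frame_row, sky_row, matte_row in zip(frame, sky, matte)
    ]


if __name__ == "__main__":
    frame = [[0.2, 0.8]]   # original pixels
    sky = [[1.0, 1.0]]     # aligned sky template
    matte = [[1.0, 0.0]]   # first pixel is sky, second is foreground
    print(blend_sky(frame, sky, matte))
```

A soft matte (values between 0 and 1) gives a smooth transition along the horizon and around thin objects such as tree branches.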

Overall, the network uses ResNet-50 as the encoder and a decoder built from CoordConv upsampling layers with skip connections; it is implemented in PyTorch.
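
For readers who have not met CoordConv: the idea is to append normalized x and y coordinate channels to a feature map before convolving, so the layer knows where in the image it is (handy here, since sky tends to sit at the top). The sketch below builds those extra channels with plain lists; `coord_channels` and `add_coords` are illustrative names, and the [-1, 1] normalization is an assumption borrowed from the usual CoordConv formulation, not read from the SkyAR source.

```python
def coord_channels(height, width):
    """Return (x_channel, y_channel), each height x width, with values in [-1, 1]."""
    def axis(n):
        # A single row or column has no extent, so place it at the center.
        return [0.0] if n == 1 else [2.0 * i / (n - 1) - 1.0 for i in range(n)]

    ys, xs = axis(height), axis(width)
    x_channel = [list(xs) for _ in range(height)]
    y_channel = [[y] * width for y in ys]
    return x_channel, y_channel


def add_coords(feature_map):
    """Append the two coordinate channels to a [channels][height][width] nested list."""
    height, width = len(feature_map[0]), len(feature_map[0][0])
    x_channel, y_channel = coord_channels(height, width)
    return feature_map + [x_channel, y_channel]


if __name__ == "__main__":
    features = [[[0.5, 0.5, 0.5], [0.1, 0.1, 0.1]]]  # 1 channel, 2 x 3
    print(len(add_coords(features)))
```

In a real PyTorch layer the same two channels would be concatenated along the channel dimension and fed to an ordinary convolution.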

The results are presented in a very cool video: https://youtu.be/zal9Ues0aOQ

site: https://jiupinjia.github.io/skyar/
paper: https://arxiv.org/abs/2010.11800
github: https://github.com/jiupinjia/SkyAR

#sky #CV #video #cool #resnet


Preprint: Zhengxia Zou, Castle in the Sky: Dynamic Sky Replacement and Harmonization in Videos, 2020.