AI with Papers - Artificial Intelligence & Deep Learning
All about AI, with papers. Fresh updates every day on Deep Learning, Machine Learning, and Computer Vision (with papers).

Curated by Alessandro Ferrari | https://www.linkedin.com/in/visionarynet/
🧯Long Video Diffusion Models🧯

👉#Google unveils a novel diffusion model for video generation

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Straightforward extension of 2D UNet
✅Longer videos via new conditional generation
✅SOTA in unconditional generation

More: https://bit.ly/35Y2rzg
🚙 AutoRF: #3D objects in-the-wild 🚙

👉From #Meta: #3D objects from just a single in-the-wild image

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Novel view synthesis from in-the-wild images
✅Normalized, object-centric representation
✅Disentangling shape, appearance & pose
✅Exploiting bounding boxes & panoptic segmentation
✅Shape/appearance properties for objects

More: https://bit.ly/3O4ONeQ
🌠GAN-based Darkest Dataset🌠

👉Berkeley + #Intel announce first photorealistic dataset under starlight (no moon, <0.001 lx)

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅"Darkest" dataset ever seen
✅Moonless, no external illumination
✅GAN-tuned physics-based model
✅Clips with dancing, volleyball, flags...

More: https://bit.ly/3LXxMkN
🤖Populating with digital humans🤖

👉ETHZ unveils GAMMA to populate #3D scenes with digital humans

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅GenerAtive Motion primitive MArkers
✅Realistic, controllable, infinite motions
✅Tree-based search to preserve quality
✅SOTA in realistic/controllable motion

More: https://bit.ly/3OgY4AG
🔥#AIwithPapers: we are ~2,000!🔥

💙💛 Simply amazing. Thank you all 💙💛

😈 Invite your friends -> https://t.me/AI_DeepLearning
😼GARF: Gaussian Activated NeRF😼

👉GARF: Gaussian Activated Radiance Fields for high-fidelity reconstruction and pose estimation

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅NeRF from imperfect camera poses
✅NO hyper-parameter tuning/initialization
✅Theoretical insight on Gaussian activation
✅Unlocking NeRF for real-world application?

More: https://bit.ly/36bvdfU
🎭Novel pre-training strategy for #AI🎭

👉EPFL unveils the Multi-modal Multi-task Masked Autoencoders (MultiMAE)

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Multi-modal: additional input modalities besides RGB
✅Multi-task: multiple outputs besides RGB
✅General: MultiMAE trained via pseudo-labeling
✅Classification, segmentation, depth
✅Code under NonCommercial 4.0 Int.

More: https://bit.ly/3jRhNsN
🧪 A new SOTA in Dataset Distillation 🧪

👉A new approach by Matching Training Trajectories is out!

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Distilling a small dataset "to match" a bigger one
✅Distilled data to guide a network
✅Trajectories of experts from real data
✅SOTA + distilling higher-res visual data

More: https://bit.ly/3JwYOxW
🧤 Two-Hand tracking via GCN 🧤

👉The first-ever GCN for two interacting hands in a single RGB image

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Reconstruction by GCN mesh regression
✅PIFA: pyramid attention for local occlusion
✅CHA: cross hand attention for interaction
✅SOTA + generalization to in-the-wild scenarios
✅Source code available under GNU 🤯

More: https://bit.ly/3KH5FWO
🕹️Video K-Net, SOTA in Segmentation🕹️

👉Simple, strong, and unified framework for fully end-to-end video panoptic segmentation

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Learnable kernels from K-Net
✅K-Net learns to segment & track
✅Appearance / cross-temporal kernel interaction
✅New SOTA without bells and whistles 🤷‍♂️

More: https://bit.ly/3uEEZQR
🐭DeepLabCut: tracking animals in the wild🐭

👉A toolbox for markerless pose estimation of animals performing various tasks

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Multi-animal pose estimation
✅Datasets for multi-animal pose
✅Key-points, limbs, animal identity
✅Optimal key-points without user input

More: https://bit.ly/37L1mLE
🍡Neural Articulated Human Body🍡

👉Novel neural implicit representation for articulated human bodies

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅COmpositional Articulated People
✅Large variety of shapes & poses
✅Novel encoder-decoder architecture

More: https://bit.ly/3xvn7dl
🦹 2K Resolution Generative #AI 🦹

👉Novel continuous-scale training with variable output resolutions

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Mixed-resolution data
✅Arbitrary scales during training
✅Generations beyond 1024×1024
✅Variant of FID metric for scales
✅Source code under MIT license

More: https://bit.ly/3uNfVY6
🐍DS Unsupervised Video Decomposition🐍

👉Novel method to extract persistent elements of a scene

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Scene element as Deformable Sprite (DS)
✅Deformable Sprites by video auto-encoder
✅Canonical texture image for appearance
✅Non-rigid geometric transformation

More: https://bit.ly/37WV9w1
🥓 L-SVPE for Deep Deblurring 🥓

👉L-SVPE to deblur scenes while recovering high-freq details

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Learned Spatially Varying Pixel Exposures
✅Next-gen focal-plane sensor + DL
✅Deep conv decoder for motion deblurring
✅Superior results over non-optimized exposures

More: https://bit.ly/3uRYQMT
🧧Hyper-Fast Instance Segmentation🧧

👉Novel Temporally Efficient Vision Transformer (TeViT) for VIS

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Video instance segmentation transformer
✅Contextual-info at frame/instance level
✅Nearly convolution-free framework 🤷‍♂️
✅The new SOTA for VIS, ~70 FPS!
✅Code & models under MIT license

More: https://bit.ly/3rCMXIn
📗Unified Scene Text/Layout Detection📗

👉World's first hierarchical scene text dataset + novel detection method

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Unified detection & geometric layout
✅Hierarchical annotations in natural scenes
✅Word, line, & paragraph level annotations
✅Source under CC Attribution-ShareAlike 4.0

More: https://bit.ly/3jRpezV
🙌 #Oculus' new Hand Tracking 🙌

👉Hands are able to move as naturally and intuitively in the #metaverse as they do in real life

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Hands2.0 powered by CV & ML
✅Tracking hand-over-hand interactions
✅Crossing hands, clapping, high-fives
✅Accurate thumbs-up gesture

More: https://bit.ly/3JXPvY2
🎗️New SOTA in #3D human avatar🎗️

👉PHORHUM: photorealistic 3D human from mono-RGB

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Pixel-aligned method for 3D geometry
✅Unshaded surface color + illumination
✅Patch-based rendering losses for visible regions
✅Plausible color estimation for non-visible regions

More: https://bit.ly/3MkvBrA