AI with Papers - Artificial Intelligence & Deep Learning
15.4K subscribers
140 photos
253 videos
14 files
1.33K links
All the AI with papers. Every day fresh updates about #DeepLearning, #MachineLearning, LLMs and #ComputerVision

Curated by Alessandro Ferrari | https://www.linkedin.com/in/visionarynet/

#artificialintelligence #machinelearning #ml #AI

🧯Long Video Diffusion Models🧯

👉#Google unveils a novel diffusion model for video generation

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Straightforward extension of 2D UNet
✅Longer videos via new conditional generation
✅SOTA in unconditional generation

More: https://bit.ly/35Y2rzg
🔥4🎉2🤩1

🚙 AutoRF: #3D objects in-the-wild 🚙

👉From #Meta: a #3D object from just a single in-the-wild image

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Novel view synthesis from in-the-wild
✅Normalized, object-centric representation
✅Disentangling shape, appearance & pose
✅Exploiting bounding boxes & panoptic segmentation
✅Shape/appearance properties for objects

More: https://bit.ly/3O4ONeQ
🤯7😱2🔥1

🌠GAN-based Darkest Dataset🌠

👉Berkeley + #Intel announce the first photorealistic dataset captured under starlight (no moon, <0.001 lx)

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅"Darkest" dataset ever seen
✅Moonless, no external illumination
✅GAN-tuned physics-based model
✅Clips with dancing, volleyball, flags...

More: https://bit.ly/3LXxMkN
👍3🤯2🔥1

🤖Populating with digital humans🤖

👉ETHZ unveils GAMMA to populate the #3D scene with digital humans

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅GenerAtive Motion primitive MArkers
✅Realistic, controllable, infinite motions
✅Tree-based search to preserve quality
✅SOTA in realistic/controllable motion

More: https://bit.ly/3OgY4AG
😱5👍4🔥2👏1🤯1🤩1

🔥#AIwithPapers: we are ~2,000!🔥

💙💛 Simply amazing. Thank you all 💙💛

😈 Invite your friends -> https://t.me/AI_DeepLearning
❤18🔥8🥰4👍3

😼GARF: Gaussian Activated NeRF😼

👉GARF: Gaussian Activated Radiance Fields for high-fidelity reconstruction and pose estimation

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅NeRF from imperfect camera poses
✅NO hyper-parameter tuning/initialization
✅Theoretical insight on Gaussian activation
✅Unlocking NeRF for real-world application?

More: https://bit.ly/36bvdfU
👍4🤩2❤1👏1🤯1

🎭Novel pre-training strategy for #AI🎭

👉EPFL unveils the Multi-modal Multi-task Masked Autoencoders (MultiMAE)

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Multimodal: additional input modalities beyond RGB
✅Multi-task: multiple outputs beyond RGB
✅General: MultiMAE by pseudo-labeling
✅Classification, segmentation, depth
✅Code under a NonCommercial 4.0 International license

More: https://bit.ly/3jRhNsN
🔥7🤯2👍1👏1

🧪 A new SOTA in Dataset Distillation 🧪

👉A new approach by Matching Training Trajectories is out!

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Distilling data "to match" a bigger dataset
✅Distilled data to guide a network
✅Trajectories of experts from real data
✅SOTA + distilling higher-res visual data

More: https://bit.ly/3JwYOxW
👍5🔥1🤯1

🧤 Two-Hand tracking via GCN 🧤

👉The first-ever GCN for two interacting hands in a single RGB image

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Reconstruction by GCN mesh regression
✅PIFA: pyramid attention for local occlusion
✅CHA: cross hand attention for interaction
✅SOTA + generalization to in-the-wild scenarios
✅Source code available under GNU 🤯

More: https://bit.ly/3KH5FWO
👏10👍4🤯1

🕹️Video K-Net, SOTA in Segmentation🕹️

👉Simple, strong, and unified framework for fully end-to-end video panoptic segmentation

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Learnable kernels from K-Net
✅K-Net learns to segment & track
✅Appearance / cross-T kernel interaction
✅New SOTA without bells and whistles 🤷‍♂️

More: https://bit.ly/3uEEZQR
👍6🔥1🤯1

🐭DeepLabCut: tracking animals in the wild🐭

👉A toolbox for markerless pose estimation of animals performing various tasks

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Multi-animal pose estimation
✅Datasets for multi-animal pose
✅Key-points, limbs, animal identity
✅Optimal key-points without user input

More: https://bit.ly/37L1mLE
🔥6🤔4👏2🤯2❤1👍1😱1

🍡Neural Articulated Human Body🍡

👉Novel neural implicit representation for articulated body

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅COmpositional Articulated People
✅Large variety of shapes & poses
✅Novel encoder-decoder architecture

More: https://bit.ly/3xvn7dl
👍4🥰2👏1

🦹 2K Resolution Generative #AI 🦹

👉Novel continuous-scale training with variable output resolutions

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Mixed-resolution data
✅Arbitrary scales during training
✅Generations beyond 1024×1024
✅Variant of FID metric for scales
✅Source code under MIT license

More: https://bit.ly/3uNfVY6
🤯11👍2🔥2😱1🤩1

🐍DS Unsupervised Video Decomposition🐍

👉Novel method to extract persistent elements of a scene

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Scene element as Deformable Sprite (DS)
✅Deformable Sprites by video auto-encoder
✅Canonical texture image for appearance
✅Non-rigid geometric transformation

More: https://bit.ly/37WV9w1
👍4🤯3🔥1🥰1👏1😱1

🥓 L-SVPE for Deep Deblurring 🥓

👉L-SVPE to deblur scenes while recovering high-frequency details

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Learned Spatially Varying Pixel Exposures
✅Next-gen focal-plane sensor + DL
✅Deep conv decoder for motion deblurring
✅Superior results over non-optimized exposures

More: https://bit.ly/3uRYQMT
🤩7👍2🤔2🎉1

🧧Hyper-Fast Instance Segmentation🧧

👉Novel Temporally Efficient Vision Transformer (TeViT) for VIS

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Video instance segmentation transformer
✅Contextual-info at frame/instance level
✅Nearly convolution-free framework 🤷‍♂️
✅The new SOTA for VIS, ~70 FPS!
✅Code & models under MIT license

More: https://bit.ly/3rCMXIn
🔥10👍3👏1🤯1

📗Unified Scene Text/Layout Detection📗

👉World's first hierarchical scene text dataset + novel detection method

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Unified detection & geometric layout
✅Hierarchical annotations in natural scenes
✅Word, line, & paragraph level annotations
✅Source under CC Attribution Share Alike 4.0

More: https://bit.ly/3jRpezV
🔥3🤯2❤1👍1

🙌 #Oculus' new Hand Tracking 🙌

👉Hands can move as naturally and intuitively in the #metaverse as they do in real life

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Hands2.0 powered by CV & ML
✅Tracking hand-over-hand interactions
✅Crossing hands, clapping, high-fives
✅Accurate thumbs-up gesture

More: https://bit.ly/3JXPvY2
🤯6❤4👍2👏1

🎗️New SOTA in #3D human avatar🎗️

👉PHORHUM: photorealistic 3D human from a monocular RGB image

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Pixel-aligned method for 3D geometry
✅Unshaded surface color + illumination
✅Patch-based rendering losses for visible regions
✅Plausible color estimation for non-visible regions

More: https://bit.ly/3MkvBrA
🤯4👍2🥰2❤1