🍊Block-NeRF: Neural View Synthesis🍊
👉Large-scale scene reconstruction by multiple compact NeRFs that each fit into memory.
𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Berkeley + Google + Waymo = 🤯
✅Scaling NeRF to city-scale scenes
✅Trick: multiple simple NeRFs
✅Time decoupled, arbitrarily large scene
✅Data over months & different conditions
More: https://bit.ly/3GGVHBV
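A minimal sketch of how several independently trained blocks could be combined into one pixel. It assumes the renderings are blended with inverse-distance weights between the camera and each block center; the function names and the `power` value are hypothetical and not taken from the post.

```python
import math

def blend_weights(camera, block_centers, power=4.0, eps=1e-8):
    """Inverse-distance weights, one per block, normalized to sum to 1.

    `power` is an illustrative value (assumption), not a published setting.
    """
    raw = [1.0 / (math.dist(camera, c) + eps) ** power for c in block_centers]
    total = sum(raw)
    return [w / total for w in raw]

def blend_colors(colors, weights):
    """Weighted average of the per-block RGB predictions for a single pixel."""
    return tuple(sum(w * col[k] for w, col in zip(weights, colors)) for k in range(3))

# Two blocks at equal distance contribute equally.
w = blend_weights((0.0, 0.0), [(1.0, 0.0), (-1.0, 0.0)])
pixel = blend_colors([(1.0, 0.0, 0.0), (0.0, 0.0, 1.0)], w)
```

Because each block only needs its own weights at render time, blocks can be trained, updated, or added separately, which is what keeps the scene size unbounded.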
👍4🔥3🤯1
🥬HW-Accelerated Neuro-Evolution🥬
👉Scalable, general purpose, hardware accelerated neuro-evolution toolkit by Google
𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Parallel on multiple TPU/GPUs
✅Neuro-evo algorithms with NNs
✅WaterWorld, Abstract paint, more
✅From Google, not an official product
✅Code under Apache License 2.0
More: https://bit.ly/3szEi9w
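For readers new to neuro-evolution, here is a toy evolution strategy in plain Python. It is not the toolkit's API: `evolve` and its parameters are hypothetical, and the real library runs this kind of loop in parallel on accelerators.

```python
import random

def evolve(fitness, dim, pop_size=32, elite=8, sigma=0.3, generations=60, seed=0):
    """Toy elitist evolution strategy (illustrative only).

    Each generation samples candidates around the current mean, keeps the
    `elite` best by fitness, and moves the mean to their average.
    """
    rng = random.Random(seed)
    mean = [0.0] * dim
    for _ in range(generations):
        population = [
            [m + sigma * rng.gauss(0.0, 1.0) for m in mean]
            for _ in range(pop_size)
        ]
        population.sort(key=fitness, reverse=True)  # best candidates first
        parents = population[:elite]
        mean = [sum(p[i] for p in parents) / elite for i in range(dim)]
    return mean

# Hypothetical objective: get close to a target parameter vector.
target = [1.0, -2.0]

def fitness(params):
    return -sum((p - t) ** 2 for p, t in zip(params, target))

best = evolve(fitness, dim=2)
```

No gradients are used anywhere, so every candidate can be scored independently. That independence is what makes the method a good fit for many TPUs/GPUs.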
👍3🔥2🤯1😱1
🚛 DeepETA: #Uber ETA via #AI🚛
👉Uber unveils the low-latency deep architecture for global ETA prediction
𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Latency / Accuracy / Generality
✅7 NN architectures tested
✅Encoder-decoder + Self-Attention
✅Linear transformer (kernel trick)
✅Feature sparsity for speed
More: https://bit.ly/3gFWmJh
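The "kernel trick" behind a linear transformer can be sketched as follows. The feature map `phi` (elu(x) + 1) is an assumption borrowed from the general linear-attention literature; the post does not say which map DeepETA uses, and all function names here are hypothetical.

```python
import math

def phi(x):
    """Positive feature map elu(x) + 1 (assumed, not confirmed by the post)."""
    return [v + 1.0 if v > 0 else math.exp(v) for v in x]

def linear_attention(queries, keys, values):
    """O(N) attention: summarize keys/values once, reuse for every query."""
    d_k, d_v = len(keys[0]), len(values[0])
    kv = [[0.0] * d_v for _ in range(d_k)]  # sum_j phi(k_j) v_j^T
    k_sum = [0.0] * d_k                     # sum_j phi(k_j)
    for k, v in zip(keys, values):
        fk = phi(k)
        for a in range(d_k):
            k_sum[a] += fk[a]
            for b in range(d_v):
                kv[a][b] += fk[a] * v[b]
    out = []
    for q in queries:
        fq = phi(q)
        norm = sum(fq[a] * k_sum[a] for a in range(d_k))
        out.append([sum(fq[a] * kv[a][b] for a in range(d_k)) / norm
                    for b in range(d_v)])
    return out

def quadratic_attention(queries, keys, values):
    """Reference O(N^2) form with the same kernel, for comparison."""
    out = []
    for q in queries:
        fq = phi(q)
        scores = [sum(a * b for a, b in zip(fq, phi(k))) for k in keys]
        total = sum(scores)
        out.append([sum(s * v[b] for s, v in zip(scores, values)) / total
                    for b in range(len(values[0]))])
    return out
```

Both functions return the same values; the linear form simply reorders the multiplications so the cost grows with the number of inputs rather than its square, which matters for a latency-bound service.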
👍3🔥1🤯1
✏️CLIPasso: Semantic Sketching via CLIP✏️
👉Sketching method guided by geometric and semantic simplifications (CLIP)
𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅EPFL, TAU and IDC Herzliya
✅CLIP image encoder for sketching
✅Sketching as a set of Bezier curves
✅Param-optimization on CLIP-loss
✅Source code and models available
More: https://bit.ly/3oLEDF4
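To make "a set of Bezier curves" concrete, here is a small sketch that evaluates one stroke. It assumes cubic curves with four 2D control points per stroke; the CLIP loss and the differentiable rasterizer that would actually optimize the control points are left out, and the function names are hypothetical.

```python
def cubic_bezier(p0, p1, p2, p3, t):
    """Point on a cubic Bezier curve at parameter t in [0, 1]."""
    u = 1.0 - t
    b0, b1, b2, b3 = u ** 3, 3 * u * u * t, 3 * u * t * t, t ** 3
    x = b0 * p0[0] + b1 * p1[0] + b2 * p2[0] + b3 * p3[0]
    y = b0 * p0[1] + b1 * p1[1] + b2 * p2[1] + b3 * p3[1]
    return (x, y)

def sample_stroke(control_points, n=16):
    """Sample n evenly spaced parameter values along one stroke."""
    return [cubic_bezier(*control_points, i / (n - 1)) for i in range(n)]

# One stroke = four control points; a sketch is a list of such strokes.
stroke = [(0.0, 0.0), (0.0, 1.0), (1.0, 1.0), (1.0, 0.0)]
points = sample_stroke(stroke)
```

In this framing the control points are the only parameters, so "param-optimization on CLIP-loss" amounts to nudging those coordinates until the rendered strokes match the image semantically.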
🔥2🥰2👍1
🪂SAHI: slicing detection/segmentation🪂
👉An open-source lightweight library for large scale object detection & instance segmentation
𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Slicing Aided Hyper Inference
✅Large-scale detection/segment.
✅Sliced inference and merging
✅Utils for conversion, slicing, etc.
✅Code licensed under MIT License
More: https://bit.ly/3uMJoBZ
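The slicing idea can be sketched without the library. The helpers below are hypothetical and are not SAHI's API: they only show how overlapping windows might be laid out and how a box detected inside one window maps back to the full image. The final merging of overlapping detections (for example with NMS) is omitted.

```python
def slice_boxes(img_w, img_h, slice_w, slice_h, overlap=0.2):
    """Return (x0, y0, x1, y1) windows that cover the image with overlap."""
    step_x = max(1, int(slice_w * (1 - overlap)))
    step_y = max(1, int(slice_h * (1 - overlap)))
    boxes = []
    y = 0
    while True:
        y1 = min(y + slice_h, img_h)
        y0 = max(0, y1 - slice_h)  # pull the last row back inside the image
        x = 0
        while True:
            x1 = min(x + slice_w, img_w)
            x0 = max(0, x1 - slice_w)  # same for the last column
            boxes.append((x0, y0, x1, y1))
            if x1 >= img_w:
                break
            x += step_x
        if y1 >= img_h:
            break
        y += step_y
    return boxes

def to_full_image(det, window):
    """Shift a slice-local box (x0, y0, x1, y1) into full-image coordinates."""
    ox, oy = window[0], window[1]
    return (det[0] + ox, det[1] + oy, det[2] + ox, det[3] + oy)

windows = slice_boxes(1000, 500, 512, 512)
```

Running the detector per window keeps small objects at a usable resolution; the overlap exists so an object cut by one window border is seen whole in a neighboring window.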
🔥3❤2🤯1
🎁100,000,000 image-text pairs!🎁
👉Large-scale Chinese cross-modal dataset for benchmarking different multi-modal pre-training methods.
𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅100 Million <image, text> pairs
✅>200px size, aspect ratio (1/3~3)
✅Models of ResNet, ViT & SwinT
✅Methods: CLIP, FILIP and LiT
✅Privacy/Sensitive words 🤔
More: https://bit.ly/34BqlzX
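The size and aspect-ratio rule can be read as a simple filter. This sketch assumes both sides must exceed 200 px and that the ratio bounds are inclusive; the post does not spell out either detail, and `keep_image` is a hypothetical helper.

```python
def keep_image(width, height, min_side=200, max_ratio=3.0):
    """Keep an image only if it is large enough and not too elongated.

    Assumptions: both sides must exceed `min_side`, and width/height must
    lie in [1/max_ratio, max_ratio].
    """
    if width <= min_side or height <= min_side:
        return False
    ratio = width / height
    return 1.0 / max_ratio <= ratio <= max_ratio

samples = [(640, 480), (150, 400), (1000, 300), (900, 300)]
kept = [s for s in samples if keep_image(*s)]
```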
👍5🤔1
🧁33 Million synthetic pedestrians🧁
👉A novel large, fully synthetic dataset
𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Exploiting the #gta5 engine
✅764 full-HD videos @20 fps
✅33M+ person instances
✅BBs & segmentation masks
✅2D/3D keypoints & depth
More: https://bit.ly/36njlY1
👍6🤯1
🥝Marker-free 6D-point tracking🥝
👉Full position and rotation of skeletal joints, from only an RGB frame
𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Full 3-axis joint rotations
✅V-markers, emulating mocap
✅#3D from monocular with NN
✅Generalization, no retraining
✅SOTA rotation/position est.
More: https://bit.ly/34GdoF5
🔥12🤯1
🧼 Synthetic dataset for #Retail 🧼
👉A large-scale photorealistic synthetic dataset with annotations for semantic segmentation, instance segmentation, depth estimation, and object detection.
𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Dataset from Standard.AI
✅2,134 unique scenes
✅25k+ annotated samples
✅Introducing the "change detection" task
✅Multi-view representation learning
✅NonCommercial-ShareAlike 4.0
More: https://bit.ly/3uXqubB
🤯6🥰3👍1🔥1👏1
🌈 Graph Neural Nets Forecasting🌈
👉Data-driven approach for forecasting global weather using graph neural networks
𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Data-driven forecasting via GNNs
✅Model: 6.7M parameters, float32
✅6-hours forecast in 0.04 secs.
✅A 5-day forecast in 0.8 secs.
More: https://bit.ly/3LH4CXR
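The two timings are consistent with each other if the model is rolled out autoregressively, feeding each 6-hour forecast back in as the next input. That rollout scheme is an assumption here, and the helper names are hypothetical.

```python
def rollout_steps(horizon_hours, step_hours=6):
    """Number of model calls needed to reach the forecast horizon."""
    if horizon_hours % step_hours:
        raise ValueError("horizon must be a multiple of the step size")
    return horizon_hours // step_hours

def rollout_seconds(horizon_hours, step_hours=6, seconds_per_step=0.04):
    """Estimated wall time, assuming each step costs the same."""
    return rollout_steps(horizon_hours, step_hours) * seconds_per_step

# 5 days = 120 hours = 20 six-hour steps, and 20 x 0.04 s is about 0.8 s.
five_day_steps = rollout_steps(5 * 24)
five_day_seconds = rollout_seconds(5 * 24)
```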
👏4👍2🤔1
🥫Watch Those Words!🥫
👉Berkeley unveils a novel approach to detect cheap-fakes and visually persuasive deep-fakes
𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Regardless of falsification
✅Semantic person-specific
✅Word-conditioned analysis
✅Generalization across fakes
More: https://bit.ly/3oXWmcd
👍5😱1
🔋V2X-sim for #selfdriving is out!🔋
👉V2X: collaboration between a vehicle and any surrounding entity
𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Suitable for #selfdrivingcars
✅Recordings from roadside & vehicles
✅Multi-streams/perception
✅Detection, tracking, & segmentation
✅RGB, depth, semantic, BEV & LiDAR
More: https://bit.ly/3H6veOI
🔥6🤩1
🍏Infinite Synthetic dataset for Fitness🍏
👉Open-source synthetic images for fitness: single/multi-person, with realistic variation in lighting, camera angles, and occlusions
𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅60k images, 1-5 avatars
✅15 categories, 21 variations
✅Blender and ray-tracing
✅SMPL-X + facial expression
✅Cloth/skin tone sampled
✅147 4K HDRI panoramas
✅Creative Commons 4.0
More: https://bit.ly/33B1R9q
🤩5❤1👍1
♊ DITTO: Digital Twins from Interaction ♊
👉Digitizing objects for #metaverse through interactive perception
𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅DIgital Twin of arTiculated Objects
✅Geometry & kinematic articulation
✅Articulation & 3D via perception
✅Source code under MIT License
More: https://bit.ly/3LMazCV
🔥5❤2👍1🤯1
🤖 Robotic Telekinesis from Youtube 🤖
👉CMU unveils a robot that observes humans and imitates their actions in real time
𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Enabling robo-hand teleoperation
✅Suitable for untrained operator
✅Single uncalibrated RGB camera
✅Leveraging unlabeled #youtube
✅No active fine-tuning or setup
✅No collision via Adv-Training
More: https://bit.ly/3H7zUnh
🔥3🤯2👍1👏1
💄DIGAN: #AI for video generation💄
👉A novel INR-based generative adversarial network for video generation
𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Dynamics-aware generator
✅INR-based clip generator
✅Manipulating space/time
✅Identifying unnatural motion
More: https://bit.ly/3H6sHE4
🔥4🤯1
🦄FILM Neural Frame Interpolation🦄
👉Frame interpolation that synthesizes multiple intermediate frames from two input images with large in-between motion
𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Single unified network
✅High quality output
✅SOTA on the Xiph benchmark
✅Apache License 2.0
More: https://bit.ly/3pl4ZxH
🔥5👍2🥰1
🔈Neural Maintenance via listening🔈
👉Novel neural-method to detect whether a machine is "healthy" or requires maintenance
𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Defects at an early stage
✅FDWT: fast discrete wavelet transform
✅Learnable wavelet/denoising
✅Unsupervised learnable FDWT
✅The new SOTA in predictive maintenance (PM)
More: https://bit.ly/3hiKWeX
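As background for the wavelet part, here is one level of a discrete wavelet transform with simple denoising. It uses a fixed Haar filter and a hand-set threshold, both assumptions for illustration; the method in the post learns these quantities instead, and this is not its architecture.

```python
import math

_S = 1.0 / math.sqrt(2.0)

def haar_dwt(signal):
    """One level of the Haar DWT: returns (approximation, detail) halves."""
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    approx = [(signal[i] + signal[i + 1]) * _S for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) * _S for i in range(0, len(signal), 2)]
    return approx, detail

def haar_idwt(approx, detail):
    """Inverse of haar_dwt: interleave the reconstructed sample pairs."""
    out = []
    for a, d in zip(approx, detail):
        out.extend([(a + d) * _S, (a - d) * _S])
    return out

def soft_threshold(coeffs, thr):
    """Shrink coefficients toward zero; small ones (likely noise) vanish."""
    return [math.copysign(max(abs(c) - thr, 0.0), c) for c in coeffs]

# Denoise: transform, shrink the detail band, transform back.
noisy = [1.0, 1.2, 0.9, 1.1, 5.0, 5.1, 4.9, 5.0]
approx, detail = haar_dwt(noisy)
denoised = haar_idwt(approx, soft_threshold(detail, thr=0.2))
```

Making the filter and threshold learnable, as the highlights describe, would let the network decide which frequency bands carry the early signs of a defect.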
🤯6🤔1
🟦🟨 StyleGAN on Internet pics 🟦🟨
👉StyleGAN on raw uncurated images collected from Internet
𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Outliers & multi-modal
✅Self-distillation approach
✅Self-filtering of outliers
✅Perceptual clustering
More: https://bit.ly/33Z1d5H
❤2👍1🔥1🤯1
🦜The new SOTA for Unsupervised Object Discovery🦜
👉Self-supervised transformer to discover objects in images
𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Visual tokens as nodes in graph
✅Edges as connectivity score
✅Second-smallest eigenvector = foreground
✅Suitable for unsupervised saliency
✅Weakly supervised obj. detection
✅Code under MIT License
More: https://bit.ly/3sqbFg3
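The graph idea in the highlights can be sketched with NumPy. This assumes a normalized-cut style formulation: tokens are nodes, edges are thresholded cosine similarities, and the eigenvector of the second-smallest eigenvalue splits the tokens into two groups. The threshold `tau`, the function name, and the mean-based split are illustrative assumptions; deciding which group is the foreground needs a further heuristic not shown here.

```python
import numpy as np

def spectral_bipartition(features, tau=0.2, eps=1e-5):
    """Split tokens into two groups using the second-smallest eigenvector."""
    f = features / np.linalg.norm(features, axis=1, keepdims=True)
    sim = f @ f.T                          # cosine similarity between tokens
    w = np.where(sim > tau, 1.0, eps)      # edge weights (connectivity score)
    deg = w.sum(axis=1)
    lap = np.diag(deg) - w                 # graph Laplacian D - W
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    # Symmetric normalization turns (D - W) y = lambda D y into a standard
    # symmetric eigenproblem that eigh can solve.
    lap_sym = d_inv_sqrt[:, None] * lap * d_inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(lap_sym)      # eigenvalues come back ascending
    second = d_inv_sqrt * vecs[:, 1]       # second-smallest eigenvector
    return second > second.mean()

# Five toy "tokens": three pointing one way, two pointing another.
tokens = np.array([[1.0, 0.0], [1.0, 0.05], [0.95, 0.0],
                   [0.0, 1.0], [0.05, 1.0]])
mask = spectral_bipartition(tokens)
```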
👍4🔥3🤯1