Cut and Learn for Unsupervised Object Detection and Instance Segmentation
Cut-and-LEaRn (CutLER) is a simple approach for training object detection and instance segmentation models without human annotations. It outperforms the previous state of the art by 2.7x on AP50 and 2.6x on AR across 11 benchmarks.
Paper:
https://arxiv.org/pdf/2301.11320.pdf
Github:
https://github.com/facebookresearch/CutLER
Demo:
https://colab.research.google.com/drive/1NgEyFHvOfuA2MZZnfNPWg1w5gSr3HOBb?usp=sharing
@computer_science_and_programming
Audio AI Timeline
Here we will keep track of the latest AI models for audio generation, starting in 2023!
▪️SingSong: Generating musical accompaniments from singing
- Paper
▪️AudioLDM: Text-to-Audio Generation with Latent Diffusion Models
- Paper
- Code
▪️Moûsai: Text-to-Music Generation with Long-Context Latent Diffusion
- Paper
- Code
▪️Make-An-Audio: Text-To-Audio Generation with Prompt-Enhanced Diffusion Models
- Paper
▪️Noise2Music
▪️RAVE2
- Paper
- Code
▪️MusicLM: Generating Music From Text
- Paper
▪️Msanii: High Fidelity Music Synthesis on a Shoestring Budget
- Paper
- Code
- HuggingFace
▪️ArchiSound: Audio Generation with Diffusion
- Paper
- Code
▪️VALL-E: Neural Codec Language Models are Zero-Shot Text to Speech Synthesizers
- Paper
Gen-1: The Next Step Forward for Generative AI
Gen-1 is a new AI model that uses language and images to generate new videos out of existing ones.
Project:
https://research.runwayml.com/gen1
Paper:
https://arxiv.org/abs/2302.03011
Request form:
https://docs.google.com/forms/d/e/1FAIpQLSfU0O_i1dym30hEI33teAvCRQ1i8UrGgXd4BPrvBWaOnDgs9g/viewform
YOWOv2: A Stronger yet Efficient Multi-level Detection Framework for Real-time Spatio-temporal Action Detection
Spatio-temporal action detection (STAD) aims to detect action instances in the current frame. It has been widely applied in areas such as video surveillance and somatosensory games.
Paper:
https://arxiv.org/pdf/2302.06848.pdf
Github:
https://github.com/yjh0410/YOWOv2
Dataset:
https://drive.google.com/file/d/1Dwh90pRi7uGkH5qLRjQIFiEmMJrAog5J/view?usp=sharing
3D-aware Conditional Image Synthesis (pix2pix3D)
Pix2pix3D synthesizes 3D objects (neural fields) given a 2D label map, such as a segmentation or edge map
Github:
https://github.com/dunbar12138/pix2pix3D
Paper:
https://arxiv.org/abs/2302.08509
Project:
https://www.cs.cmu.edu/~pix2pix3D/
Datasets:
CelebAMask, AFHQ-Cat-Seg, Shapenet-Car-Edge
Efficient Teacher: Semi-Supervised Object Detection for YOLOv5
- Efficient Teacher brings semi-supervised object detection into practical applications, enabling users to obtain strong generalization with only a small amount of labeled data and a large amount of unlabeled data.
- Efficient Teacher provides category and custom uniform sampling, which can quickly improve network performance in real business scenarios.
Paper:
https://arxiv.org/abs/2302.07577
Github:
https://github.com/AlibabaResearch/efficientteacher
Multivariate Probabilistic Time Series Forecasting with Informer
An efficient transformer-based model for long sequence time-series forecasting (LSTF).
The method introduces a Probabilistic Attention mechanism that selects the “active” queries rather than the “lazy” ones, yielding a sparse Transformer that mitigates the quadratic compute and memory requirements of vanilla attention.
Hugging Face:
https://huggingface.co/blog/informer
Docs:
https://huggingface.co/docs/transformers/main/en/model_doc/informer
Colab:
https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/multivariate_informer.ipynb
Dataset:
https://huggingface.co/docs/datasets/v2.7.0/en/package_reference/main_classes#datasets.Dataset.set_transform
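To make the "active" versus "lazy" query idea concrete, here is a minimal pure-Python sketch. It is an illustration under stated assumptions, not the Informer or Hugging Face implementation. The function names (`softmax`, `sparse_attention`) and the toy matrices are made up for this example. Each query is scored by how far its largest attention score sits above its mean score, only the `top_u` highest-scoring queries get full softmax attention, and the rest fall back to the mean of the values. Note that this sketch still computes every query-key score, so it does not save compute by itself. The real model estimates the score from a sample of keys to avoid the full quadratic cost.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def sparse_attention(Q, K, V, top_u):
    """Toy sketch: full attention for the top_u 'active' queries, mean(V) for the rest."""
    d = len(K[0])
    # Scaled dot-product scores for every (query, key) pair.
    scores = [[sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in K] for q in Q]
    # Activity measure: max score minus mean score.
    # A value near zero means near-uniform attention, i.e. a 'lazy' query.
    activity = [max(row) - sum(row) / len(row) for row in scores]
    ranked = sorted(range(len(Q)), key=lambda i: activity[i], reverse=True)
    active = set(ranked[:top_u])
    # Lazy queries receive the plain average of the value vectors.
    mean_v = [sum(col) / len(V) for col in zip(*V)]
    out = []
    for i, row in enumerate(scores):
        if i in active:
            weights = softmax(row)
            out.append([sum(w * v[c] for w, v in zip(weights, V)) for c in range(len(V[0]))])
        else:
            out.append(list(mean_v))
    return out, active


# Hypothetical toy data: query 1 is all zeros, so it attends uniformly ('lazy').
Q = [[4.0, 0.0], [0.0, 0.0], [0.0, 3.0]]
K = [[1.0, 0.0], [0.0, 1.0]]
V = [[10.0, 0.0], [0.0, 10.0]]

out, active = sparse_attention(Q, K, V, top_u=2)
print("active queries:", sorted(active))
print("outputs:", out)
```

With this toy data, queries 0 and 2 are treated as active, while the all-zero query 1 simply receives the average of the value vectors.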
ViperGPT: Visual Inference via Python Execution for Reasoning
ViperGPT is a framework that leverages code-generation models to compose vision-and-language models into subroutines that produce a result for any query.
Github:
https://github.com/cvlab-columbia/viper
Paper:
https://arxiv.org/pdf/2303.08128.pdf
Project:
https://paperswithcode.com/dataset/beat
Test of Time: Instilling Video-Language Models with a Sense of Time
GPT-5 will likely have video abilities, but will it have a sense of time? A #CVPR2023 paper by a student at the University of Amsterdam tackles this question and shows how to instil a sense of time into video-language foundation models.
Paper:
https://arxiv.org/abs/2301.02074
Code:
https://github.com/bpiyush/TestOfTime
Project Page:
https://bpiyush.github.io/testoftime-website/
Drag Your GAN: Interactive Point-based Manipulation on the Generative Image Manifold
Paper:
https://arxiv.org/abs/2305.10973
Github:
https://github.com/XingangPan/DragGAN
Project page:
https://vcai.mpi-inf.mpg.de/projects/DragGAN/
GRES: Generalized Referring Expression Segmentation
A new benchmark, GRES, extends classic Referring Expression Segmentation (RES) to allow expressions that refer to an arbitrary number of target objects.
Github: https://github.com/henghuiding/ReLA
Paper: https://arxiv.org/abs/2306.00968
Project: https://henghuiding.github.io/GRES/
New dataset: https://github.com/henghuiding/gRefCOCO
80+ Jupyter Notebook tutorials on image classification, object detection and image segmentation in various domains
- Agriculture and Food
- Medical and Healthcare
- Satellite
- Security and Surveillance
- ADAS and Self Driving Cars
- Retail and E-Commerce
- Wildlife
Classification library
https://github.com/Tessellate-Imaging/monk_v1
Notebooks: https://github.com/Tessellate-Imaging/monk_v1/tree/master/study_roadmaps/4_image_classification_zoo
Detection and Segmentation Library
https://github.com/Tessellate-Imaging/Monk_Object_Detection
Notebooks: https://github.com/Tessellate-Imaging/Monk_Object_Detection/tree/master/application_model_zoo
How to test your APIs directly from Visual Studio Code?
You can now do this without leaving Visual Studio Code: Postman has released a VS Code extension that integrates API building and testing into your code editor.
What you can do with the extension:
- Send (multiprotocol) requests
- Send requests from your history
- Use collections
- Use different environments
- View and edit cookies
➡️ Check it here
Wondering how C++, Java, Python Work?
C++
C++ is like the superhero of programming languages. It's a compiled language, meaning your code is transformed into machine code that your computer can understand before it runs. This compilation step is crucial for efficiency and performance. C++ gives you precise control over memory and hardware, making it a top choice for systems programming and game development. It's like wielding a finely-tuned instrument in the world of code!
Java
Java, on the other hand, is the coffee of programming languages. It's a compiled language too, but with a twist. Java code is compiled into bytecode, which runs on the Java Virtual Machine (JVM). This bytecode can run on any platform with a compatible JVM, making Java highly portable and platform-independent. It's a bit like sending your code to a virtual coffee machine that serves it up just the way you like it on any OS! ☕️
Python
Python is the friendly neighborhood programming language. It's usually described as an interpreted language: there's no separate compilation step for you to run. Under the hood, the standard interpreter (CPython) first compiles your source into bytecode and then executes that bytecode on its own virtual machine. This simplicity makes it great for beginners and rapid development. Python's extensive library ecosystem and easy syntax make it feel like you're scripting magic spells in a magical world!
In the end, the choice of programming language depends on your project's needs and your personal preferences. Each language has its strengths and weaknesses, but they all share the goal of bringing your ideas to life through code.
So, whether you're crafting the perfect C++ masterpiece, brewing up Java applications, or scripting Python magic, remember that programming languages are the tools that empower us to create amazing things in the digital realm. Embrace the language that speaks to you and keep coding!
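One detail worth seeing for yourself: even though you never run a separate compile step for Python, the standard CPython interpreter does turn source into bytecode before executing it. The short sketch below uses only the built-in `compile` function and the standard-library `dis` module. The snippet in `source` and the names `code_obj` and `namespace` are just illustrative choices.

```python
import dis

# A tiny program held as plain source text.
source = "total = 2 + 3\nprint('total is', total)"

# Step 1: compile the source into a code object that holds bytecode.
code_obj = compile(source, "<example>", "exec")
print(type(code_obj).__name__, "with", len(code_obj.co_code), "bytes of bytecode")

# Step 2: show the bytecode instructions in human-readable form.
# The exact instructions differ between Python versions.
dis.dis(code_obj)

# Step 3: the interpreter's virtual machine executes the bytecode.
namespace = {}
exec(code_obj, namespace)
```

The disassembly printed by `dis.dis` varies by Python version, but in every case the thing that gets executed is the compiled code object, not the raw text.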
What is Kafka?
Kafka is an open-source, distributed event streaming platform that serves as the central nervous system for data in modern enterprises. It's designed to handle real-time data feeds, process them efficiently, and make them immediately available to a variety of applications.
Use Cases:
- Real-time Analytics
- Log Aggregation
- Event Sourcing
- Data Integration
- Machine Learning Pipelines
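The use cases above all build on one idea: Kafka keeps events in an append-only log, and each group of consumers tracks its own read position (an offset) in that log. Below is a minimal in-memory sketch of that idea in plain Python. It is a toy model under stated assumptions, not the Kafka client API: the `TopicLog` class, its methods, and the event names are all hypothetical, and real Kafka adds partitions, replication, and persistence on top.

```python
from collections import defaultdict


class TopicLog:
    """Toy append-only log for a single topic (hypothetical, not a Kafka API)."""

    def __init__(self):
        self._records = []                # events in arrival order, never rewritten
        self._offsets = defaultdict(int)  # consumer group -> next offset to read

    def produce(self, event):
        """Append an event and return the offset it was stored at."""
        self._records.append(event)
        return len(self._records) - 1

    def consume(self, group, max_records=10):
        """Return the next batch for this group and advance only its own offset."""
        start = self._offsets[group]
        batch = self._records[start:start + max_records]
        self._offsets[group] = start + len(batch)
        return batch


log = TopicLog()
for event in ("signup", "login", "purchase"):
    log.produce(event)

# Two independent consumer groups read the same events at their own pace.
analytics_batch = log.consume("analytics")
audit_first = log.consume("audit", max_records=1)
audit_rest = log.consume("audit")
print(analytics_batch, audit_first, audit_rest)
```

Because reading does not remove events, the same stream can feed analytics, log aggregation, and ML pipelines at once, each at its own speed.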
Which programming languages do you use/know?
Anonymous Poll
- Javascript: 29%
- Python: 48%
- Go: 5%
- Java: 30%
- Kotlin: 4%
- C#: 12%
- Swift: 3%
- Ruby: 3%
- C/C++: 42%
- Don't know any / want to study: 15%
Docker Architecture and Components
1. Docker Daemon (dockerd):
- Role: Manages Docker containers on a system.
- Responsibilities: Building, running, and managing containers.
2. Docker Client (docker):
- Role: Interface through which users interact with Docker.
- Commands: build, pull, run, etc.
3. Docker Images:
- Definition: Read-only templates used to create containers.
- Role: Serve as the basis for creating containers.
- Registry/Hub: A storage and distribution system for Docker images.
4. Docker Containers:
- Definition: Runnable instances of Docker images.
- Role: Encapsulate the application and its environment.
5. Docker Registry:
- Role: Stores Docker images.
- Public Registry: Docker Hub.
- Private Registry: Can be hosted by users.
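To see how the five pieces relate, here is a small toy model in Python. It is only a sketch of the relationships listed above, not the Docker SDK or the Docker Engine API. Every class and method name (`Image`, `Container`, `Registry`, `Daemon`, `Client`, `execute`) is hypothetical and chosen for illustration. The client forwards commands, the daemon does the work, images are immutable templates, containers are runnable instances of an image, and the registry stores images.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass(frozen=True)
class Image:
    """Read-only template (frozen, so it cannot be modified after creation)."""
    name: str
    layers: tuple


@dataclass
class Container:
    """Runnable instance created from an image."""
    container_id: int
    image: Image
    running: bool = True


@dataclass
class Registry:
    """Stores images by name, like a public or private registry."""
    images: dict = field(default_factory=dict)

    def push(self, image):
        self.images[image.name] = image

    def pull(self, name):
        return self.images[name]


class Daemon:
    """Does the actual work: keeps a local image cache and manages containers."""

    def __init__(self, registry):
        self.registry = registry
        self.local_images = {}
        self.containers = []
        self._ids = count(1)

    def pull(self, name):
        self.local_images[name] = self.registry.pull(name)

    def run(self, name):
        if name not in self.local_images:  # fetch the image on first use
            self.pull(name)
        container = Container(next(self._ids), self.local_images[name])
        self.containers.append(container)
        return container


class Client:
    """Thin interface: forwards user commands to the daemon."""

    def __init__(self, daemon):
        self.daemon = daemon

    def execute(self, command, name):
        handlers = {"pull": self.daemon.pull, "run": self.daemon.run}
        return handlers[command](name)


hub = Registry()
hub.push(Image("web", ("base-os", "app-code")))
daemon = Daemon(hub)
cli = Client(daemon)

first = cli.execute("run", "web")
second = cli.execute("run", "web")
print(first.container_id, second.container_id, first.image is second.image)
```

Running the same image twice yields two separate containers that share one read-only image, which is the core relationship between items 3 and 4.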