subho406/OmniNet
Official PyTorch implementation of "OmniNet: A unified architecture for multi-modal multi-task learning" | Authors: Subhojeet Pramanik, Priyanka Agrawal, Aman Hussain
Language: Python
#artificial_intelligence #deep_learning #image_captioning #machine_learning #multimodal_learning #multitask_learning #neural_network #nlp #transformer #video_recognition
Stars: 138 Issues: 0 Forks: 7
https://github.com/subho406/OmniNet
EleutherAI/DALLE-mtf
OpenAI's DALL-E for large-scale training in mesh-tensorflow.
Language: Python
#artificial_intelligence #autoregressive #multimodal #text_to_image #transformers #variational_autoencoder
Stars: 106 Issues: 2 Forks: 11
https://github.com/EleutherAI/DALLE-mtf
lucidrains/CoCa-pytorch
Implementation of CoCa, Contrastive Captioners are Image-Text Foundation Models, in PyTorch
Language: Python
#artificial_intelligence #attention_mechanism #contrastive_learning #deep_learning #image_to_text #multimodal #transformers
Stars: 90 Issues: 0 Forks: 2
https://github.com/lucidrains/CoCa-pytorch
jina-ai/discoart
Create Disco Diffusion artworks in one line
Language: Python
#creative_ai #cross_modal #dalle #diffusion #disco_diffusion #generative_art #multimodal #prompts
Stars: 213 Issues: 2 Forks: 11
https://github.com/jina-ai/discoart
clovaai/donut
Official Implementation of OCR-free Document Understanding Transformer (Donut) and Synthetic Document Generator (SynthDoG), ECCV 2022
Language: Python
#computer_vision #document_ai #eccv_2022 #multimodal_pre_trained_model #nlp #ocr
Stars: 98 Issues: 2 Forks: 5
https://github.com/clovaai/donut
ilaria-manco/multimodal-ml-music
List of academic resources on Multimodal ML for Music
Language: TeX
#academic_publications #awesome_list #multimodal_data #multimodal_deep_learning #multimodal_learning #music_ai #music_information_retrieval #music_research #resources
Stars: 123 Issues: 1 Forks: 7
https://github.com/ilaria-manco/multimodal-ml-music
SkalskiP/courses
This repository is a curated collection of links to various courses and resources about Artificial Intelligence (AI)
Language: Python
#computer_vision #deep_learning #deep_neural_networks #machine_learning #mlops #multimodal #natural_language_processing #nlp #transformers #tutorial
Stars: 323 Issues: 0 Forks: 29
https://github.com/SkalskiP/courses
haotian-liu/LLaVA
Large Language-and-Vision Assistant built towards multimodal GPT-4 level capabilities.
Language: Python
#chatbot #chatgpt #gpt_4 #llama #llava #multimodal
Stars: 716 Issues: 14 Forks: 34
https://github.com/haotian-liu/LLaVA
open-mmlab/Multimodal-GPT
Multimodal-GPT
Language: Python
#flamingo #gpt #gpt_4 #llama #multimodal #transformer #vision_and_language
Stars: 244 Issues: 1 Forks: 12
https://github.com/open-mmlab/Multimodal-GPT
X-PLUG/mPLUG-Owl
mPLUG-Owl🦉: Modularization Empowers Large Language Models with Multimodality
Language: Python
#alpaca #chatbot #chatgpt #computer_vision #damo #gpt #gpt4 #gpt4_api #huggingface #instruction_tuning #large_language_models #llama #mplug #mplug_owl #multimodal #pretraining #pytorch #transformer #visual_reasoning #visual_recognition
Stars: 209 Issues: 1 Forks: 9
https://github.com/X-PLUG/mPLUG-Owl
OpenGVLab/InternChat
InternChat allows you to interact with ChatGPT by clicking, dragging and drawing using a pointing device.
Language: Python
#chatgpt #click #foundation_model #gpt #gpt_4 #gradio #husky #image_captioning #internimage #langchain #llama #llm #multimodal #ocr #sam #segment_anything #vicuna #video #video_generation #vqa
Stars: 231 Issues: 1 Forks: 10
https://github.com/OpenGVLab/InternChat
kyegomez/tree-of-thoughts
Plug-and-play implementation of Tree of Thoughts: Deliberate Problem Solving with Large Language Models
Language: Python
#artificial_intelligence #chatgpt #deep_learning #gpt4 #multimodal #prompt #prompt_engineering #prompt_learning #prompt_tuning
Stars: 366 Issues: 7 Forks: 31
https://github.com/kyegomez/tree-of-thoughts
OFA-Sys/ONE-PEACE
A general representation model across vision, audio, language modalities.
Language: Python
#audio_language #foundation_models #multimodal #representation_learning #vision_language
Stars: 185 Issues: 2 Forks: 5
https://github.com/OFA-Sys/ONE-PEACE
google/break-a-scene
Official implementation for "Break-A-Scene: Extracting Multiple Concepts from a Single Image" [SIGGRAPH Asia 2023]
Language: Python
#deep_learning #diffusion_models #generative_ai #multimodal #text_to_image
Stars: 164 Issues: 1 Forks: 4
https://github.com/google/break-a-scene
lxe/llavavision
A simple "Be My Eyes" web app with a llama.cpp/llava backend
Language: JavaScript
#ai #artificial_intelligence #computer_vision #llama #llamacpp #llm #local_llm #machine_learning #multimodal #webapp
Stars: 284 Issues: 0 Forks: 7
https://github.com/lxe/llavavision