subho406/OmniNet
Official PyTorch implementation of "OmniNet: A unified architecture for multi-modal multi-task learning" | Authors: Subhojeet Pramanik, Priyanka Agrawal, Aman Hussain
Language: Python
#artificial_intelligence #deep_learning #image_captioning #machine_learning #multimodal_learning #multitask_learning #neural_network #nlp #transformer #video_recognition
Stars: 138 Issues: 0 Forks: 7
https://github.com/subho406/OmniNet
EleutherAI/DALLE-mtf
OpenAI's DALL-E for large-scale training in mesh-tensorflow.
Language: Python
#artificial_intelligence #autoregressive #multimodal #text_to_image #transformers #variational_autoencoder
Stars: 106 Issues: 2 Forks: 11
https://github.com/EleutherAI/DALLE-mtf
lucidrains/CoCa-pytorch
Implementation of CoCa, Contrastive Captioners are Image-Text Foundation Models, in PyTorch
Language: Python
#artificial_intelligence #attention_mechanism #contrastive_learning #deep_learning #image_to_text #multimodal #transformers
Stars: 90 Issues: 0 Forks: 2
https://github.com/lucidrains/CoCa-pytorch
jina-ai/discoart
Create Disco Diffusion artworks in one line
Language: Python
#creative_ai #cross_modal #dalle #diffusion #disco_diffusion #generative_art #multimodal #prompts
Stars: 213 Issues: 2 Forks: 11
https://github.com/jina-ai/discoart
clovaai/donut
Official Implementation of OCR-free Document Understanding Transformer (Donut) and Synthetic Document Generator (SynthDoG), ECCV 2022
Language: Python
#computer_vision #document_ai #eccv_2022 #multimodal_pre_trained_model #nlp #ocr
Stars: 98 Issues: 2 Forks: 5
https://github.com/clovaai/donut
ilaria-manco/multimodal-ml-music
List of academic resources on Multimodal ML for Music
Language: TeX
#academic_publications #awesome_list #multimodal_data #multimodal_deep_learning #multimodal_learning #music_ai #music_information_retrieval #music_research #resources
Stars: 123 Issues: 1 Forks: 7
https://github.com/ilaria-manco/multimodal-ml-music
SkalskiP/courses
This repository is a curated collection of links to various courses and resources about Artificial Intelligence (AI)
Language: Python
#computer_vision #deep_learning #deep_neural_networks #machine_learning #mlops #multimodal #natural_language_processing #nlp #transformers #tutorial
Stars: 323 Issues: 0 Forks: 29
https://github.com/SkalskiP/courses
haotian-liu/LLaVA
Large Language-and-Vision Assistant built towards multimodal GPT-4 level capabilities.
Language: Python
#chatbot #chatgpt #gpt_4 #llama #llava #multimodal
Stars: 716 Issues: 14 Forks: 34
https://github.com/haotian-liu/LLaVA
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
open-mmlab/Multimodal-GPT
Multimodal-GPT
Language: Python
#flamingo #gpt #gpt_4 #llama #multimodal #transformer #vision_and_language
Stars: 244 Issues: 1 Forks: 12
https://github.com/open-mmlab/Multimodal-GPT
X-PLUG/mPLUG-Owl
mPLUG-Owl🦉: Modularization Empowers Large Language Models with Multimodality
Language: Python
#alpaca #chatbot #chatgpt #computer_vision #damo #gpt #gpt4 #gpt4_api #huggingface #instruction_tuning #large_language_models #llama #mplug #mplug_owl #multimodal #pretraining #pytorch #transformer #visual_reasoning #visual_recognition
Stars: 209 Issues: 1 Forks: 9
https://github.com/X-PLUG/mPLUG-Owl