Helixform/CodeCursor
An extension for using Cursor in Visual Studio Code.
Language: Rust
#chatgpt #extension #gpt_4 #ide #openai #plugin #rust #typescript #visual_studio_code
Stars: 667 Issues: 3 Forks: 16
https://github.com/Helixform/CodeCursor
z-x-yang/Segment-and-Track-Anything
An open-source project for tracking and segmenting any object in videos, either automatically or interactively. Its primary algorithms are the Segment Anything Model (SAM) for key-frame segmentation and Associating Objects with Transformers (AOT) for efficient tracking and propagation.
Language: Jupyter Notebook
#interactive_segmentation #segment_anything #segment_anything_model #video_object_segmentation #visual_object_tracking
Stars: 474 Issues: 8 Forks: 43
https://github.com/z-x-yang/Segment-and-Track-Anything
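
The pipeline described above is a two-stage loop: SAM segments the object on a key frame from a user prompt, and an AOT-style tracker propagates that mask through the remaining frames. A minimal sketch of that flow follows; the `segment_anything` calls use the published SAM predictor API, while `AOTTracker` is a hypothetical stand-in for the project's tracking module, not its actual interface.

```python
# Sketch of the SAM -> AOT pipeline described above.
# The segment_anything calls are the real SAM API; AOTTracker is a
# hypothetical stand-in for the project's tracking module.
import cv2
import numpy as np
from segment_anything import SamPredictor, sam_model_registry

def segment_and_track(video_path: str, point: tuple[int, int]):
    sam = sam_model_registry["vit_h"](checkpoint="sam_vit_h.pth")
    predictor = SamPredictor(sam)

    cap = cv2.VideoCapture(video_path)
    ok, key_frame = cap.read()
    assert ok, "could not read key frame"

    # 1) Key-frame segmentation: SAM turns a single foreground click
    #    into a mask for the target object.
    predictor.set_image(cv2.cvtColor(key_frame, cv2.COLOR_BGR2RGB))
    masks, scores, _ = predictor.predict(
        point_coords=np.array([point]),
        point_labels=np.array([1]),  # 1 = foreground click
        multimask_output=False,
    )

    # 2) Propagation: an AOT-style tracker carries the mask forward
    #    through the rest of the video.
    tracker = AOTTracker()                      # hypothetical
    tracker.add_reference(key_frame, masks[0])  # hypothetical
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        yield tracker.track(frame)              # hypothetical
```
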
X-PLUG/mPLUG-Owl
mPLUG-Owl🦉: Modularization Empowers Large Language Models with Multimodality
Language: Python
#alpaca #chatbot #chatgpt #computer_vision #damo #gpt #gpt4 #gpt4_api #huggingface #instruction_tuning #large_language_models #llama #mplug #mplug_owl #multimodal #pretraining #pytorch #transformer #visual_reasoning #visual_recognition
Stars: 209 Issues: 1 Forks: 9
https://github.com/X-PLUG/mPLUG-Owl
xNul/code-llama-for-vscode
Use Code Llama with Visual Studio Code and the Continue extension. A local LLM alternative to GitHub Copilot.
Language: Python
#assistant #code #code_llama #codellama #continue #continuedev #copilot #llama #llama2 #llamacpp #llm #local #meta #ollama #studio #visual #vscode
Stars: 170 Issues: 3 Forks: 6
https://github.com/xNul/code-llama-for-vscode
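
Setup-wise this is a local model server plus the Continue extension as the editor client. Before wiring up the editor, it can help to confirm the model half works on its own; the sketch below queries Code Llama through Ollama's local REST API (assuming `ollama pull codellama` has been run and the server is on its default port), the same kind of local endpoint Continue gets pointed at.

```python
# Sketch: query a locally served Code Llama via Ollama's REST API.
# Assumes the Ollama server is running on its default port with the
# codellama model pulled.
import json
import urllib.request

payload = {
    "model": "codellama",
    "prompt": "Write a Python function that reverses a string.",
    "stream": False,  # return one JSON object instead of a stream
}
req = urllib.request.Request(
    "http://localhost:11434/api/generate",
    data=json.dumps(payload).encode(),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    print(json.load(resp)["response"])
```
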
roboflow/multimodal-maestro
Effective prompting for Large Multimodal Models like GPT-4 Vision or LLaVA. 🔥
Language: Python
#cross_modal #gpt_4 #gpt_4_vision #instance_segmentation #llava #lmm #multimodality #object_detection #prompt_engineering #segment_anything #vision_language_model #visual_prompting
Stars: 367 Issues: 1 Forks: 23
https://github.com/roboflow/multimodal-maestro
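
The visual-prompting idea here is in the set-of-mark family: regions of the image are overlaid with numbered marks so the model can refer to them by number. Below is a minimal sketch of the model-query half only, using the OpenAI Python SDK's image-message format; the marked-up `marked.jpg` is assumed to already exist (generating marks from segmentation masks is the part the library automates), and the model name reflects the GPT-4 Vision preview era.

```python
# Sketch: ask GPT-4 Vision about an image annotated with numbered marks.
# Assumes marked.jpg already has the numbered region marks drawn on it.
import base64
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

with open("marked.jpg", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode()

response = client.chat.completions.create(
    model="gpt-4-vision-preview",
    messages=[{
        "role": "user",
        "content": [
            {"type": "text",
             "text": "Which numbered mark covers the dog? Answer with the number."},
            {"type": "image_url",
             "image_url": {"url": f"data:image/jpeg;base64,{image_b64}"}},
        ],
    }],
    max_tokens=50,  # the vision preview model needs an explicit budget
)
print(response.choices[0].message.content)
```
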
CircleRadon/Osprey
The code for "Osprey: Pixel Understanding with Visual Instruction Tuning"
Language: Python
#mllm #pixel_understanding #sam #visual_instruction_tuning
Stars: 200 Issues: 1 Forks: 6
https://github.com/CircleRadon/Osprey
voideditor/void
Language: TypeScript
#chatgpt #claude #copilot #cursor #developer_tools #editor #llm #open_source #openai #visual_studio_code #vscode #vscode_extension
Stars: 383 Issues: 1 Forks: 14
https://github.com/voideditor/void
IDEA-Research/DINO-X-API
DINO-X: The World's Top-Performing Vision Model for Open-World Object Detection and Understanding
Language: Python
#open_set_object_detection #open_set_object_segmentation #pose_estimation #region_caption #visual_prompt
Stars: 224 Issues: 3 Forks: 11
https://github.com/IDEA-Research/DINO-X-API
ictnlp/LLaVA-Mini
LLaVA-Mini is a unified large multimodal model (LMM) that efficiently supports understanding of images, high-resolution images, and videos.
Language: Python
#efficient #gpt4o #gpt4v #large_language_models #large_multimodal_models #llama #llava #multimodal #multimodal_large_language_models #video #vision #vision_language_model #visual_instruction_tuning
Stars: 173 Issues: 7 Forks: 11
https://github.com/ictnlp/LLaVA-Mini
liweiphys/layra
LAYRA is a ready-to-use visual RAG system with a complete UI, built with Next.js and FastAPI; it preserves document layout, tables, paragraphs, and graphical elements without structural fragmentation.
Language: TypeScript
#agent #colpali #colqwen #document_parser #fastapi #gpt_4o #knowledge_base #llm #nextjs #pdf_parser #qwen #rag #visual_rag
Stars: 190 Issues: 3 Forks: 15
https://github.com/liweiphys/layra
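
The defining choice in a system like this is that retrieval operates on page images rather than extracted text, which is how layout, tables, and figures survive intact. A minimal sketch of such a visual-RAG loop is below; `embed_page`, `embed_query`, and `ask_vlm` are hypothetical stand-ins for a ColPali/ColQwen-style embedder and an answering VLM, not LAYRA's actual API.

```python
# Sketch of a visual RAG loop: index page *images*, retrieve by embedding
# similarity, and let a VLM read the retrieved pages directly.
# embed_page, embed_query, and ask_vlm are hypothetical stand-ins,
# not LAYRA's API.
import numpy as np

def build_index(pages: list) -> np.ndarray:
    # One embedding per page image; the page is never flattened to text,
    # so tables and figures are preserved at retrieval time.
    return np.stack([embed_page(p) for p in pages])

def answer(question: str, pages: list, index: np.ndarray, k: int = 3) -> str:
    q = embed_query(question)
    # Cosine similarity between the query and every page embedding.
    scores = index @ q / (np.linalg.norm(index, axis=1) * np.linalg.norm(q))
    top = np.argsort(scores)[-k:][::-1]
    # Hand the top-k page images, not extracted text, to the VLM.
    return ask_vlm(question, [pages[i] for i in top])
```
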