How Far Are We to GPT-4V? Closing the Gap to Commercial Multimodal Models with Open-Source Suites
🖥https://github.com/opengvlab/internvl
GitHub - OpenGVLab/InternVL: [CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. An open-source multimodal chat model approaching GPT-4o performance
Hunyuan-DiT: A Powerful Multi-Resolution Diffusion Transformer with Fine-Grained Chinese Understanding
🖥https://github.com/tencent/hunyuandit
GitHub - Tencent/HunyuanDiT: Hunyuan-DiT : A Powerful Multi-Resolution Diffusion Transformer with Fine-Grained Chinese Understanding
Scaling Synthetic Data Creation with 1,000,000,000 Personas
🖥https://github.com/tencent-ailab/persona-hub
GitHub - tencent-ailab/persona-hub: Official repo for the paper "Scaling Synthetic Data Creation with 1,000,000,000 Personas"
HippoRAG: Neurobiologically Inspired Long-Term Memory for Large Language Models
🖥https://github.com/osu-nlp-group/hipporag
[NeurIPS'24] HippoRAG is a novel RAG framework inspired by human long-term memory that enables LLMs to continuously integrate knowledge across external documents. RAG + Knowledge Graphs + P...
Be-Your-Outpainter: Mastering Video Outpainting through Input-Specific Adaptation
🖥https://github.com/g-u-n/be-your-outpainter
GitHub - G-U-N/Be-Your-Outpainter: [ECCV 2024] Be-Your-Outpainter https://arxiv.org/abs/2403.13745
Agentless: Demystifying LLM-based Software Engineering Agents
🖥https://github.com/OpenAutoCoder/Agentless
GitHub - OpenAutoCoder/Agentless: Agentless🐱: an agentless approach to automatically solve software development problems
Diffusion Forcing: Next-token Prediction Meets Full-Sequence Diffusion
🖥https://github.com/buoyancy99/diffusion-forcing
GitHub - buoyancy99/diffusion-forcing: code for "Diffusion Forcing: Next-token Prediction Meets Full-Sequence Diffusion"
MInference 1.0: Accelerating Pre-filling for Long-Context LLMs via Dynamic Sparse Attention
🖥https://github.com/microsoft/MInference
[NeurIPS'24 Spotlight, ICLR'25, ICML'25] To speed up Long-context LLMs' inference, approximate and dynamic sparse calculate the attention, which reduces inference la...
LivePortrait: Efficient Portrait Animation with Stitching and Retargeting Control
🖥https://github.com/KwaiVGI/LivePortrait
GitHub - KwaiVGI/LivePortrait: Bring portraits to life!
Learning to (Learn at Test Time): RNNs with Expressive Hidden States
🖥https://github.com/test-time-training/ttt-lm-jax
Official JAX implementation of Learning to (Learn at Test Time): RNNs with Expressive Hidden States - test-time-training/ttt-lm-jax
ANOLE: An Open, Autoregressive, Native Large Multimodal Models for Interleaved Image-Text Generation
🖥https://github.com/gair-nlp/anole
GitHub - GAIR-NLP/anole: Anole: An Open, Autoregressive and Native Multimodal Models for Interleaved Image-Text Generation