NodeShift Announcements Official

Datalab just released their next-generation OCR model — Chandra!

Chandra is a powerful vision-language OCR model built for precise document understanding. It doesn’t just extract text — it reconstructs full document layouts into clean Markdown, HTML, or JSON formats, handling tables, forms, diagrams, handwriting, math equations, and multi-column pages with ease.

Supporting over 40 languages, Chandra achieves an impressive 83.1% overall accuracy on the olmOCR benchmark, outperforming many open and commercial OCR systems.

We’ve just published a comprehensive guide that walks you through everything — from setting up Chandra on a GPU-powered NodeShift Cloud VM, installing dependencies, and running the model with Transformers and vLLM, to launching a full Streamlit web app for interactive document analysis in the browser.

Whether you’re a researcher, developer, or just passionate about document AI, this guide will help you get Chandra running end-to-end — from terminal to web UI.

Check out the full guide here: https://nodeshift.cloud/blog/how-to-install-run-chandra-ocr-locally

NodeShift Cloud

How to Install & Run Chandra-OCR Locally?

Chandra is Datalab’s next-generation OCR model built for precise document understanding. It goes beyond simple text extraction — converting images and PDFs into structured Markdown, HTML, or JSON while preserving original layout details like tables, forms…

❤1🔥1

134 views11:40

NodeShift Announcements Official

Baidu's PaddleOCR-VL is the new SOTA vision-language model redefining document understanding and trending as one of the top OCR models along with big models like DeepSeek OCR.
This is a compact yet insanely capable OCR-VLM that blends:
- NaViT-style dynamic visual encoding
- ERNIE-4.5-0.3B language model
- Support for 109 languages
- Lightning-fast, resource-efficient inference

It doesn’t just read documents, it understands and explains them. From complex tables and formulas to multi-lingual text and charts, PaddleOCR-VL achieves state-of-the-art accuracy while staying lightweight enough for real-world deployment.
At NodeShift, we made it even easier to install, run, and benchmark PaddleOCR-VL locally, so you can experience its power without the complex setup friction.
🔗 Read here: https://nodeshift.cloud/blog/how-to-install-and-run-paddleocr-vl-locally?utm_source=telegram&utm_medium=social&utm_campaign=paddleocr-vl-launch

NodeShift Cloud

How to Install and Run PaddleOCR-VL Locally

The field of document understanding has seen a surge of multimodal models, but few manage to balance accuracy, multilingual versatility, and computational efficiency the way PaddleOCR-VL-0.9B does. This state-of-the-art (SOTA) vision-language model from PaddlePaddle…

132 views11:24

NodeShift Announcements Official

Kimi Linear by Moonshot AI is the Future of Scalable Attention!
Imagine handling 1 million tokens with 6× faster decoding and 75% less memory, that’s what Kimi Linear delivers.

Built on the groundbreaking Kimi Delta Attention (KDA), it redefines how we process long-context data with unmatched speed, efficiency, and precision.

In our latest guide, we break down how to install and run Kimi Linear locally so you can experience next-gen attention models firsthand, right from your own setup. If you're into LLM research, RL-style reasoning, or long-context applications, this one’s a must-try.
🔗 Read full detailed article here: https://nodeshift.cloud/blog/a-step-by-step-guide-to-install-run-kimi-linear?utm_source=telegram&utm_medium=social&utm_campaign=kimi_linear_launch

NodeShift Cloud

A Step-By-Step Guide to Install & Run Kimi Linear

In an era where attention mechanisms are redefining efficiency in large language models, Kimi Linear emerges as a breakthrough innovation designed for extreme scalability without compromise. Built upon the novel Kimi Delta Attention (KDA) architecture, it…

❤3

126 views13:06

NodeShift Announcements Official

Meet JanusCoderV-8B — the next leap in visual-programmatic intelligence!

Developed by InternLM, JanusCoderV-8B is an 8-billion-parameter multimodal model built on InternVL-3.5-8B, trained on the massive JANUSCODE-800K corpus. It’s designed to unify vision and code, enabling image-conditioned code generation, visually grounded edits, and UI-to-code translation — all in one model.

What makes it special?
✅ It bridges the gap between visual context and programmatic logic.
✅ Generates HTML/CSS, charts, and interactive elements directly from screenshots or design mockups.
✅ Supports long-context outputs (up to 32K tokens) and runs smoothly on affordable GPUs using 8-bit or BF16 precision.

We’ve just published a new step-by-step guide:

How to Install & Run JanusCoderV-8B Locally — a complete walkthrough that covers:
✅ Setting up a GPU-powered VM on NodeShift Cloud
✅ Installing CUDA 12.1.1, Python 3.11, and PyTorch 2.5.1
✅ Configuring the environment for multimodal inference
✅ Running JanusCoderV-8B to generate image-based code and UI descriptions

Read the full guide here: https://nodeshift.cloud/blog/how-to-install-run-januscoderv-8b-locally

NodeShift Cloud

How to Install & Run JanusCoderV-8B Locally?

JanusCoderV-8B is an 8B multimodal code-intelligence model from InternLM’s JanusCoder suite, built on InternVL-3.5-8B. Trained on JANUSCODE-800K, it unifies visual + programmatic inputs to generate and edit code for charts, interactive web UIs, and animation…

🔥1

170 views07:03

NodeShift Announcements Official

AMD Releases Nitro-E — A Lightweight Text-to-Image Diffusion Model

AMD has introduced Nitro-E, a highly efficient text-to-image diffusion model built on the E-MMDiT architecture (~304M parameters). It’s designed for fast, low-cost training and inference, making image generation accessible even on modest GPU setups.

Key Highlights:
🔹 Ultra-light architecture (~304M params) with EMMDiT backbone.
🔹 Base 512px model delivers quality in ~20 steps.
🔹 Distilled 512px variant generates great results in just 4 steps.
🔹 GRPO-tuned checkpoint for improved post-training image quality.
🔹 Fully compatible with both NVIDIA (CUDA) and AMD (ROCm) GPUs.

We’ve just published a complete step-by-step guide covering everything you need to install and run AMD Nitro-E locally.

Inside this guide, you’ll learn:
✅ How to set up a GPU VM on NodeShift Cloud.
✅ Environment setup with Python 3.11, CUDA 12.1, and required libraries.
✅ Installation of PyTorch, Diffusers, and FlashAttention.
✅ Running Nitro-E in multiple modes — base, distilled, and GRPO-tuned.
✅ Generating your first AI image in just a few minutes.

Read the full guide here: https://nodeshift.cloud/blog/how-to-install-run-amd-nitro-e-locally

NodeShift Cloud

How to Install & Run AMD Nitro-E Locally?

Nitro-E is AMD’s ultra-light text-to-image diffusion family built on E-MMDiT (~304M params). It’s designed for fast, low-cost training/inference: the base 512px model gives strong quality in ~20 steps, while the distilled 512px variant can generate usable…

❤2🔥1

116 views09:36

NodeShift Announcements Official

2:20

Media is too big

VIEW IN TELEGRAM

OpenAI just released GPT-OSS-Safeguard — a new era of open-source safety reasoning!

OpenAI’s GPT-OSS-Safeguard 20B and 120B models are purpose-built for Trust & Safety, trained to interpret your own policy text, explain moderation decisions, and let you control reasoning effort (low / medium / high).

🔹 20B → optimized for 16 GB-class GPUs, perfect for low-latency filters & offline labeling
🔹 120B → high-fidelity safety reasoning, fits a single H100 80 GB via MoE + MXFP4 quantization
🔹 Fully open-weight under Apache 2.0, built on the GPT-OSS family
🔹 Requires the Harmony response format for interpretable

We’ve just published a step-by-step guide covering everything you need to:
✅ Deploy GPT-OSS-Safeguard Models
✅ Pull & run the models via Ollama CLI
✅ Launch Open WebUI for a visual chat experience
✅ Explore reasoning depth, labeling workflows, and real-world policy checks

Read the full guide here: https://nodeshift.cloud/blog/how-to-install-run-gpt-oss-safeguard-20b-and-120b-locally

❤2

114 views09:21

NodeShift Announcements Official

Meet VieNeu-TTS, the first-ever Vietnamese Text-to-Speech model that runs entirely on your personal device.
Built on the Qwen 0.5B LLM and fine-tuned from NeuTTS Air, it delivers hyper-realistic, natural voices with real-time inference, no heavy hardware dependency.
If you’re building voice assistants, educational tools, or offline AI applications, VieNeu-TTS is a game-changer for anyone who values speed, privacy, and quality.
In this step-by-step guide, we show you how to install and run VieNeu-TTS locally, and experience the future of Vietnamese voice AI right away.
🔗 Read the full article here: https://nodeshift.cloud/blog/how-to-install-run-vieneu-tts-locally-the-first-realistic-vietnamese-voice-ai?utm_source=telegram&utm_medium=social&utm_campaign=vieneu_tts_launch

NodeShift Cloud

How to Install & Run VieNeu-TTS Locally: The First Realistic Vietnamese Voice AI

The rapid evolution of Text-to-Speech (TTS) technology has finally reached a milestone for Vietnamese users with VieNeu-TTS, the first-ever Vietnamese TTS model capable of running entirely on personal devices. Fine-tuned from NeuTTS Air, this model brings…

👍2

117 views10:08

NodeShift Announcements Official

SoulX-Podcast-1.7B — Long-Form, Multi-Speaker TTS Is Here!

SoulX-Podcast-1.7B is a podcast-style TTS model built for long, multi-turn, multi-speaker dialogs. It supports English, Mandarin, and several Chinese dialects (Sichuanese, Henanese, Cantonese), performs zero-shot voice cloning from short clips, and even captures laughter, sighs, and emotions to make speech sound real.

It’s optimized for single-GPU inference, letting you generate entire podcast episodes with expressive delivery, natural tone, and dynamic speaker changes.

Key Highlights:
✅ Long-form conversational TTS with speaker variation
✅ Zero-shot cloning from short reference clips
✅ Multilingual + multi-dialect support
✅ Runs smoothly on 8–24 GB GPUs for smaller use cases
✅ Perfect for podcasts, storytelling, or research in expressive speech

We just published a new step-by-step guide covering:
✅ Complete NodeShift GPU setup (CUDA 12.1.1 devel image)
✅ Python 3.11 + Conda environment setup
✅ Installing PyTorch (cu121) and all dependencies
✅ Pulling base + dialect models from Hugging Face
✅ Running dialogue inference scripts
✅ Launching the Gradio WebUI for real-time podcast generation

Read the full guide here: https://nodeshift.cloud/blog/how-to-install-run-soulx-podcast-1-7b-locally

NodeShift Cloud

How to Install & Run SoulX-Podcast-1.7B Locally?

SoulX-Podcast-1.7B is a podcast-style TTS model built for long, multi-turn, multi-speaker dialogs. It supports English, Mandarin, and several Chinese dialects (e.g., Sichuanese, Henanese, Cantonese), does zero-shot voice cloning from short reference clips…

❤2

115 views10:20

NodeShift Announcements Official

The Future of Expressive AI Voice Generation is here - Meet Maya1 by Maya Research
Imagine a voice model that can laugh, cry, whisper, or sigh, all from a single text prompt describing the type of voice you want.
That’s exactly what Maya1 by Maya Research has to offer - a 3B-parameter speech model built for rich emotional realism, natural language voice design, and real-time streaming using the SNAC neural codec.
With NodeShift cloud you can run it locally on a single GPU, open-source under Apache 2.0.

In our new article, we walk you through how to install and run Maya1 locally, so you can start crafting lifelike, emotionally aware AI voices for your own projects from podcasts to storytelling to research.

🔗 Dive in now → https://nodeshift.cloud/blog/how-to-install-and-run-maya1-locally-create-emotion-rich-ai-voices-in-minutes?utm_source=telegram&utm_medium=social&utm_campaign=maya1_launch&utm_content=blog_post

NodeShift Cloud

How to Install and Run Maya1 Locally: Create Emotion-Rich AI Voices in Minutes

When it comes to bringing human emotion into synthetic voices, Maya1 by Maya Research stands out as one of the most expressive open-source speech models ever released. Built for precise voice design and emotional realism, Maya1 allows you to generate lifelike…

🔥2❤1

134 views11:13

NodeShift Announcements Official

Released just days ago, Aquif 3.5 Plus and Aquif 3.5 Max bring GPT-5-level intelligence, with next-generation reasoning power, and massive 1M-token context windows - all that can be easily run locally with NodeShift.

With hybrid reasoning modes, 3.3B active parameters, and multilingual support, Aquif 3.5 lets you toggle between speed and depth, from lightning-fast inference to deep scientific analysis.
And with NodeShift Cloud, you can deploy and run it effortlessly on your own hardware or custom GPU instances in minutes.

🔗 Read the full guide here: https://nodeshift.cloud/blog/how-to-install-aquif-3-5-plus-max-locally-the-open-source-models-with-gpt-5-level-reasoning?utm_source=telegram&utm_medium=social&utm_campaign=aquif3-5plusmax_blog

NodeShift Cloud

How to Install Aquif 3.5 Plus & Max Locally – The Open Source Models with GPT-5-Level Reasoning

Aquif 3.5 series marks a defining moment in open-source AI innovation, blending raw reasoning power, massive context windows, and cutting-edge efficiency into a form you can now run locally. Available in two flagship variants, Aquif 3.5 Plus and Aquif 3.5…

❤3🔥1

153 views10:58

NodeShift Announcements Official

You can now run Kimi K2 Thinking - the SOTA and most powerful open-source thinking agent model to date - fully locally!

This isn't just another chat model.
Kimi K2 Thinking by Moonshot AI can reason step-by-step, plan tasks, write code, and autonomously call tools - sustaining 200–300 sequential actions without losing direction.
And with the new GGUF quantized build by Unsloth AI, the massive 1T parameter model (1.09TB) is reduced to ~230GB - while retaining its deep reasoning performance. Meaning: you can actually run it with just a handful of H100s/H200s.

We just published a full guide showing how to install and run Kimi K2 Thinking locally:
• Setup requirements
• Setting up your local/NodeShift GPU environment
• Download & run GGUF with Llama.cpp
• Inference code for reasoning

If you're building autonomous agents, research copilots, or coding assistants, this is one model you’ll definitely want to try.

🔗 Read the full guide here: https://nodeshift.cloud/blog/how-to-install-and-run-kimi-k2-thinking-gguf-locally?utm_source=telegram&utm_medium=social&utm_campaign=kimi-k2-gguf-guide

NodeShift Cloud

How to Install and Run Kimi K2 Thinking GGUF Locally

Kimi K2 Thinking is one of the most advanced open-source reasoning models available today, combining a Mixture-of-Experts (MoE) architecture with a massive 1 trillion total parameters, yet it efficiently activates only 32 billion parameters per token, delivering…

🔥2

132 views12:41

NodeShift Announcements Official

AI at Meta just dropped Omnilingual ASR – an open-source speech recognition model suite built to support 1,600+ languages, including many that have never had reliable ASR support before.

This family combines Wav2Vec2, CTC, and LLM-based architectures to deliver:
✅ Scalable zero-shot transcription for new & low-resource languages
✅ State-of-the-art accuracy with the flagship omniASR_LLM_7B (CER < 10% for ~80% of supported languages)
✅ Easy integration with PyTorch, Fairseq2, and Hugging Face for real-world deployments

We’ve just published a new hands-on guide on how to run Omnilingual ASR on a GPU-powered VM using NodeShift Cloud.

What this guide covers:
✅ GPU configuration recommendations for all Omnilingual ASR model variants
✅ Step-by-step setup on a NodeShift GPU VM (Python 3.11, CUDA, libsndfile, dependencies)
✅ Installing & running omniASR_LLM_7B with the official ASRInferencePipeline
✅ A ready-to-use Gradio WebUI to upload audio (≤40s) and get instant multilingual transcripts
✅ Practical workflow tips for teams, researchers & community language projects

Read the full guide here: https://nodeshift.cloud/blog/how-to-install-run-omnilingual-asr-locally

NodeShift Cloud

How to Install & Run Omnilingual ASR Locally?

Omnilingual ASR is Meta’s groundbreaking open-source speech recognition system built to support over 1,600 languages, including hundreds never before covered by any ASR model. It’s designed for inclusivity — allowing new languages to be added with just a…

❤1

111 views08:38

NodeShift Announcements Official

What if you could cut your AI costs in half - just by switching to this new JSON alternative?

Meet TOON (Token-Oriented Object Notation), the smarter, token-optimized alternative to JSON.
It’s built to make your data lighter, faster, and more LLM-friendly, reducing token usage by up to 60% while keeping everything perfectly structured.

With NodeShift Cloud, you can easily test, deploy, and scale TOON-based workflows while further cutting your costs with affordable compute and turning AI token efficiency into real-world performance and savings.

Get Smarter data. Cheaper AI. Faster results.
🔗 Read the full article: https://nodeshift.cloud/blog/how-to-cut-your-ai-costs-in-half-with-toon-the-smarter-token-optimized-alternative-to-json?utm_source=telegram&utm_medium=social&utm_campaign=toon_launch&utm_content=post

NodeShift Cloud

How to Cut Your AI Costs in Half with TOON – The Smarter, Token-Optimized Alternative to JSON

Every token you send to an AI model costs money, and when your application scales, those costs can balloon fast. That’s where Token-Oriented Object Notation (TOON) steps in, offering a revolutionary way to save on API expenses without sacrificing data clarity…

👍2

129 views11:11

NodeShift Announcements Official

SAP releases its new open-source model — SAP-RPT-1-OSS (formerly ConTextTab)!

SAP-RPT-1-OSS, a table-native, semantics-aware in-context learner for classification and regression tasks. This model brings deep semantic understanding to tabular data by embedding column names and cell values, handling missing data automatically, and scaling performance with context size and bagging.

What makes SAP-RPT-1-OSS special
✅ Trained on real-world tabular datasets (not just synthetic data).
✅ Combines semantic alignment with tabular in-context learning.
✅ Requires no manual preprocessing — simply feed your DataFrame or NumPy array.
✅ Supports both classification and regression natively.
✅ Runs efficiently on a single GPU while maintaining state-of-the-art accuracy.

We’ve just released a comprehensive setup guide that walks you through:
✅ Deploying a GPU-powered NodeShift Virtual Machine
✅ Installing Python 3.11, dependencies, and authenticating with Hugging Face
✅ Cloning and running the SAP-RPT-1-OSS repository
✅ Running a sample classification test to verify everything works

Read the full tutorial here: https://nodeshift.cloud/blog/how-to-install-run-sap-rpt-1-oss-locally

NodeShift Cloud

How to Install & Run SAP-RPT-1-OSS Locally?

sap-rpt-1-oss is SAP’s table-native, semantics-aware in-context learner for classification and regression. It embeds column names and cell values (no manual preprocessing), handles missing data, and scales quality with context size and bagging. For peak accuracy…

❤2

144 views10:04

NodeShift Announcements Official

What if an AI could not only see an image but also think about it, like reasoning through data charts, solving STEM problems from photos, and finding interesting insights just by looking at your pictures?

That’s exactly what ERNIE-4.5-VL Thinking does.

Built on Baidu’s powerful 28B-parameter architecture, it brings deep multimodal reasoning, fine-grained grounding, and tool-assisted visual understanding to life - all while running efficiently with just 3B active parameters.

In our latest article, we break down how to install and run ERNIE 4.5 VL, Thinking, and how NodeShift Cloud makes deploying this next-gen model effortless for developers building visual-language agents and AI reasoning systems.

🔗 Read the full guide here: https://nodeshift.cloud/blog/how-to-install-and-run-ernie-4-5-vl-thinking-locally?utm_source=telegram&utm_medium=social&utm_campaign=ernie4-5-vl-thinking

NodeShift Cloud

How to Install and Run Ernie-4.5-VL-Thinking Locally

When multimodal intelligence meets deep reasoning, ERNIE-4.5-VL-28B-A3B-Thinking emerges as a latest game-changer. Built on Baidu’s powerful ERNIE-4.5-VL-28B-A3B architecture, this next-generation model redefines how AI perceives and reasons across both text…

🔥1

228 views10:35

NodeShift Announcements Official

Every week, HR is forced to spend hours rewriting the same things:
→ New policies
→ Updated job descriptions
→ Internal memos
→ Meeting notes
→ Small announcements that somehow take too long

But HR can’t use public AI tools like everyone else.
Why? Too much sensitive data. Too many risks. Too many compliance blockers.

All the employee data, salaries, performance notes, and internal decisions must stay inside the organisation.
That’s where a fully private, on-prem AI assistant changes everything.

In our latest article, we break down how a platform like NodeShift AI help HR teams securely automate daily workflows with:
- Faster policy drafting
- Accurate job description generation
- Instant memo + meeting summary creation
- Zero data leaving the organisation
- Support for 140+ enterprise-grade AI models (ChatGPT, Claude, Gemini, Mistral, LLaMA, DeepSeek & more)

If you’re in HR or People Ops, this is a must-read.
🔗 Read the full article here: https://nodeshift.cloud/blog/how-hr-teams-can-automate-internal-policies-job-descriptions-memos-using-a-fully-private-ai-assistant?utm_source=telegram&utm_medium=social&utm_campaign=hr_private_ai

NodeShift Cloud

How HR Teams Can Automate Internal Policies, Job Descriptions, & Memos Using a Fully Private AI Assistant

HR teams spend a large part of their week writing. New policies. Updated job descriptions. Internal memos. Meeting summaries. Even small announcements take time when you repeat the same steps every day. Most HR departments want help, but they cannot use public…

203 views10:40

NodeShift Announcements Official

Most organisations want to adopt GenAI, but can’t use it.
Why? Because public AI tools transfer your data across borders, often into multiple regions you can’t control.
For teams handling confidential, regulated or government data, that’s a hard NO.

So how do you adopt Generative AI without violating data residency laws?

>> The answer: On-Premise GenAI.

AI that runs fully inside your own servers or private cloud, under your firewall, with zero data leaving your environment.

Countries across the GCC - Saudi Arabia, UAE, Qatar, Bahrain, Oman, Kuwait - follow strict residency laws. Traditional cloud-based AI simply doesn’t comply.

That’s exactly where NodeShift AI steps in. NodeShift AI helps organisations deploy a fully private, on-prem GenAI stack where:
- No prompts leave your network
- No embeddings are stored outside
- No logs or metadata are exposed
- All inference stays inside your own infrastructure

In our latest article, we break down:
🔷 What most companies get wrong when trying to deploy on-prem GenAI
🔷 Architecture patterns that actually work
🔷 How to stay compliant with country-level data residency laws
🔷 How NodeShift AI provides a fully private, production-ready GenAI environment

If your org is exploring private AI, this guide is a solid entrypoint.

🔗 Read here: https://nodeshift.cloud/blog/how-to-deploy-generative-ai-on-prem-without-violating-data-residency-laws?utm_source=telegram&utm_medium=social&utm_campaign=genai_onprem_article

NodeShift Cloud

How to Deploy Generative AI On-Prem Without Violating Data Residency Laws

Most organisations want to use modern Generative AI, but many of them can’t use public AI tools. The reason is simple: these tools process data outside the country, outside the organisation’s control, and sometimes even across multiple global regions. For…

🔥2

311 views10:23

NodeShift Announcements Official

Banks process more documents than almost any other industry - credit memos, risk assessments, KYC files, compliance reports, audit responses, and the list goes on.
Yet most of this is still manual.
- Data copied between systems
- Reports reviewed line by line
- The same formats rewritten again and again

And because public AI tools send data outside the organisation, banks simply cannot use them. The compliance risk is too high, in many cases, outright prohibited.
So how do you modernise without breaking rules?
On-prem Large Language Models (LLMs) change everything. A secure, private enterprise AI platform like NodeShift AI allows banks to automate documentation, analysis, and decision workflows - all inside their own environment, with zero data leaving the premises.

In our latest article, we break down:
- How banks can safely use on-prem LLMs for credit memos, risk reviews & KYC
- Why public AI tools pose compliance risks
- How platforms like NodeShift AI make enterprise-grade private AI possible
- If you work in BFSI, compliance, or AI transformation, this is a must-read

🔗 Read the full article: https://nodeshift.cloud/blog/how-banks-can-automate-credit-memos-risk-reviews-kyc-with-fully-compliant-on-prem-ai?utm_source=telegram&utm_medium=social&utm_campaign=onprem_banking_ai

🔥3❤1

319 views10:21

About

Blog

Apps

Platform