NodeShift Announcements Official
22.8K subscribers
45 photos
7 videos
378 links
Decentralized, no-code AI cloud platform that enables one-click deployment of AI agents and LLMs
Download Telegram
Datalab just released their next-generation OCR model — Chandra!

Chandra is a powerful vision-language OCR model built for precise document understanding. It doesn’t just extract text — it reconstructs full document layouts into clean Markdown, HTML, or JSON formats, handling tables, forms, diagrams, handwriting, math equations, and multi-column pages with ease.

Supporting over 40 languages, Chandra achieves an impressive 83.1% overall accuracy on the olmOCR benchmark, outperforming many open and commercial OCR systems.

We’ve just published a comprehensive guide that walks you through everything — from setting up Chandra on a GPU-powered NodeShift Cloud VM, installing dependencies, and running the model with Transformers and vLLM, to launching a full Streamlit web app for interactive document analysis in the browser.

Whether you’re a researcher, developer, or just passionate about document AI, this guide will help you get Chandra running end-to-end — from terminal to web UI.

Check out the full guide here: https://nodeshift.cloud/blog/how-to-install-run-chandra-ocr-locally
1🔥1
Baidu's PaddleOCR-VL is the new SOTA vision-language model redefining document understanding and trending as one of the top OCR models along with big models like DeepSeek OCR.
This is a compact yet insanely capable OCR-VLM that blends:
- NaViT-style dynamic visual encoding
- ERNIE-4.5-0.3B language model
- Support for 109 languages
- Lightning-fast, resource-efficient inference

It doesn’t just read documents, it understands and explains them. From complex tables and formulas to multi-lingual text and charts, PaddleOCR-VL achieves state-of-the-art accuracy while staying lightweight enough for real-world deployment.
At NodeShift, we made it even easier to install, run, and benchmark PaddleOCR-VL locally, so you can experience its power without the complex setup friction.
🔗 Read here: https://nodeshift.cloud/blog/how-to-install-and-run-paddleocr-vl-locally?utm_source=telegram&utm_medium=social&utm_campaign=paddleocr-vl-launch
Kimi Linear by Moonshot AI is the Future of Scalable Attention!
Imagine handling 1 million tokens with 6× faster decoding and 75% less memory, that’s what Kimi Linear delivers.

Built on the groundbreaking Kimi Delta Attention (KDA), it redefines how we process long-context data with unmatched speed, efficiency, and precision.

In our latest guide, we break down how to install and run Kimi Linear locally so you can experience next-gen attention models firsthand, right from your own setup. If you're into LLM research, RL-style reasoning, or long-context applications, this one’s a must-try.
🔗 Read full detailed article here: https://nodeshift.cloud/blog/a-step-by-step-guide-to-install-run-kimi-linear?utm_source=telegram&utm_medium=social&utm_campaign=kimi_linear_launch
3
Meet JanusCoderV-8B — the next leap in visual-programmatic intelligence!

Developed by InternLM, JanusCoderV-8B is an 8-billion-parameter multimodal model built on InternVL-3.5-8B, trained on the massive JANUSCODE-800K corpus. It’s designed to unify vision and code, enabling image-conditioned code generation, visually grounded edits, and UI-to-code translation — all in one model.

What makes it special?
It bridges the gap between visual context and programmatic logic.
Generates HTML/CSS, charts, and interactive elements directly from screenshots or design mockups.
Supports long-context outputs (up to 32K tokens) and runs smoothly on affordable GPUs using 8-bit or BF16 precision.

We’ve just published a new step-by-step guide:

How to Install & Run JanusCoderV-8B Locally — a complete walkthrough that covers:
Setting up a GPU-powered VM on NodeShift Cloud
Installing CUDA 12.1.1, Python 3.11, and PyTorch 2.5.1
Configuring the environment for multimodal inference
Running JanusCoderV-8B to generate image-based code and UI descriptions

Read the full guide here: https://nodeshift.cloud/blog/how-to-install-run-januscoderv-8b-locally
🔥1
AMD Releases Nitro-E — A Lightweight Text-to-Image Diffusion Model

AMD has introduced Nitro-E, a highly efficient text-to-image diffusion model built on the E-MMDiT architecture (~304M parameters). It’s designed for fast, low-cost training and inference, making image generation accessible even on modest GPU setups.

Key Highlights:
🔹 Ultra-light architecture (~304M params) with EMMDiT backbone.
🔹 Base 512px model delivers quality in ~20 steps.
🔹 Distilled 512px variant generates great results in just 4 steps.
🔹 GRPO-tuned checkpoint for improved post-training image quality.
🔹 Fully compatible with both NVIDIA (CUDA) and AMD (ROCm) GPUs.

We’ve just published a complete step-by-step guide covering everything you need to install and run AMD Nitro-E locally.

Inside this guide, you’ll learn:
How to set up a GPU VM on NodeShift Cloud.
Environment setup with Python 3.11, CUDA 12.1, and required libraries.
Installation of PyTorch, Diffusers, and FlashAttention.
Running Nitro-E in multiple modes — base, distilled, and GRPO-tuned.
Generating your first AI image in just a few minutes.

Read the full guide here: https://nodeshift.cloud/blog/how-to-install-run-amd-nitro-e-locally
2🔥1
Media is too big
VIEW IN TELEGRAM
OpenAI just released GPT-OSS-Safeguard — a new era of open-source safety reasoning!

OpenAI’s GPT-OSS-Safeguard 20B and 120B models are purpose-built for Trust & Safety, trained to interpret your own policy text, explain moderation decisions, and let you control reasoning effort (low / medium / high).

🔹 20B → optimized for 16 GB-class GPUs, perfect for low-latency filters & offline labeling
🔹 120B → high-fidelity safety reasoning, fits a single H100 80 GB via MoE + MXFP4 quantization
🔹 Fully open-weight under Apache 2.0, built on the GPT-OSS family
🔹 Requires the Harmony response format for interpretable

We’ve just published a step-by-step guide covering everything you need to:
Deploy GPT-OSS-Safeguard Models
Pull & run the models via Ollama CLI
Launch Open WebUI for a visual chat experience
Explore reasoning depth, labeling workflows, and real-world policy checks

Read the full guide here: https://nodeshift.cloud/blog/how-to-install-run-gpt-oss-safeguard-20b-and-120b-locally
2
Meet VieNeu-TTS, the first-ever Vietnamese Text-to-Speech model that runs entirely on your personal device.
Built on the Qwen 0.5B LLM and fine-tuned from NeuTTS Air, it delivers hyper-realistic, natural voices with real-time inference, no heavy hardware dependency.
If you’re building voice assistants, educational tools, or offline AI applications, VieNeu-TTS is a game-changer for anyone who values speed, privacy, and quality.
In this step-by-step guide, we show you how to install and run VieNeu-TTS locally, and experience the future of Vietnamese voice AI right away.
🔗 Read the full article here: https://nodeshift.cloud/blog/how-to-install-run-vieneu-tts-locally-the-first-realistic-vietnamese-voice-ai?utm_source=telegram&utm_medium=social&utm_campaign=vieneu_tts_launch
👍2
SoulX-Podcast-1.7B — Long-Form, Multi-Speaker TTS Is Here!

SoulX-Podcast-1.7B is a podcast-style TTS model built for long, multi-turn, multi-speaker dialogs. It supports English, Mandarin, and several Chinese dialects (Sichuanese, Henanese, Cantonese), performs zero-shot voice cloning from short clips, and even captures laughter, sighs, and emotions to make speech sound real.

It’s optimized for single-GPU inference, letting you generate entire podcast episodes with expressive delivery, natural tone, and dynamic speaker changes.

Key Highlights:
Long-form conversational TTS with speaker variation
Zero-shot cloning from short reference clips
Multilingual + multi-dialect support
Runs smoothly on 8–24 GB GPUs for smaller use cases
Perfect for podcasts, storytelling, or research in expressive speech

We just published a new step-by-step guide covering:
Complete NodeShift GPU setup (CUDA 12.1.1 devel image)
Python 3.11 + Conda environment setup
Installing PyTorch (cu121) and all dependencies
Pulling base + dialect models from Hugging Face
Running dialogue inference scripts
Launching the Gradio WebUI for real-time podcast generation

Read the full guide here: https://nodeshift.cloud/blog/how-to-install-run-soulx-podcast-1-7b-locally
2
The Future of Expressive AI Voice Generation is here - Meet Maya1 by Maya Research
Imagine a voice model that can laugh, cry, whisper, or sigh, all from a single text prompt describing the type of voice you want.
That’s exactly what Maya1 by Maya Research has to offer - a 3B-parameter speech model built for rich emotional realism, natural language voice design, and real-time streaming using the SNAC neural codec.
With NodeShift cloud you can run it locally on a single GPU, open-source under Apache 2.0.

In our new article, we walk you through how to install and run Maya1 locally, so you can start crafting lifelike, emotionally aware AI voices for your own projects from podcasts to storytelling to research.

🔗 Dive in now → https://nodeshift.cloud/blog/how-to-install-and-run-maya1-locally-create-emotion-rich-ai-voices-in-minutes?utm_source=telegram&utm_medium=social&utm_campaign=maya1_launch&utm_content=blog_post
🔥21
Released just days ago, Aquif 3.5 Plus and Aquif 3.5 Max bring GPT-5-level intelligence, with next-generation reasoning power, and massive 1M-token context windows - all that can be easily run locally with NodeShift.

With hybrid reasoning modes, 3.3B active parameters, and multilingual support, Aquif 3.5 lets you toggle between speed and depth, from lightning-fast inference to deep scientific analysis.
And with NodeShift Cloud, you can deploy and run it effortlessly on your own hardware or custom GPU instances in minutes.

🔗 Read the full guide here: https://nodeshift.cloud/blog/how-to-install-aquif-3-5-plus-max-locally-the-open-source-models-with-gpt-5-level-reasoning?utm_source=telegram&utm_medium=social&utm_campaign=aquif3-5plusmax_blog
3🔥1
You can now run Kimi K2 Thinking - the SOTA and most powerful open-source thinking agent model to date - fully locally!

This isn't just another chat model.
Kimi K2 Thinking by Moonshot AI can reason step-by-step, plan tasks, write code, and autonomously call tools - sustaining 200–300 sequential actions without losing direction.
And with the new GGUF quantized build by Unsloth AI, the massive 1T parameter model (1.09TB) is reduced to ~230GB - while retaining its deep reasoning performance. Meaning: you can actually run it with just a handful of H100s/H200s.

We just published a full guide showing how to install and run Kimi K2 Thinking locally:
• Setup requirements
• Setting up your local/NodeShift GPU environment
• Download & run GGUF with Llama.cpp
• Inference code for reasoning

If you're building autonomous agents, research copilots, or coding assistants, this is one model you’ll definitely want to try.

🔗 Read the full guide here: https://nodeshift.cloud/blog/how-to-install-and-run-kimi-k2-thinking-gguf-locally?utm_source=telegram&utm_medium=social&utm_campaign=kimi-k2-gguf-guide
🔥2
AI at Meta just dropped Omnilingual ASR – an open-source speech recognition model suite built to support 1,600+ languages, including many that have never had reliable ASR support before.

This family combines Wav2Vec2, CTC, and LLM-based architectures to deliver:
Scalable zero-shot transcription for new & low-resource languages
State-of-the-art accuracy with the flagship omniASR_LLM_7B (CER < 10% for ~80% of supported languages)
Easy integration with PyTorch, Fairseq2, and Hugging Face for real-world deployments

We’ve just published a new hands-on guide on how to run Omnilingual ASR on a GPU-powered VM using NodeShift Cloud.

What this guide covers:
GPU configuration recommendations for all Omnilingual ASR model variants
Step-by-step setup on a NodeShift GPU VM (Python 3.11, CUDA, libsndfile, dependencies)
Installing & running omniASR_LLM_7B with the official ASRInferencePipeline
A ready-to-use Gradio WebUI to upload audio (≤40s) and get instant multilingual transcripts
Practical workflow tips for teams, researchers & community language projects

Read the full guide here: https://nodeshift.cloud/blog/how-to-install-run-omnilingual-asr-locally
1
What if you could cut your AI costs in half - just by switching to this new JSON alternative?

Meet TOON (Token-Oriented Object Notation), the smarter, token-optimized alternative to JSON.
It’s built to make your data lighter, faster, and more LLM-friendly, reducing token usage by up to 60% while keeping everything perfectly structured.

With NodeShift Cloud, you can easily test, deploy, and scale TOON-based workflows while further cutting your costs with affordable compute and turning AI token efficiency into real-world performance and savings.

Get Smarter data. Cheaper AI. Faster results.
🔗 Read the full article: https://nodeshift.cloud/blog/how-to-cut-your-ai-costs-in-half-with-toon-the-smarter-token-optimized-alternative-to-json?utm_source=telegram&utm_medium=social&utm_campaign=toon_launch&utm_content=post
👍2
SAP releases its new open-source model — SAP-RPT-1-OSS (formerly ConTextTab)!

SAP-RPT-1-OSS, a table-native, semantics-aware in-context learner for classification and regression tasks. This model brings deep semantic understanding to tabular data by embedding column names and cell values, handling missing data automatically, and scaling performance with context size and bagging.

What makes SAP-RPT-1-OSS special
Trained on real-world tabular datasets (not just synthetic data).
Combines semantic alignment with tabular in-context learning.
Requires no manual preprocessing — simply feed your DataFrame or NumPy array.
Supports both classification and regression natively.
Runs efficiently on a single GPU while maintaining state-of-the-art accuracy.

We’ve just released a comprehensive setup guide that walks you through:
Deploying a GPU-powered NodeShift Virtual Machine
Installing Python 3.11, dependencies, and authenticating with Hugging Face
Cloning and running the SAP-RPT-1-OSS repository
Running a sample classification test to verify everything works

Read the full tutorial here: https://nodeshift.cloud/blog/how-to-install-run-sap-rpt-1-oss-locally
2
What if an AI could not only see an image but also think about it, like reasoning through data charts, solving STEM problems from photos, and finding interesting insights just by looking at your pictures?

That’s exactly what ERNIE-4.5-VL Thinking does.

Built on Baidu’s powerful 28B-parameter architecture, it brings deep multimodal reasoning, fine-grained grounding, and tool-assisted visual understanding to life - all while running efficiently with just 3B active parameters.

In our latest article, we break down how to install and run ERNIE 4.5 VL, Thinking, and how NodeShift Cloud makes deploying this next-gen model effortless for developers building visual-language agents and AI reasoning systems.

🔗 Read the full guide here: https://nodeshift.cloud/blog/how-to-install-and-run-ernie-4-5-vl-thinking-locally?utm_source=telegram&utm_medium=social&utm_campaign=ernie4-5-vl-thinking
🔥1
Every week, HR is forced to spend hours rewriting the same things:
→ New policies
→ Updated job descriptions
→ Internal memos
→ Meeting notes
→ Small announcements that somehow take too long

But HR can’t use public AI tools like everyone else.
Why? Too much sensitive data. Too many risks. Too many compliance blockers.

All the employee data, salaries, performance notes, and internal decisions must stay inside the organisation.
That’s where a fully private, on-prem AI assistant changes everything.

In our latest article, we break down how a platform like NodeShift AI help HR teams securely automate daily workflows with:
- Faster policy drafting
- Accurate job description generation
- Instant memo + meeting summary creation
- Zero data leaving the organisation
- Support for 140+ enterprise-grade AI models (ChatGPT, Claude, Gemini, Mistral, LLaMA, DeepSeek & more)

If you’re in HR or People Ops, this is a must-read.
🔗 Read the full article here: https://nodeshift.cloud/blog/how-hr-teams-can-automate-internal-policies-job-descriptions-memos-using-a-fully-private-ai-assistant?utm_source=telegram&utm_medium=social&utm_campaign=hr_private_ai
Most organisations want to adopt GenAI, but can’t use it.
Why? Because public AI tools transfer your data across borders, often into multiple regions you can’t control.
For teams handling confidential, regulated or government data, that’s a hard NO.

So how do you adopt Generative AI without violating data residency laws?

>> The answer: On-Premise GenAI.

AI that runs fully inside your own servers or private cloud, under your firewall, with zero data leaving your environment.

Countries across the GCC - Saudi Arabia, UAE, Qatar, Bahrain, Oman, Kuwait - follow strict residency laws. Traditional cloud-based AI simply doesn’t comply.

That’s exactly where NodeShift AI steps in. NodeShift AI helps organisations deploy a fully private, on-prem GenAI stack where:
- No prompts leave your network
- No embeddings are stored outside
- No logs or metadata are exposed
- All inference stays inside your own infrastructure

In our latest article, we break down:
🔷 What most companies get wrong when trying to deploy on-prem GenAI
🔷 Architecture patterns that actually work
🔷 How to stay compliant with country-level data residency laws
🔷 How NodeShift AI provides a fully private, production-ready GenAI environment

If your org is exploring private AI, this guide is a solid entrypoint.

🔗 Read here: https://nodeshift.cloud/blog/how-to-deploy-generative-ai-on-prem-without-violating-data-residency-laws?utm_source=telegram&utm_medium=social&utm_campaign=genai_onprem_article
🔥2
Banks process more documents than almost any other industry - credit memos, risk assessments, KYC files, compliance reports, audit responses, and the list goes on.
Yet most of this is still manual.
- Data copied between systems
- Reports reviewed line by line
- The same formats rewritten again and again

And because public AI tools send data outside the organisation, banks simply cannot use them. The compliance risk is too high, in many cases, outright prohibited.
So how do you modernise without breaking rules?
On-prem Large Language Models (LLMs) change everything. A secure, private enterprise AI platform like NodeShift AI allows banks to automate documentation, analysis, and decision workflows - all inside their own environment, with zero data leaving the premises.

In our latest article, we break down:
- How banks can safely use on-prem LLMs for credit memos, risk reviews & KYC
- Why public AI tools pose compliance risks
- How platforms like NodeShift AI make enterprise-grade private AI possible
- If you work in BFSI, compliance, or AI transformation, this is a must-read

🔗 Read the full article: https://nodeshift.cloud/blog/how-banks-can-automate-credit-memos-risk-reviews-kyc-with-fully-compliant-on-prem-ai?utm_source=telegram&utm_medium=social&utm_campaign=onprem_banking_ai
🔥31