Alaid TechThread

MCP Scanner

A Python tool for scanning MCP (Model Context Protocol) servers and tools for potential security findings. The MCP Scanner combines Cisco AI Defense inspect API, YARA rules and LLM-as-a-judge to detect malicious MCP tools.

https://github.com/cisco-ai-defense/mcp-scanner

GitHub

GitHub - cisco-ai-defense/mcp-scanner: Scan MCP servers for potential threats & security findings.


Alaid TechThread

Out Of Control: How KCFG and KCET Redefine Control Flow Integrity in the Windows Kernel

https://youtu.be/LflYlvJ4vSU

YouTube


Virtual Secure Mode, or VSM, on Windows marked the most significant leap in security innovation in quite some time, allowing the hypervisor to provide unprecedented protection to the Windows OS. With VSM features like Credential Guard, preventing in-memory…


Alaid TechThread

Model Context Protocol (MCP) Security

https://github.com/cosai-oasis/ws4-secure-design-agentic-systems/blob/mcp/model-context-protocol-security.md#model-context-protocol-mcp-security

GitHub

ws4-secure-design-agentic-systems/model-context-protocol-security.md at mcp · cosai-oasis/ws4-secure-design-agentic-systems

Repository for CoSAI Workstream 4, Secure Design Patterns for Agentic Systems - cosai-oasis/ws4-secure-design-agentic-systems


Alaid TechThread

VulHunt: A High-Level Look at Binary Vulnerability Detection

https://www.binarly.io/blog/vulhunt-intro

www.binarly.io

Introducing VulHunt: A High-Level Look at Binary Vulnerability Detection

Software and its supply chain are becoming increasingly complex, with developers relying more and more on third-party frameworks and libraries to ship fast and fulfill business objectives. While the prevalence of such libraries and frameworks is a boon, allowing…


Alaid TechThread

Offensive AI in practice. Nulla at T-Sync Conf, February 7, 2026.

At the conference we are showcasing Nulla, an offensive AI agent that thinks and acts like a real attacker.

What we will be doing at the Nulla booth:

🔍 Analyzing repositories for real vulnerabilities

🔗 Examining API contracts and interaction logic

🚨 Filing and assessing vulnerabilities

👨‍💻 Communicating with the development team

Nulla does not just flag potential problems; it provides a PoV (Proof of Vulnerability) showing that the vulnerability is actually exploitable.

Why this matters:

🧠 scaling security expertise without growing headcount

📚 a single knowledge base assembled by practicing experts

📏 fewer subjective assessments → a single quality standard

Offensive AI changes the very nature of a security engineer's work:
the focus shifts from routine and coverage to the quality of analysis, reasoning, and developing the agent's expertise.

📍 Visit us at the Security booth → https://t-syncconf.ru/program?category=Security
Also on the program: the vulnerability remediation assistant Safeliner, the asset security management platform Diameter, and a lecture series from SolidLab.


Alaid TechThread

https://github.com/lukehinds/nono

nono is a secure, kernel-enforced capability shell for running untrusted AI agents and processes. Unlike policy-based sandboxes that intercept and filter operations, nono leverages OS security primitives (Landlock on Linux, Seatbelt on macOS) to create an environment where unauthorized operations are structurally impossible.

GitHub

GitHub - always-further/nono: Secure, kernel-enforced sandbox for AI agents, MCP and LLM workloads. Capability-based isolation…

Secure, kernel-enforced sandbox for AI agents, MCP and LLM workloads. Capability-based isolation with secure key management and blocking of destructive actions in a zero-trust environment. - always...


Alaid TechThread

The AISLE article is a great example of AI solutions moving beyond lab prototypes and starting to genuinely change the industry.

• Real targets: AI found dozens of vulnerabilities in OpenSSL, curl, the Linux kernel, Chromium, and Firefox that humans had missed for years.
• Quality vs. quantity: while some models bury developers in junk reports (AI slop), advanced systems find complex logic flaws.
• Integration into the workflow: AI now reviews OpenSSL code directly in pull requests, so bugs are caught before they reach a release. All 12 zero-days from the OpenSSL release were found by AISLE's AI.

The value of AI in security lies not in the number of findings but in a complete, closed-loop process for handling vulnerabilities.

https://aisle.com/blog/what-ai-security-research-looks-like-when-it-works

AISLE

What AI Security Research Looks Like When It Works

What a year of finding zero-days in OpenSSL, curl, and the Linux kernel taught us about AI-driven security research done right.


Alaid TechThread

Sift or Get Off the PoC: Applying Information Retrieval to Vulnerability Research with SiftRank

https://github.com/noperator/siftrank

https://arxiv.org/pdf/2512.06155

GitHub

GitHub - noperator/siftrank: Use LLMs to rank anything.


Alaid TechThread

Worlds: A Simulation Engine for Agentic Pentesting

The authors trained a model (~8B parameters) entirely on synthetic data; it went from scratch to full domain compromise on the GOAD benchmark.

https://dreadnode.io/blog/worlds-a-simulation-engine-for-agentic-pentesting

dreadnode.io


An 8B model went from blindly loading Metasploit modules to achieving Domain Admin on GOAD, trained entirely on synthetic data from our world model system.


Alaid TechThread

Introducing BinaryAudit

The BinaryAudit benchmark poses tasks in which an AI agent is given a compiled executable that:
- contains a hidden, artificially planted backdoor
- comes with no source code or debug symbols
The task is to determine whether the binary contains a backdoor and where it is located.

https://quesma.com/blog/introducing-binaryaudit/

Quesma

We hid backdoors in binaries — Opus 4.6 found 49% of them - Quesma Blog

BinaryAudit benchmarks AI agents using Ghidra to find backdoors in compiled binaries of real open-source servers, proxies, and network infrastructure.


Alaid TechThread

AI Cyber Model Arena is a practical benchmark from Wiz Research designed to evaluate the capabilities of AI agents on cybersecurity tasks, particularly in offensive scenarios.

257 tasks in total, across these categories:
- exploits for APIs and web applications
- zero-day vulnerabilities and known CVEs
- cloud environment scenarios (AWS, Azure, GCP, Kubernetes)

The site presents evaluation results for 25 model-agent combinations. Opus 4.6 + Claude Code leads on pass@3.
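
The post does not define pass@3. A common reading is the probability that at least one of three attempts solves a task. Below is a minimal sketch under that assumption, using the widely used unbiased pass@k estimator. The function names and the sample data are hypothetical and are not taken from the Wiz benchmark.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate pass@k for one task from n attempts, c of which succeeded.

    Computes 1 - C(n - c, k) / C(n, k): the probability that a random
    subset of k attempts (out of the n recorded) contains a success.
    """
    if not (0 <= c <= n) or not (1 <= k <= n):
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    if n - c < k:
        # Fewer than k failures exist, so any k attempts include a success.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)


def mean_pass_at_k(results: dict[str, list[bool]], k: int) -> float:
    """Average the per-task pass@k over all tasks in a benchmark run."""
    scores = [pass_at_k(len(r), sum(r), k) for r in results.values()]
    return sum(scores) / len(scores)


# Hypothetical run: two tasks, three attempts each.
sample = {
    "task-a": [False, True, False],   # solved on the second attempt
    "task-b": [False, False, False],  # never solved
}
score = mean_pass_at_k(sample, k=3)
```

With exactly three attempts per task and k=3, the estimator reduces to "did any attempt succeed", so `score` here is simply the fraction of tasks solved at least once.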

https://www.wiz.io/blog/introducing-ai-cyber-model-arena-a-real-world-benchmark-for-ai-agents-in-cybersec

wiz.io

AI Cyber Model Arena: Testing AI Agents in Cybersecurity | Wiz Blog

AI Cyber Model Arena benchmarks AI agents across 257 real-world security challenges spanning zero-days, CVEs, API, web, and cloud security.


Alaid TechThread

Another benchmark, this one from HTB. Nothing fundamentally new: a simple agent with no specialized tooling. Dozens of similar comparisons appeared over the past year.

Two takeaways:
- basic agents (such as Claude Code) are enough to solve entry-level tasks. Results will keep improving as models get better.
- for solving harder tasks well, "just take a top model" is not enough.

https://www.hackthebox.com/blog/ai-range-llm-security-benchmark

Hack The Box

Benchmarking LLMs for cybersecurity: Inside HTB AI Range’s first evaluation

Discover how Hack The Box AI Range benchmarks LLMs in realistic cyber scenarios. Explore the methodology, key findings, and why it sets a new standard for AI security performance.


Some materials about Nulla from the T-Sync conference are available on the blog
