AISecHub
https://linktr.ee/aisechub managed by AISecHub. Sponsored by: innovguard.com
Implementing Secure AI Framework Controls in Google Cloud - New Version

https://services.google.com/fh/files/misc/ociso_2025_saif_cloud_paper.pdf
Skynet Starter Kit - https://media.ccc.de/v/39c3-skynet-starter-kit-from-embodied-ai-jailbreak-to-remote-takeover-of-humanoid-robots#t=26 - CCC

From Embodied AI Jailbreak to Remote Takeover of Humanoid Robots

We present a comprehensive security assessment of Unitree's robotic ecosystem. We identify and exploit multiple security flaws across several communication channels, including Bluetooth, LoRa radio, WebRTC, and cloud management services. Besides pwning several traditional binary and web vulnerabilities, we also exploit the embodied AI agent in the robots, performing prompt injection to achieve root-level remote code execution. Furthermore, we leverage a flaw in the cloud management services to take over any Unitree G1 robot connected to the Internet. By deobfuscating and patching the customized, VM-based obfuscated binaries, we unlock forbidden robotic movements restricted by the vendor firmware on consumer models such as the G1 AIR. We hope our findings offer a roadmap for manufacturers to strengthen robotic designs, while arming researchers and consumers with the critical knowledge to assess security in next-generation robotic systems.
AI-generated content in Wikipedia - https://www.youtube.com/watch?v=fKU0V9hQMnY by @presroi

I successfully failed at a literature-related project and accidentally built a ChatGPT detector. Then I spoke to the people who uploaded ChatGPT-generated content to Wikipedia.

It began as a standard maintenance project: I wanted to write a tool to find and fix broken ISBN references in Wikipedia. Using the built-in checksum, this seemed like a straightforward technical task. I expected to find mostly typos. But I also found texts generated by LLMs. These models are effective at creating plausible-sounding content, but (for now) they often fail to generate correct checksums for identifiers like ISBNs.
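
To show the mechanism the detector relies on, here is a minimal sketch of ISBN-13 checksum validation; the function name and the second, altered example are illustrative and this is not the speaker's actual tool:

```python
def isbn13_is_valid(isbn: str) -> bool:
    """Validate the ISBN-13 checksum: digits get alternating weights 1 and 3,
    and the weighted sum must be divisible by 10."""
    digits = [int(c) for c in isbn if c.isdigit()]
    if len(digits) != 13:
        return False
    total = sum(d * (3 if i % 2 else 1) for i, d in enumerate(digits))
    return total % 10 == 0

print(isbn13_is_valid("978-0-306-40615-7"))  # True: a correctly formed ISBN-13
print(isbn13_is_valid("978-0-306-40615-3"))  # False: plausible-looking, wrong check digit
```

An invented ISBN has only roughly a one-in-ten chance of its check digit coming out right, which is why fabricated references tend to fail this kind of test.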

This vulnerability turned my tool into an unintentional detector for this type of content. This talk is the story of that investigation. I'll show how the tool works and how it identifies this anti-knowledge. But the tech is only half the story. The other half is human. I contacted the editors who had added this undeclared AI content. I will talk about why they did it, how the Wikipedians reacted, and whether "The End is Nigh" calls might be warranted.
AI Agent, AI Spy - https://media.ccc.de/v/39c3-ai-agent-ai-spy#t=246

Agentic AI is the catch-all term for AI-enabled systems that propose to complete more or less complex tasks on their own, without stopping to ask permission or consent. What could go wrong?

These systems are being integrated directly into operating systems and applications, like web browsers. This move represents a fundamental paradigm shift, transforming them from relatively neutral resource managers into an active, goal-oriented infrastructure ultimately controlled by the companies that develop these systems, not by users or application developers.

Systems like Microsoft's "Recall," which create a comprehensive "photographic memory" of all user activity, are marketed as productivity enhancers, but they function as OS-level surveillance and create significant privacy vulnerabilities. In the case of Recall, we’re talking about a centralized, high-value target for attackers that poses an existential threat to the privacy guarantees of meticulously engineered applications like Signal. This shift also fundamentally undermines personal agency, replacing individual choice and discovery with automated, opaque recommendations that can obscure commercial interests and erode individual autonomy.

This talk will review the immediate and serious danger that the rush to shove agents into our devices and digital lives poses to our fundamental right to privacy and our capacity for genuine personal agency. Drawing from Signal's analysis, it moves beyond outlining the problem to also present a "tourniquet" solution: looking at what we need to do *now* to ensure that privacy at the application layer isn’t eliminated, and what the hacker community can do to help. We will outline a path for ensuring developer agency, granular user control, radical transparency, and the role of adversarial research.
Data that is too dirty for "AI" - https://media.ccc.de/v/39c3-a-media-almost-archaeology-on-data-that-is-too-dirty-for-ai#t=202 by #jiawenuffline

In the 1980s, non-white women's body-size data was categorized as dirty data when the first women's sizing system in the US was established. Now, in the age of GPT, what is considered dirty data, and how is it removed from massive training materials?

Training datasets for large models have grown to the scale of (a sizable part of) the internet. Following the idea that "scale averages out noise", these datasets are assembled by scraping whatever data is freely available online and then "cleaned" with a human-not-in-the-loop, cheaper-than-cheap-labor method: heuristic filtering. Heuristics in this context are simply sets of rules that engineers devise from their own imagination and estimation, rules deemed "good enough" from their perspective to remove "dirty data", but not guaranteed to be optimal, perfect, or rational.

The talk will show some intriguing patterns of "dirty data" across 23 extraction-based datasets, such as how NSFW gradually comes to mean NSFTM (not safe for training models). It will reflect on these silent, anonymous yet upheld estimations and not-guaranteed rationalities in current sociotechnical artifacts, and ask for whom these estimations are good enough, as they will soon be part of our technological infrastructure.
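
To make "heuristic filtering" concrete, here is a hypothetical sketch of the rule style such cleaning pipelines use; the thresholds and blocklist are invented for illustration and are not taken from any of the 23 datasets discussed in the talk:

```python
# Hypothetical heuristic filters in the style of web-scale data-cleaning pipelines.
# Thresholds and the word list are illustrative guesses, not from any real dataset.
BLOCKLIST = {"nsfw", "viagra", "casino"}

def keep_document(text: str) -> bool:
    words = text.split()
    if len(words) < 50:                                            # "too short to be useful"
        return False
    if sum(len(w) for w in words) / len(words) > 10:               # "average word too long, probably gibberish"
        return False
    if any(w.lower().strip(".,!?") in BLOCKLIST for w in words):   # crude NSFW / spam filter
        return False
    if text.count("{") + text.count("}") > 20:                     # "probably code or markup"
        return False
    return True
```

Each rule encodes exactly the kind of unexamined judgment the talk questions: who decided 50 words, or that these words mark a document as disposable?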
Breaking BOTS: Cheating at Blue Team CTFs with AI Speed-Runs - CCC
https://media.ccc.de/v/39c3-breaking-bots-cheating-at-blue-team-ctfs-with-ai-speed-runs#t=93

After we announced our results, CTFs like Splunk's Boss of the SOC (BOTS) started prohibiting AI agents. For science & profit, we keep doing it anyway. In BOTS, the AIs solve most of it in under 10 minutes instead of taking the full day. Our recipe was surprisingly simple: teach AI agents to self-plan their investigation steps, adapt their plans to new information, work with the SIEM DB, and reason about log dumps. No exotic models, no massive lab budgets - just publicly available LLMs mixed with a bit of science and perseverance. We'll walk through how that works, including videos of the many ways AI trips itself up that marketers would rather hide, and how to do it at home with free and open-source tools.
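
That recipe can be pictured as a plan, act, replan loop. The sketch below is a hypothetical outline assuming a generic chat-completion LLM (`ask_llm`) and a generic SIEM search client (`run_siem_query`); it is not the speakers' actual tooling:

```python
# Hypothetical plan/act/replan loop for a SIEM-backed investigation agent.
# `ask_llm` and `run_siem_query` stand in for a real LLM API and SIEM client.
from typing import Callable, List

def investigate(question: str,
                ask_llm: Callable[[str], str],
                run_siem_query: Callable[[str], str],
                max_steps: int = 10) -> str:
    notes: List[str] = []
    plan = ask_llm(f"Plan the investigation steps for: {question}")
    for _ in range(max_steps):
        step = ask_llm(
            f"Question: {question}\nPlan: {plan}\nFindings so far: {notes}\n"
            "Write the single next SIEM search query, or ANSWER: <answer> if done."
        )
        if step.startswith("ANSWER:"):
            return step.removeprefix("ANSWER:").strip()
        results = run_siem_query(step)                 # act: run the search
        notes.append(f"{step} -> {results[:2000]}")    # keep a bounded log excerpt
        plan = ask_llm(                                # replan: adapt to new information
            f"Revise the plan given new findings.\nPlan: {plan}\nLatest: {notes[-1]}"
        )
    return "No confident answer within the step budget."

# Usage (with real callables wired in):
# answer = investigate("Which host exfiltrated data?", ask_llm=my_llm, run_siem_query=my_siem)
```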

CTF organizers can't detect this - the arms race is probably over before it really began. But the real question isn't "can we cheat at CTFs?" It's what happens when investigations evolve from analysts-who-investigate to analysts-who-manage-AI-investigators. We'll show you what that transition already looks like today and peek into some uncomfortable questions about what comes next.
VulnLLM-R: Specialized Reasoning LLM for Vulnerability Detection - https://github.com/ucsb-mlsec/VulnLLM-R

Through extensive experiments on SOTA datasets across Python, C/C++, and Java, we show that VulnLLM-R achieves superior effectiveness and efficiency compared to SOTA static analysis tools and to both open-source and commercial large reasoning models.

We further conduct a detailed ablation study to validate the key designs in our training recipe. Finally, we construct an agent scaffold around our model and show that it outperforms CodeQL and AFL++ on real-world projects. Our agent further discovers a set of zero-day vulnerabilities in actively maintained repositories. This work represents a pioneering effort to enable real-world, project-level vulnerability detection using AI agents powered by specialized reasoning models.

More info: https://arxiv.org/pdf/2512.07533
Cupcake - Make AI agents follow the rules - https://github.com/eqtylab/cupcake

Cupcake intercepts agent events and evaluates them against user-defined rules written in Open Policy Agent (OPA) Rego.

Agent actions can be blocked, modified, or auto-corrected by giving the agent helpful feedback. Additional benefits include reactive automation for tasks you don't need to rely on the agent to conduct (like linting after a file edit).
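
As a rough illustration of that intercept-and-evaluate flow (not Cupcake's actual interface; Cupcake expresses its rules in OPA Rego compiled to WebAssembly), the allow/deny pattern looks roughly like this, with hypothetical event fields and an example rule:

```python
# Illustrative sketch of an intercept-and-evaluate guardrail; event fields,
# rule functions, and decisions are hypothetical, not Cupcake's real interface.
from dataclasses import dataclass
from typing import Callable, List, Optional

@dataclass
class AgentEvent:
    tool: str          # e.g. "shell", "file_edit"
    argument: str      # e.g. the command or file path

@dataclass
class Decision:
    action: str                    # "allow", "deny", or "modify"
    feedback: Optional[str] = None  # helpful feedback returned to the agent

Rule = Callable[[AgentEvent], Optional[Decision]]

def deny_destructive_shell(event: AgentEvent) -> Optional[Decision]:
    if event.tool == "shell" and "rm -rf" in event.argument:
        return Decision("deny", feedback="Destructive command blocked by policy.")
    return None

def evaluate(event: AgentEvent, rules: List[Rule]) -> Decision:
    for rule in rules:
        decision = rule(event)
        if decision is not None:
            return decision
    return Decision("allow")

print(evaluate(AgentEvent("shell", "rm -rf /tmp/scratch"), [deny_destructive_shell]).action)  # "deny"
```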

Why Cupcake?

Modern agents are powerful but inconsistent at following operational and security rules, especially as context grows. Cupcake turns the rules you already maintain (e.g., CLAUDE.md, AGENT.md, .cursor/rules) into enforceable guardrails that run before actions execute.

- Multi-harness support with first‑class integrations for Claude Code, Cursor, Factory AI, and OpenCode.

- Governance‑as‑code using OPA/Rego compiled to WebAssembly for fast, sandboxed evaluation.

- Enterprise‑ready controls: allow/deny/review, enriched audit trails for AI SOCs, and proactive warnings.
ARES-Dashboard - AI Red Team Operations Console https://github.com/Arnoldlarry15/ARES-Dashboard

Demo: https://ares-dashboard-mauve.vercel.app/

ARES is an AI Red Team Operations Dashboard for planning, executing, and auditing structured adversarial testing of AI systems across established risk frameworks.

ARES Dashboard is an enterprise-oriented AI red team operations console designed to help security teams, AI safety researchers, and governance programs conduct structured, repeatable, and auditable adversarial testing of AI systems.

ARES provides a centralized workspace for building attack manifests, managing red team campaigns, aligning assessments with recognized frameworks such as OWASP LLM Top 10 and MITRE, and exporting evidence for review and compliance workflows.

The system supports role-based access control, audit logging, persistent campaign storage, and optional AI-assisted scenario generation. A built-in demo mode allows full exploration of core functionality without requiring external API keys.

ARES is designed to serve as the operational execution layer within a broader AI safety and governance ecosystem, enabling disciplined red teaming without automating exploitation or removing human oversight.
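
To make the idea of an attack manifest concrete, here is a hypothetical sketch of what such a structured test plan could contain; the schema and values are invented for illustration and do not reflect ARES's actual data model:

```python
# Hypothetical attack manifest for a red-team campaign; the schema is invented
# for illustration and is not ARES's actual data model.
attack_manifest = {
    "campaign": "customer-support-bot-q1",
    "target": {"system": "support-chatbot", "interface": "chat API"},
    "framework_mappings": ["OWASP LLM01: Prompt Injection", "MITRE ATLAS"],
    "scenarios": [
        {
            "id": "PI-001",
            "objective": "exfiltrate hidden system prompt",
            "technique": "indirect prompt injection via pasted document",
            "success_criteria": "system prompt text appears in the reply",
        },
    ],
    "evidence": {"export_format": "json", "reviewer": "human-in-the-loop"},
}
```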
The Arcanum Prompt Injection Taxonomy v1.5

The Arcanum Prompt Injection Taxonomy v1.5 is a new, open-source, interactive classification of prompt injection attacks against large language models. It organizes the attack surface into four parts: Attack Intents, Techniques, Evasions, and Inputs, with detailed descriptions and real examples to help security teams understand, test, and defend against prompt injection threats.


https://arcanum-sec.github.io/arc_pi_taxonomy/
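
As a sketch of how the four axes could be applied to a single finding, the tagging might look like the following; the labels are illustrative and not drawn from the taxonomy's actual entries:

```python
# Hypothetical tagging of one prompt-injection finding along the taxonomy's
# four axes; the specific labels are illustrative, not official taxonomy entries.
finding = {
    "attack_intent": "data exfiltration",          # what the attacker wants
    "technique": "instruction override",           # how the injected prompt works
    "evasion": "base64-encoded payload",           # how it avoids filters
    "input": "retrieved web page (indirect)",      # where the payload enters
    "evidence": "model echoed private context after rendering the page",
}
```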
Upcoming AI Security Conferences

📅 NHIcon 2026 - The Rise of Agentic AI Security – January 27, 2026 | Virtual | Aembit | https://aembit.io/nhicon/

📅 DiCyFor & AI Security Summit – Singapore 2026 – February 10, 2026 | Singapore | DiCyFor | https://www.dicyfor.com/singapore2026

📅 Après-Cyber Slopes Summit - AI + Cybersecurity – February 25–27, 2026 | Park City, UT, USA | Après-Cyber | https://www.aprescyber.com/

📅 IEEE SaTML 2026 - IEEE Conference on Secure and Trustworthy Machine Learning – March 2026 | Munich, Germany | @satml_conf | https://satml.org/

📅 [un]prompted – The AI Security Practitioner Conference – March 3–4, 2026 | Salesforce Tower, San Francisco, CA, USA | [un]prompted | https://unpromptedcon.org/

📅 AI Security Summit 2026 – March 10, 2026 | Tel Aviv, Israel | Lynx Events | https://events.lynx.co/ai-security-summit/

📅 DiCyFor & AI Security Summit – Bangkok 2026 – March 11, 2026 | Bangkok, Thailand | DiCyFor | https://www.dicyfor.com/bangkok2026

📅 CyberGenAI’2026 - International Conference on Cybersecurity & Generative AI – March 13, 2026 | Chennai, India (Hybrid) | SRMIST Ramapuram | https://fsh.srmrmp.edu.in/department-of-computer-science-applications/cybergenai2026/

📅 DiCyFor & AI Security Summit – Manila 2026 – March 31, 2026 | Manila, Philippines | DiCyFor | https://www.dicyfor.com/manila2026 (DTR Society)

📅 DiCyFor & AI Security Summit – Kuala Lumpur 2026 – April 15, 2026 | Kuala Lumpur, Malaysia | DiCyFor | https://www.dicyfor.com/kualalumpur2026

📅 SANS AI Cybersecurity Summit 2026 – Summit: April 20–21, 2026 | Arlington, VA, USA & Virtual | SANS Institute | https://www.sans.org/cyber-security-training-events/ai-summit-2026 | @SANSInstitute

📅 teissAmsterdam2026 - AI Threats: Detect and Protect – April 22, 2026 | Amsterdam, Netherlands | teiss | https://www.teiss.co.uk/events/teissamsterdam2026

📅 AI Security Summit @ Black Hat Asia 2026 – April 22, 2026 | Singapore | @BlackHatEvents | https://www.blackhat.com/asia-26/ai-security-summit.html
Awesome Cyber Security Newsletters

Periodic cyber security newsletters that capture the latest news, summaries of conference talks, research, best practices, tools, events, vulnerabilities, and analysis of trending threats and attacks

https://github.com/TalEliyahu/awesome-security-newsletters

1200 Stars ⭐️⭐️⭐️⭐️