AISecHub
https://linktr.ee/aisechub managed by AISecHub. Sponsored by: innovguard.com
Implementing Secure AI Framework Controls in Google Cloud - New Version

https://services.google.com/fh/files/misc/ociso_2025_saif_cloud_paper.pdf
Skynet Starter Kit - https://media.ccc.de/v/39c3-skynet-starter-kit-from-embodied-ai-jailbreak-to-remote-takeover-of-humanoid-robots#t=26 - CCC

From Embodied AI Jailbreak to Remote Takeover of Humanoid Robots

We present a comprehensive security assessment of Unitree's robotic ecosystem. We identify and exploit multiple security flaws across several communication channels, including Bluetooth, LoRa radio, WebRTC, and cloud management services. Besides pwning several traditional binary and web vulnerabilities, we also exploit the embodied AI agent in the robots, performing prompt injection to achieve root-level remote code execution. Furthermore, we leverage a flaw in the cloud management services to take over any Unitree G1 robot connected to the Internet. By deobfuscating and patching the customized, VM-based obfuscated binaries, we unlock forbidden robotic movements restricted by the vendor firmware on consumer models such as the G1 AIR. We hope our findings offer a roadmap for manufacturers to strengthen robotic designs, while arming researchers and consumers with the critical knowledge to assess security in next-generation robotic systems.
AI-generated content in Wikipedia - https://www.youtube.com/watch?v=fKU0V9hQMnY by @presroi

I successfully failed at a literature-related project and accidentally built a ChatGPT detector. Then I spoke to the people who uploaded ChatGPT-generated content to Wikipedia.

It began as a standard maintenance project: I wanted to write a tool to find and fix broken ISBN references in Wikipedia. Using the built-in checksum, this seemed like a straightforward technical task. I expected to find mostly typos. But I also found texts generated by LLMs. These models are effective at creating plausible-sounding content, but (for now) they often fail to generate correct checksums for identifiers like ISBNs.
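
To show the mechanism the detector relies on, here is a minimal sketch of ISBN-13 checksum validation; the function name and the second, altered example are illustrative and this is not the speaker's actual tool:

```python
def isbn13_is_valid(isbn: str) -> bool:
    """Validate the ISBN-13 checksum: digits get alternating weights 1 and 3,
    and the weighted sum must be divisible by 10."""
    digits = [int(c) for c in isbn if c.isdigit()]
    if len(digits) != 13:
        return False
    total = sum(d * (3 if i % 2 else 1) for i, d in enumerate(digits))
    return total % 10 == 0

print(isbn13_is_valid("978-0-306-40615-7"))  # True: a correctly formed ISBN-13
print(isbn13_is_valid("978-0-306-40615-3"))  # False: plausible-looking, wrong check digit
```

An invented ISBN has only roughly a one-in-ten chance of its check digit coming out right, which is why fabricated references tend to fail this kind of test.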

This vulnerability turned my tool into an unintentional detector for this type of content. This talk is the story of that investigation. I'll show how the tool works and how it identifies this anti-knowledge. But the tech is only half the story. The other half is human. I contacted the editors who had added this undeclared AI content. I will talk about why they did it, how the Wikipedians reacted, and whether "The End is Nigh" calls might be warranted.
AI Agent, AI Spy - https://media.ccc.de/v/39c3-ai-agent-ai-spy#t=246

Agentic AI is the catch-all term for AI-enabled systems that propose to complete more or less complex tasks on their own, without stopping to ask permission or consent. What could go wrong?

These systems are being integrated directly into operating systems and applications, like web browsers. This move represents a fundamental paradigm shift, transforming them from relatively neutral resource managers into an active, goal-oriented infrastructure ultimately controlled by the companies that develop these systems, not by users or application developers.

Systems like Microsoft's "Recall," which create a comprehensive "photographic memory" of all user activity, are marketed as productivity enhancers, but they function as OS-level surveillance and create significant privacy vulnerabilities. In the case of Recall, we’re talking about a centralized, high-value target for attackers that poses an existential threat to the privacy guarantees of meticulously engineered applications like Signal. This shift also fundamentally undermines personal agency, replacing individual choice and discovery with automated, opaque recommendations that can obscure commercial interests and erode individual autonomy.

This talk will review the immediate and serious danger that the rush to shove agents into our devices and digital lives poses to our fundamental right to privacy and our capacity for genuine personal agency. Drawing from Signal's analysis, it moves beyond outlining the problem to also present a "tourniquet" solution: looking at what we need to do *now* to ensure that privacy at the application layer isn’t eliminated, and what the hacker community can do to help. We will outline a path for ensuring developer agency, granular user control, radical transparency, and the role of adversarial research.
Data that is too dirty for "AI" - https://media.ccc.de/v/39c3-a-media-almost-archaeology-on-data-that-is-too-dirty-for-ai#t=202 by #jiawenuffline

In the 1980s, non-white women's body-size data was categorized as dirty data when the first women's sizing system in the US was established. Now, in the age of GPT, what is considered dirty data, and how is it removed from massive training materials?

Training datasets for large models have grown to the scale of (a sizable part of) the internet. Following the idea that "scale averages out noise", these datasets are assembled by scraping whatever data is freely available online and then "cleaned" with a human-not-in-the-loop, cheaper-than-cheap-labor method: heuristic filtering. Heuristics in this context are simply sets of rules that engineers devise from their own imagination and estimation, rules deemed "good enough" from their perspective to remove "dirty data", but not guaranteed to be optimal, perfect, or rational.

The talk will show some intriguing patterns of "dirty data" across 23 extraction-based datasets, such as how NSFW gradually comes to mean NSFTM (not safe for training models). It will reflect on these silent, anonymous yet upheld estimations and not-guaranteed rationalities in current sociotechnical artifacts, and ask for whom these estimations are good enough, as they will soon be part of our technological infrastructure.
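
To make "heuristic filtering" concrete, here is a hypothetical sketch of the rule style such cleaning pipelines use; the thresholds and blocklist are invented for illustration and are not taken from any of the 23 datasets discussed in the talk:

```python
# Hypothetical heuristic filters in the style of web-scale data-cleaning pipelines.
# Thresholds and the word list are illustrative guesses, not from any real dataset.
BLOCKLIST = {"nsfw", "viagra", "casino"}

def keep_document(text: str) -> bool:
    words = text.split()
    if len(words) < 50:                                            # "too short to be useful"
        return False
    if sum(len(w) for w in words) / len(words) > 10:               # "average word too long, probably gibberish"
        return False
    if any(w.lower().strip(".,!?") in BLOCKLIST for w in words):   # crude NSFW / spam filter
        return False
    if text.count("{") + text.count("}") > 20:                     # "probably code or markup"
        return False
    return True
```

Each rule encodes exactly the kind of unexamined judgment the talk questions: who decided 50 words, or that these words mark a document as disposable?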
Breaking BOTS: Cheating at Blue Team CTFs with AI Speed-Runs - CCC
https://media.ccc.de/v/39c3-breaking-bots-cheating-at-blue-team-ctfs-with-ai-speed-runs#t=93

After we announced our results, CTFs like Splunk's Boss of the SOC (BOTS) started prohibiting AI agents. For science & profit, we keep doing it anyway. In BOTS, the AIs solve most of it in under 10 minutes instead of taking the full day. Our recipe was surprisingly simple: teach AI agents to self-plan their investigation steps, adapt their plans to new information, work with the SIEM DB, and reason about log dumps. No exotic models, no massive lab budgets - just publicly available LLMs mixed with a bit of science and perseverance. We'll walk through how that works, including videos of the many ways AI trips itself up that marketers would rather hide, and how to do it at home with free and open-source tools.
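
That recipe can be pictured as a plan, act, replan loop. The sketch below is a hypothetical outline assuming a generic chat-completion LLM (`ask_llm`) and a generic SIEM search client (`run_siem_query`); it is not the speakers' actual tooling:

```python
# Hypothetical plan/act/replan loop for a SIEM-backed investigation agent.
# `ask_llm` and `run_siem_query` stand in for a real LLM API and SIEM client.
from typing import Callable, List

def investigate(question: str,
                ask_llm: Callable[[str], str],
                run_siem_query: Callable[[str], str],
                max_steps: int = 10) -> str:
    notes: List[str] = []
    plan = ask_llm(f"Plan the investigation steps for: {question}")
    for _ in range(max_steps):
        step = ask_llm(
            f"Question: {question}\nPlan: {plan}\nFindings so far: {notes}\n"
            "Write the single next SIEM search query, or ANSWER: <answer> if done."
        )
        if step.startswith("ANSWER:"):
            return step.removeprefix("ANSWER:").strip()
        results = run_siem_query(step)                 # act: run the search
        notes.append(f"{step} -> {results[:2000]}")    # keep a bounded log excerpt
        plan = ask_llm(                                # replan: adapt to new information
            f"Revise the plan given new findings.\nPlan: {plan}\nLatest: {notes[-1]}"
        )
    return "No confident answer within the step budget."

# Usage (with real callables wired in):
# answer = investigate("Which host exfiltrated data?", ask_llm=my_llm, run_siem_query=my_siem)
```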

CTF organizers can't detect this - the arms race is probably over before it really began. But the real question isn't "can we cheat at CTFs?" It's what happens when investigations evolve from analysts-who-investigate to analysts-who-manage-AI-investigators. We'll show you what that transition already looks like today and peek into some uncomfortable questions about what comes next.
VulnLLM-R: Specialized Reasoning LLM for Vulnerability Detection - https://github.com/ucsb-mlsec/VulnLLM-R

Through extensive experiments on SOTA datasets across Python, C/C++, and Java, we show that VulnLLM-R achieves superior effectiveness and efficiency compared to SOTA static analysis tools and to both open-source and commercial large reasoning models.

We further conduct a detailed ablation study to validate the key designs in our training recipe. Finally, we construct an agent scaffold around our model and show that it outperforms CodeQL and AFL++ on real-world projects. Our agent further discovers a set of zero-day vulnerabilities in actively maintained repositories. This work represents a pioneering effort to enable real-world, project-level vulnerability detection using AI agents powered by specialized reasoning models.

More info: https://arxiv.org/pdf/2512.07533
Cupcake - Make AI agents follow the rules - https://github.com/eqtylab/cupcake

Cupcake intercepts agent events and evaluates them against user-defined rules written in Open Policy Agent (OPA) Rego.

Agent actions can be blocked, modified, or auto-corrected by giving the agent helpful feedback. Additional benefits include reactive automation for tasks you don't need to rely on the agent to conduct (like linting after a file edit).
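
As a rough illustration of that intercept-and-evaluate flow (not Cupcake's actual interface; Cupcake expresses its rules in OPA Rego compiled to WebAssembly), the allow/deny pattern looks roughly like this, with hypothetical event fields and an example rule:

```python
# Illustrative sketch of an intercept-and-evaluate guardrail; event fields,
# rule functions, and decisions are hypothetical, not Cupcake's real interface.
from dataclasses import dataclass
from typing import Callable, List, Optional

@dataclass
class AgentEvent:
    tool: str          # e.g. "shell", "file_edit"
    argument: str      # e.g. the command or file path

@dataclass
class Decision:
    action: str                    # "allow", "deny", or "modify"
    feedback: Optional[str] = None  # helpful feedback returned to the agent

Rule = Callable[[AgentEvent], Optional[Decision]]

def deny_destructive_shell(event: AgentEvent) -> Optional[Decision]:
    if event.tool == "shell" and "rm -rf" in event.argument:
        return Decision("deny", feedback="Destructive command blocked by policy.")
    return None

def evaluate(event: AgentEvent, rules: List[Rule]) -> Decision:
    for rule in rules:
        decision = rule(event)
        if decision is not None:
            return decision
    return Decision("allow")

print(evaluate(AgentEvent("shell", "rm -rf /tmp/scratch"), [deny_destructive_shell]).action)  # "deny"
```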

Why Cupcake?

Modern agents are powerful but inconsistent at following operational and security rules, especially as context grows. Cupcake turns the rules you already maintain (e.g., CLAUDE.md, AGENT.md, .cursor/rules) into enforceable guardrails that run before actions execute.

- Multi-harness support with first‑class integrations for Claude Code, Cursor, Factory AI, and OpenCode.

- Governance‑as‑code using OPA/Rego compiled to WebAssembly for fast, sandboxed evaluation.

- Enterprise‑ready controls: allow/deny/review, enriched audit trails for AI SOCs, and proactive warnings.
ARES-Dashboard - AI Red Team Operations Console https://github.com/Arnoldlarry15/ARES-Dashboard

Demo: https://ares-dashboard-mauve.vercel.app/

ARES is an AI Red Team Operations Dashboard for planning, executing, and auditing structured adversarial testing of AI systems across established risk frameworks.

ARES Dashboard is an enterprise-oriented AI red team operations console designed to help security teams, AI safety researchers, and governance programs conduct structured, repeatable, and auditable adversarial testing of AI systems.

ARES provides a centralized workspace for building attack manifests, managing red team campaigns, aligning assessments with recognized frameworks such as OWASP LLM Top 10 and MITRE, and exporting evidence for review and compliance workflows.

The system supports role-based access control, audit logging, persistent campaign storage, and optional AI-assisted scenario generation. A built-in demo mode allows full exploration of core functionality without requiring external API keys.

ARES is designed to serve as the operational execution layer within a broader AI safety and governance ecosystem, enabling disciplined red teaming without automating exploitation or removing human oversight.
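
To make the idea of an attack manifest concrete, here is a hypothetical sketch of what such a structured test plan could contain; the schema and values are invented for illustration and do not reflect ARES's actual data model:

```python
# Hypothetical attack manifest for a red-team campaign; the schema is invented
# for illustration and is not ARES's actual data model.
attack_manifest = {
    "campaign": "customer-support-bot-q1",
    "target": {"system": "support-chatbot", "interface": "chat API"},
    "framework_mappings": ["OWASP LLM01: Prompt Injection", "MITRE ATLAS"],
    "scenarios": [
        {
            "id": "PI-001",
            "objective": "exfiltrate hidden system prompt",
            "technique": "indirect prompt injection via pasted document",
            "success_criteria": "system prompt text appears in the reply",
        },
    ],
    "evidence": {"export_format": "json", "reviewer": "human-in-the-loop"},
}
```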
The Arcanum Prompt Injection Taxonomy v1.5

The Arcanum Prompt Injection Taxonomy v1.5 is a new, open-source, interactive classification of prompt injection attacks against large language models. It organizes the attack surface into four parts: Attack Intents, Techniques, Evasions, and Inputs, with detailed descriptions and real examples to help security teams understand, test, and defend against prompt injection threats.


https://arcanum-sec.github.io/arc_pi_taxonomy/
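
As a sketch of how the four axes could be applied to a single finding, the tagging might look like the following; the labels are illustrative and not drawn from the taxonomy's actual entries:

```python
# Hypothetical tagging of one prompt-injection finding along the taxonomy's
# four axes; the specific labels are illustrative, not official taxonomy entries.
finding = {
    "attack_intent": "data exfiltration",          # what the attacker wants
    "technique": "instruction override",           # how the injected prompt works
    "evasion": "base64-encoded payload",           # how it avoids filters
    "input": "retrieved web page (indirect)",      # where the payload enters
    "evidence": "model echoed private context after rendering the page",
}
```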
Upcoming AI Security Conferences

📅 NHIcon 2026 - The Rise of Agentic AI Security – January 27, 2026 | Virtual | Aembit | https://aembit.io/nhicon/

📅 DiCyFor & AI Security Summit – Singapore 2026 – February 10, 2026 | Singapore | DiCyFor | https://www.dicyfor.com/singapore2026

📅 Après-Cyber Slopes Summit - AI + Cybersecurity – February 25–27, 2026 | Park City, UT, USA | Après-Cyber | https://www.aprescyber.com/

📅 IEEE SaTML 2026 - IEEE Conference on Secure and Trustworthy Machine Learning – March 2026 | Munich, Germany | @satml_conf | https://satml.org/

📅 [un]prompted – The AI Security Practitioner Conference – March 3–4, 2026 | Salesforce Tower, San Francisco, CA, USA | [un]prompted | https://unpromptedcon.org/

📅 AI Security Summit 2026 – March 10, 2026 | Tel Aviv, Israel | Lynx Events | https://events.lynx.co/ai-security-summit/

📅 DiCyFor & AI Security Summit – Bangkok 2026 – March 11, 2026 | Bangkok, Thailand | DiCyFor | https://www.dicyfor.com/bangkok2026

📅 CyberGenAI’2026 - International Conference on Cybersecurity & Generative AI – March 13, 2026 | Chennai, India (Hybrid) | SRMIST Ramapuram | https://fsh.srmrmp.edu.in/department-of-computer-science-applications/cybergenai2026/

📅 DiCyFor & AI Security Summit – Manila 2026 – March 31, 2026 | Manila, Philippines | DiCyFor | https://www.dicyfor.com/manila2026 (DTR Society)

📅 DiCyFor & AI Security Summit – Kuala Lumpur 2026 – April 15, 2026 | Kuala Lumpur, Malaysia | DiCyFor | https://www.dicyfor.com/kualalumpur2026

📅 SANS AI Cybersecurity Summit 2026 – Summit: April 20–21, 2026 | Arlington, VA, USA & Virtual | SANS Institute | https://www.sans.org/cyber-security-training-events/ai-summit-2026 | @SANSInstitute

📅 teissAmsterdam2026 - AI Threats: Detect and Protect – April 22, 2026 | Amsterdam, Netherlands | teiss | https://www.teiss.co.uk/events/teissamsterdam2026

📅 AI Security Summit @ Black Hat Asia 2026 – April 22, 2026 | Singapore | @BlackHatEvents | https://www.blackhat.com/asia-26/ai-security-summit.html
Awesome Cyber Security Newsletters

Periodic cyber security newsletters that capture the latest news, summaries of conference talks, research, best practices, tools, events, vulnerabilities, and analysis of trending threats and attacks

https://github.com/TalEliyahu/awesome-security-newsletters

1200 Stars ⭐️⭐️⭐️⭐️