God of Prompt
RT @godofprompt: 🚨 DeepMind discovered that neural networks can train for thousands of epochs without learning anything.
Then suddenly, in a single epoch, they generalize perfectly.
This phenomenon is called "Grokking".
It went from a weird training glitch to a core theory of how models actually learn.
Here’s what changed (and why this matters now):
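The delayed-generalization effect described above is easy to probe on small algorithmic tasks. Below is a minimal sketch in PyTorch of the commonly cited setup (modular addition with a tiny network and strong weight decay); the architecture and hyperparameters are illustrative assumptions, not the exact configuration from any particular paper, and whether and when the late jump in test accuracy appears depends on those choices.

```python
# Sketch of a grokking-style experiment: modular addition, tiny MLP,
# strong weight decay. Hyperparameters are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

P = 97  # task: predict (a + b) mod P from the pair (a, b)
pairs = torch.cartesian_prod(torch.arange(P), torch.arange(P))
labels = (pairs[:, 0] + pairs[:, 1]) % P

# Grokking is typically reported when only a fraction of the data is used for training.
perm = torch.randperm(len(pairs))
train_idx, test_idx = perm[: len(pairs) // 2], perm[len(pairs) // 2:]

model = nn.Sequential(
    nn.Linear(2 * P, 256), nn.ReLU(),
    nn.Linear(256, P),
)

def encode(batch):
    # One-hot encode both operands and concatenate.
    return torch.cat(
        [F.one_hot(batch[:, 0], P).float(), F.one_hot(batch[:, 1], P).float()],
        dim=-1,
    )

# Weight decay (here via AdamW) is the ingredient most often linked to the
# late jump in test accuracy.
opt = torch.optim.AdamW(model.parameters(), lr=1e-3, weight_decay=1.0)

for step in range(20_000):
    opt.zero_grad()
    logits = model(encode(pairs[train_idx]))
    loss = F.cross_entropy(logits, labels[train_idx])
    loss.backward()
    opt.step()
    if step % 1000 == 0:
        with torch.no_grad():
            train_acc = (logits.argmax(-1) == labels[train_idx]).float().mean()
            test_acc = (model(encode(pairs[test_idx])).argmax(-1)
                        == labels[test_idx]).float().mean()
        # Typical grokking pattern: train accuracy saturates early, test
        # accuracy sits near chance for a long time, then climbs sharply.
        print(f"step {step:6d}  train {train_acc:.2f}  test {test_acc:.2f}")
```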
memenodes
Porn addiction is so crazy like how you addicted to other nig*as getting pussy?
STOP WATCHING PORN!!
STOP WATCHING PORN!!!
STOP WATCHING PORN!!
YES YOU!!👀👀..STOP IT!!
STOP WATCHING PORN!! - m (@skitzocat)
Brady Long
If you’re building agents, this matters more than yet another bigger-model launch.
MiroThinker 1.5 is about agentic density: better reasoning per parameter, lower cost, and more controllable behavior.
Explore: https://t.co/mlOCZjARcI
https://t.co/q3P6xXXYhW
We just flipped the scaling narrative: Agentic Density > Parameter Count.
#MiroThinker 1.5 operationalizes Interactive Scaling—agents that seek evidence, iterate, and revise in real time (with a time-sensitive sandbox to avoid hindsight leakage).
Result: a 30B model hitting frontier-class agentic search at ~$0.07/query (≈20× cheaper than 1T-class baselines).
Fully open source, read more: https://t.co/m4HzxiidRX
Try: https://t.co/dwKmu3O9t7
GH:https://t.co/u4VhL8o8Gt
HF: https://t.co/ClqKRrQn6R - MiroMindAI
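To make "Interactive Scaling" concrete: the quoted post describes agents that spend inference-time effort seeking evidence, iterating, and revising rather than answering in one shot. The sketch below is a generic evidence-seeking loop with stubbed model and search functions; it is an assumption-laden illustration of that pattern, not MiroThinker's actual implementation or API.

```python
# Generic sketch of an evidence-seeking agent loop ("interactive scaling"
# pattern). model_call and web_search are stubs standing in for a real LLM
# and a real (ideally time-restricted) search tool; not MiroThinker's API.
from dataclasses import dataclass, field

@dataclass
class AgentState:
    question: str
    evidence: list[str] = field(default_factory=list)
    draft: str = ""

def model_call(prompt: str) -> str:
    """Stub for an LLM call; a real agent would query a model here."""
    # Toy policy: ask for one round of evidence, then answer.
    if "Evidence so far: []" in prompt:
        return "SEARCH: example query"
    return "ANSWER: draft grounded in the collected snippets"

def web_search(query: str) -> str:
    """Stub for a search/sandbox tool (time-restricted in the post's setup
    to avoid hindsight leakage)."""
    return f"snippet for {query!r}"

def run_agent(question: str, max_steps: int = 5) -> str:
    state = AgentState(question=question)
    for _ in range(max_steps):
        prompt = (
            f"Question: {state.question}\n"
            f"Evidence so far: {state.evidence}\n"
            f"Current draft: {state.draft}\n"
            "Either request evidence (SEARCH: <query>) or finalize (ANSWER: <text>)."
        )
        action = model_call(prompt)
        if action.startswith("SEARCH:"):
            # Seek evidence, then loop back and revise.
            state.evidence.append(web_search(action.removeprefix("SEARCH:").strip()))
        else:
            state.draft = action.removeprefix("ANSWER:").strip()
            break
    return state.draft

if __name__ == "__main__":
    print(run_agent("What changed in MiroThinker 1.5?"))
```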
memenodes
Me going to another city because I'm too shy to ask the driver to stop https://t.co/ArDJQRHlU1
memenodes
Me when I say “Internet capital markets” instead of meme coins https://t.co/BFN5fYk1Gd
God of Prompt
RT @rryssf_: This paper from BMW Group and Korea’s top research institute exposes a blind spot almost every enterprise using LLMs is walking straight into.
We keep talking about “alignment” like it’s a universal safety switch.
It isn’t.
The paper introduces COMPASS, a framework that shows why most AI systems fail not because they’re unsafe, but because they’re misaligned with the organization deploying them.
Here’s the core insight.
LLMs are usually evaluated against generic policies: platform safety rules, abstract ethics guidelines, or benchmark-style refusals.
But real companies don’t run on generic rules.
They run on internal policies:
- compliance manuals
- operational playbooks
- escalation procedures
- legal edge cases
- brand-specific constraints
And these rules are messy, overlapping, conditional, and full of exceptions.
COMPASS is built to test whether a model can actually operate inside that mess.
Not whether it knows policy language, but whether it can apply the right policy, in the right context, for the right reason.
The framework evaluates models on four things that typical benchmarks ignore:
1. policy selection: When multiple internal policies exist, can the model identify which one applies to this situation?
2. policy interpretation: Can it reason through conditionals, exceptions, and vague clauses instead of defaulting to overly safe or overly permissive behavior?
3. conflict resolution: When two rules collide, does the model resolve the conflict the way the organization intends, not the way a generic safety heuristic would?
4. justification: Can the model explain its decision by grounding it in the policy text, rather than producing a confident but untraceable answer?
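The thread doesn't give COMPASS's concrete scoring format, so here is a hedged sketch of how an evaluation along those four axes might be wired up: each test case carries a gold policy ID, a gold decision, and the clause the decision should be grounded in, and the harness scores a model's structured answer against them. The field names and scoring rules are assumptions for illustration, not the paper's actual schema.

```python
# Hedged sketch of a policy-grounded evaluation harness in the spirit of
# the four axes above (selection, interpretation/decision, justification).
# The schema and scoring are illustrative assumptions, not COMPASS's format.
from dataclasses import dataclass

@dataclass
class PolicyCase:
    scenario: str            # the situation put to the model
    gold_policy_id: str      # which internal policy should govern it
    gold_decision: str       # e.g. "approve", "escalate", "refuse"
    gold_clause: str         # clause the decision must be grounded in

@dataclass
class ModelAnswer:
    policy_id: str           # which policy the model chose to apply
    decision: str            # what it decided
    justification: str       # free-text rationale

def score_case(case: PolicyCase, answer: ModelAnswer) -> dict:
    """Score one case along selection, decision, and grounding."""
    selected_right_policy = answer.policy_id == case.gold_policy_id
    decided_right = answer.decision == case.gold_decision
    # Crude grounding check: does the rationale reference the clause the
    # organization cares about? (A real harness would use a more robust
    # match or a judge model.)
    grounded = case.gold_clause.lower() in answer.justification.lower()
    return {
        "policy_selection": selected_right_policy,
        "decision_correct": decided_right,
        "justified": grounded,
        # The failure mode the thread highlights: right policy, wrong
        # application of it.
        "reasoning_failure": selected_right_policy and not decided_right,
    }

# Example usage with a toy case and a hypothetical model output.
case = PolicyCase(
    scenario="Customer asks for a refund 45 days after purchase.",
    gold_policy_id="returns-v3",
    gold_decision="escalate",
    gold_clause="refunds after 30 days require supervisor approval",
)
answer = ModelAnswer(
    policy_id="returns-v3",
    decision="refuse",
    justification="Refunds are only allowed within 30 days.",
)
print(score_case(case, answer))
# -> selection correct, decision wrong: a reasoning failure, not a knowledge one.
```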
One of the most important findings is subtle and uncomfortable:
Most failures were not knowledge failures.
They were reasoning failures.
Models often had access to the correct policy but:
- applied the wrong section
- ignored conditional constraints
- overgeneralized prohibitions
- or defaulted to conservative answers that violated operational goals
From the outside, these responses look “safe.”
From the inside, they’re wrong.
This explains why LLMs pass public benchmarks yet break in real deployments.
They’re aligned to nobody in particular.
The paper’s deeper implication is strategic.
There is no such thing as “aligned once, aligned everywhere.”
A model aligned for an automaker, a bank, a hospital, and a government agency is not one model with different prompts.
It’s four different alignment problems.
COMPASS doesn’t try to fix alignment.
It does something more important for enterprises:
it makes misalignment measurable.
And once misalignment is measurable, it becomes an engineering problem instead of a philosophical one.
That’s the shift this paper quietly pushes.
Alignment isn’t about being safe in the abstract.
It’s about being correct inside a specific organization’s rules.
And until we evaluate that directly, most “production-ready” AI systems are just well-dressed liabilities.