Offshore
Video
memenodes
locking in and seeing no results

Apart from a breakup, what else can make a man be like this?
https://t.co/fEOVkIpgF8
- naiive
tweet
Offshore
Photo
Brady Long
If you’re building agents, this matters more than another bigger model launch.

MiroThinker 1.5 is about agentic density: better reasoning per parameter, lower cost, and more controllable behavior.

Explore: https://t.co/mlOCZjARcI

https://t.co/q3P6xXXYhW

We just flipped the scaling narrative: Agentic Density > Parameter Count.

#MiroThinker 1.5 operationalizes Interactive Scaling—agents that seek evidence, iterate, and revise in real time (with a time-sensitive sandbox to avoid hindsight leakage).

Result: a 30B model hitting frontier-class agentic search at ~$0.07/query (≈20× cheaper than 1T-class baselines).
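
As a rough illustration of what an interactive-scaling loop involves (a hypothetical sketch only; call_model and run_search are placeholder hooks, not MiroThinker's actual interface):

```python
# Hypothetical sketch of an interactive-scaling loop: the agent seeks
# evidence, revises its draft, and stops once the draft looks supported.
# call_model(prompt) -> str and run_search(query) -> list[str] are
# placeholder hooks, not MiroThinker's real API.

def interactive_answer(question, call_model, run_search, max_rounds=5):
    draft = call_model(f"Question: {question}\nGive a first-pass answer.")
    evidence = []
    for _ in range(max_rounds):
        # Ask the model what evidence would best test the current draft.
        query = call_model(
            f"Question: {question}\nDraft: {draft}\n"
            "What single search query would best confirm or refute this draft?"
        )
        evidence.extend(run_search(query))
        # Revise the draft in light of the newly gathered evidence.
        draft = call_model(
            f"Question: {question}\nEvidence: {evidence}\n"
            f"Previous draft: {draft}\nRevise the answer."
        )
        verdict = call_model(
            f"Evidence: {evidence}\nAnswer: {draft}\n"
            "Is the answer fully supported by the evidence? Reply YES or NO."
        )
        if verdict.strip().upper().startswith("YES"):
            break
    return draft
```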

Fully open source. Read more: https://t.co/m4HzxiidRX
Try: https://t.co/dwKmu3O9t7
GH: https://t.co/u4VhL8o8Gt
HF: https://t.co/ClqKRrQn6R
- MiroMindAI
tweet
Offshore
Photo
memenodes
international law:

Until it's done, tell no one https://t.co/hlwQjJdkPP
tweet
Offshore
Photo
memenodes
Me going to another city because I'm too shy to ask the driver to stop https://t.co/ArDJQRHlU1
tweet
Offshore
Photo
memenodes
Me when I say “Internet capital markets” instead of meme coins https://t.co/BFN5fYk1Gd
tweet
Offshore
Photo
God of Prompt
RT @rryssf_: This paper from BMW Group and Korea’s top research institute exposes a blind spot almost every enterprise using LLMs is walking straight into.

We keep talking about “alignment” like it’s a universal safety switch.

It isn’t.

The paper introduces COMPASS, a framework that shows why most AI systems fail not because they’re unsafe, but because they’re misaligned with the organization deploying them.

Here’s the core insight.

LLMs are usually evaluated against generic policies: platform safety rules, abstract ethics guidelines, or benchmark-style refusals.

But real companies don’t run on generic rules.

They run on internal policies:

- compliance manuals
- operational playbooks
- escalation procedures
- legal edge cases
- brand-specific constraints

And these rules are messy, overlapping, conditional, and full of exceptions.

COMPASS is built to test whether a model can actually operate inside that mess.

Not whether it knows policy language, but whether it can apply the right policy, in the right context, for the right reason.

The framework evaluates models on four things that typical benchmarks ignore:

1. policy selection: When multiple internal policies exist, can the model identify which one applies to this situation?

2. policy interpretation: Can it reason through conditionals, exceptions, and vague clauses instead of defaulting to overly safe or overly permissive behavior?

3. conflict resolution: When two rules collide, does the model resolve the conflict the way the organization intends, not the way a generic safety heuristic would?

4. justification: Can the model explain its decision by grounding it in the policy text, rather than producing a confident but untraceable answer?
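
To make these four checks concrete, here is a rough sketch of how a COMPASS-style test case might be represented and scored. The field names and scoring rules below are illustrative assumptions, not the paper's actual schema:

```python
# Rough sketch of a COMPASS-style test case and its scoring along the four
# dimensions above. Field names and scoring rules are illustrative
# assumptions, not the paper's actual schema.

from dataclasses import dataclass

@dataclass
class PolicyCase:
    scenario: str                   # the situation the model must handle
    candidate_policies: list[str]   # internal policies that could apply
    correct_policy: str             # the policy the organization intends
    expected_decision: str          # e.g. "approve", "escalate", "refuse"

@dataclass
class ModelResponse:
    selected_policy: str
    decision: str
    justification: str              # should be grounded in the policy text

def score(case: PolicyCase, resp: ModelResponse) -> dict:
    return {
        # 1. policy selection: did it pick the applicable policy?
        "selection": resp.selected_policy == case.correct_policy,
        # 2/3. interpretation and conflict resolution, proxied here by
        # whether the final decision matches the organization's intent.
        "decision": resp.decision == case.expected_decision,
        # 4. justification: crude proxy for grounding, i.e. does the
        # explanation actually reference the correct policy?
        "grounded": case.correct_policy.lower() in resp.justification.lower(),
    }
```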

One of the most important findings is subtle and uncomfortable:

Most failures were not knowledge failures.

They were reasoning failures.

Models often had access to the correct policy but:

- applied the wrong section
- ignored conditional constraints
- overgeneralized prohibitions
- or defaulted to conservative answers that violated operational goals

From the outside, these responses look “safe.”

From the inside, they’re wrong.

This explains why LLMs pass public benchmarks yet break in real deployments.

They’re aligned to nobody in particular.

The paper’s deeper implication is strategic.

There is no such thing as “aligned once, aligned everywhere.”

A model aligned for an automaker, a bank, a hospital, and a government agency is not one model with different prompts.

It’s four different alignment problems.

COMPASS doesn’t try to fix alignment.

It does something more important for enterprises:
it makes misalignment measurable.

And once misalignment is measurable, it becomes an engineering problem instead of a philosophical one.

That’s the shift this paper quietly pushes.

Alignment isn’t about being safe in the abstract.

It’s about being correct inside a specific organization’s rules.

And until we evaluate that directly, most “production-ready” AI systems are just well-dressed liabilities.
tweet
Offshore
Photo
God of Prompt
RT @godofprompt: 🚨 OpenAI researchers discovered that neural networks can train for thousands of epochs without learning anything.

Then, seemingly all at once, they generalize almost perfectly.

This phenomenon is called "Grokking".

It went from a weird training glitch to a core theory of how models actually learn.

Here’s what changed (and why this matters now):
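
For context, the setup in which grokking was first reported is tiny networks trained on modular arithmetic with strong weight decay. A minimal toy sketch of that recipe (an illustrative assumption, not the original paper's exact configuration) looks roughly like this:

```python
# Toy grokking setup: memorize (a + b) mod P on half of all pairs, then
# watch validation accuracy jump long after training accuracy saturates.
# Strong weight decay is widely reported to be important for the effect.

import torch
import torch.nn as nn

P = 97
pairs = torch.tensor([(a, b) for a in range(P) for b in range(P)])
labels = (pairs[:, 0] + pairs[:, 1]) % P
perm = torch.randperm(len(pairs))
train_idx, val_idx = perm[: len(pairs) // 2], perm[len(pairs) // 2 :]

embed = nn.Embedding(P, 64)
mlp = nn.Sequential(nn.Linear(128, 256), nn.ReLU(), nn.Linear(256, P))
opt = torch.optim.AdamW(
    list(embed.parameters()) + list(mlp.parameters()), lr=1e-3, weight_decay=1.0
)
loss_fn = nn.CrossEntropyLoss()

def accuracy(idx):
    with torch.no_grad():
        logits = mlp(embed(pairs[idx]).flatten(1))
        return (logits.argmax(-1) == labels[idx]).float().mean().item()

for epoch in range(20000):
    opt.zero_grad()
    logits = mlp(embed(pairs[train_idx]).flatten(1))
    loss_fn(logits, labels[train_idx]).backward()
    opt.step()
    if epoch % 1000 == 0:
        # Train accuracy hits ~1.0 early; val accuracy stays near chance for
        # a long time before (hopefully) grokking to near-perfect.
        print(epoch, round(accuracy(train_idx), 3), round(accuracy(val_idx), 3))
```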
tweet
Offshore
Photo
Brady Long
🚨BREAKING: ChatGPT can now edit and create videos for free.

You don’t need fancy software anymore.

Here’s how to do it (in 3 simple steps) 👇 https://t.co/TtEqn0e0YY
tweet
Offshore
Photo
God of Prompt
RT @godofprompt: Turn your ChatGPT into a 200-IQ reasoning machine by adding these settings to your custom instructions: https://t.co/ID437ipyVg
tweet
Offshore
Photo
God of Prompt
The $270 billion shift nobody’s talking about.

AI is moving OUT of the cloud and INTO your devices.

27 billion edge devices already run AI locally - faster than a round trip to any cloud server, with no network latency, completely offline.

Here’s the research that proves it (and what it means for you):
tweet