Yann LeCun, a prominent figure in AI and deep learning, has repeatedly expressed skepticism toward Reinforcement Learning (RL), especially as a foundation for general intelligence or autonomous agents. His main concerns center on sample efficiency, scalability, and biological plausibility. Here's a breakdown of why LeCun doesn't favor RL as a general learning paradigm:

🔹 1. Sample Inefficiency
LeCun argues that RL is extremely sample-inefficient. Modern RL algorithms often require millions of interactions with an environment to learn even relatively simple tasks—something humans and animals can learn in just a few trials.
“It’s the most inefficient way of learning anything that has ever been invented by humans.” — Yann LeCun

This makes RL impractical for real-world scenarios where data is expensive or interactions are limited.
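To make the scale concrete, here is a minimal sketch: tabular Q-learning on a toy 10-cell corridor. The environment, hyperparameters, and episode count are illustrative assumptions, not anything LeCun has published; the point is simply how many raw interactions even a trivial task consumes:

```python
import numpy as np

# Toy corridor: the agent starts at cell 0; reward +1 only at cell 9.
# Even this trivial task burns thousands of environment steps -- the
# inefficiency LeCun criticizes grows far worse as tasks get realistic.

N_STATES = 10                    # corridor length (illustrative)
ACTIONS = (-1, +1)               # move left / move right
EPSILON, ALPHA, GAMMA = 0.1, 0.5, 0.95

rng = np.random.default_rng(0)
q = np.zeros((N_STATES, len(ACTIONS)))
total_steps = 0

for episode in range(500):
    s = 0
    while s != N_STATES - 1:
        if rng.random() < EPSILON:            # explore
            a = int(rng.integers(len(ACTIONS)))
        else:                                 # exploit, breaking ties randomly
            a = int(rng.choice(np.flatnonzero(q[s] == q[s].max())))
        s_next = min(max(s + ACTIONS[a], 0), N_STATES - 1)
        r = 1.0 if s_next == N_STATES - 1 else 0.0
        # Standard Q-learning update: bootstrap from the best next action.
        q[s, a] += ALPHA * (r + GAMMA * q[s_next].max() - q[s, a])
        s = s_next
        total_steps += 1

print(f"environment interactions used: {total_steps}")
```

A human shown this corridor once would solve it immediately; the agent needs thousands of trials to distill the same policy.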

🔹 2. Not How the Brain Works
LeCun believes that human and animal learning is not primarily driven by reinforcement signals (i.e., rewards or punishments). Instead, he argues the brain relies much more on self-supervised learning—predicting sensory inputs, learning representations, and modeling the world.
He often draws analogies to the brain’s cortex (handling perception and prediction) versus the basal ganglia (handling rewards and actions). In his view:
- The cortex (self-supervised learning) does most of the work.
- The basal ganglia (RL) are only a small part.
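
The contrast is easy to sketch. Below, a model learns from raw observations alone: the training signal is just the error in predicting the next sensory input, with no reward anywhere. (The sine-wave data and the linear predictor are illustrative assumptions.)

```python
import numpy as np

# Self-supervised learning in miniature: the "label" is the next
# observation itself, so no reward function is ever defined.

rng = np.random.default_rng(0)
x = np.sin(np.linspace(0, 20 * np.pi, 5000)) + 0.05 * rng.standard_normal(5000)

w, b, lr = 0.0, 0.0, 0.01
for t in range(len(x) - 1):
    pred = w * x[t] + b          # predict the next sensory input
    err = pred - x[t + 1]        # the prediction error IS the training signal
    w -= lr * err * x[t]         # plain SGD on squared prediction error
    b -= lr * err

print(f"learned w={w:.3f}, b={b:.3f}")   # w ends near 1: adjacent samples are highly correlated
```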

🔹 3. Reward Engineering is Hard
Designing a proper reward function in RL is non-trivial and error-prone. LeCun sees this as a major limitation for applying RL to complex real-world problems.
Badly shaped rewards can lead to unintended behavior, reward hacking, or failure to generalize.
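
A toy calculation makes the failure mode concrete. Suppose (hypothetically) a designer adds a +0.5 per-step "progress" bonus, but the bonus can be re-earned by circling through the same checkpoint. Under discounting, looping forever then beats actually finishing:

```python
# Hypothetical numbers, chosen only to make the failure visible.
# Intended task: reach the goal (terminal reward +10).

GAMMA = 0.99

def discounted_return(rewards, gamma=GAMMA):
    return sum(r * gamma**t for t, r in enumerate(rewards))

# Policy A: go straight to the goal in 5 steps, collecting the bonus en route.
honest = discounted_return([0.5] * 4 + [10.0])

# Policy B: loop through the checkpoint forever, never finishing.
# Geometric series: sum over t of 0.5 * gamma^t = 0.5 / (1 - gamma).
hacked = 0.5 / (1 - GAMMA)

print(f"reach the goal : return = {honest:.2f}")   # ~11.6
print(f"loop forever   : return = {hacked:.2f}")   # 50.0 -- the hack wins
```

The agent is not misbehaving; it is maximizing exactly the reward it was given. The hard part of the problem was quietly moved into the reward design.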

🔹 4. Doesn’t Scale to Complex Tasks
RL has trouble generalizing and scaling to tasks that:
- Have long time horizons
- Require planning
- Involve abstract reasoning
LeCun suggests that more modular, hierarchical, and model-based approaches—particularly self-supervised learning combined with world modeling—are more scalable.
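
Here is a minimal sketch of that model-based alternative: a planner that searches for actions entirely inside a world model and touches the real environment only to act. The toy dynamics, the random-shooting planner, and all constants are illustrative assumptions; in LeCun's proposal the world model would itself be learned via self-supervision:

```python
import numpy as np

def world_model(state, action):
    """Toy 1-D dynamics: position and velocity under a force action."""
    pos, vel = state
    vel = vel + 0.1 * action
    return np.array([pos + 0.1 * vel, vel])

def plan(state, horizon=20, n_candidates=256, rng=np.random.default_rng(0)):
    """Random-shooting planner: imagine trajectories, return the best first action."""
    best_cost, best_first = np.inf, 0.0
    for _ in range(n_candidates):
        actions = rng.uniform(-1, 1, horizon)
        s, cost = state.copy(), 0.0
        for a in actions:            # rollout happens entirely in imagination
            s = world_model(s, a)
            cost += s[0] ** 2        # objective: drive the position to 0
        if cost < best_cost:
            best_cost, best_first = cost, actions[0]
    return best_first

state = np.array([1.0, 0.0])
for step in range(50):               # only these 50 steps touch the "real" world
    state = world_model(state, plan(state))

print(f"final position: {state[0]:+.3f}")   # should settle near 0
```

Notice the division of labor: no reward is ever observed from the environment; the objective is evaluated inside the model, which is where LeCun argues the real intelligence should live.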

🔹 5. Better Alternatives Exist
LeCun strongly advocates for self-supervised learning (SSL) as the future of AI. He sees SSL as:
- More biologically plausible
- More efficient
- More generalizable

He’s also promoting architectures like the Joint Embedding Predictive Architecture (JEPA) and energy-based models that learn by predicting and modeling the world, rather than reacting to rewards.
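
A heavily simplified sketch of the JEPA idea follows (in PyTorch; the sizes, the synthetic data, and the EMA target encoder are illustrative assumptions, and real systems such as I-JEPA are far more elaborate). The key move: predict the representation of a hidden part of the input from a visible part, with the loss computed in embedding space rather than on raw pixels:

```python
import torch
import torch.nn as nn

dim, emb = 32, 16
context_encoder = nn.Sequential(nn.Linear(dim, 64), nn.ReLU(), nn.Linear(64, emb))
target_encoder  = nn.Sequential(nn.Linear(dim, 64), nn.ReLU(), nn.Linear(64, emb))
predictor       = nn.Linear(emb, emb)

# The target encoder is a slow-moving, gradient-free copy -- a common
# anti-collapse device in joint-embedding methods.
target_encoder.load_state_dict(context_encoder.state_dict())
for p in target_encoder.parameters():
    p.requires_grad_(False)

opt = torch.optim.Adam(
    [*context_encoder.parameters(), *predictor.parameters()], lr=1e-3
)

for step in range(200):
    scene = torch.randn(128, dim)                # underlying "world state"
    ctx = scene + 0.1 * torch.randn_like(scene)  # visible view
    tgt = scene + 0.1 * torch.randn_like(scene)  # hidden view to be predicted

    z_pred = predictor(context_encoder(ctx))     # predict the target's embedding
    with torch.no_grad():
        z_tgt = target_encoder(tgt)
    loss = ((z_pred - z_tgt) ** 2).mean()        # loss lives in embedding space
    opt.zero_grad()
    loss.backward()
    opt.step()

    with torch.no_grad():                        # EMA update of the target encoder
        for pt, pc in zip(target_encoder.parameters(), context_encoder.parameters()):
            pt.mul_(0.996).add_(0.004 * pc)
```

Because the loss never touches pixel space, the model is free to discard unpredictable detail and keep only abstract, predictable structure, which is the property LeCun wants in a world model.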

🔹 6. RL Is Useful, But Narrow
To be clear, LeCun doesn’t say RL is useless. He acknowledges it’s very useful in specific domains, like:
- Games (e.g., AlphaGo, Atari)
- Robotics (with careful engineering)
- Bandits and decision-making under uncertainty

But he argues RL should remain a narrow tool, not the foundation of general intelligence.
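
For a feel of where RL-style methods do fit comfortably, here is the classic epsilon-greedy strategy on a 5-armed Bernoulli bandit (arm probabilities and epsilon are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
true_p = np.array([0.2, 0.5, 0.6, 0.75, 0.4])    # hidden payout rates
estimates = np.zeros(5)                          # running reward estimates
pulls = np.zeros(5)
EPSILON = 0.1

for t in range(10_000):
    if rng.random() < EPSILON:
        arm = int(rng.integers(5))               # explore
    else:
        arm = int(estimates.argmax())            # exploit
    reward = float(rng.random() < true_p[arm])   # Bernoulli payout
    pulls[arm] += 1
    estimates[arm] += (reward - estimates[arm]) / pulls[arm]   # incremental mean

print("estimated payout rates:", np.round(estimates, 2))
print("best arm found:", int(estimates.argmax()))              # expect arm 3
```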

🧠 Summary of LeCun’s View:

“You don’t learn to drive by getting a reward every time you stay on the road. You learn by predicting what happens when you turn the wheel or hit the brakes.”


In his vision for autonomous intelligence, world modeling, self-supervised learning, and planning play the central role—not reward maximization.