Sequoia is co-leading the first funding round for Ineffable Intelligence, a London AI research lab founded by David Silver with a stated goal of building superintelligence through pure reinforcement learning. No pre-training. No human data. The system, which Silver calls a superlearner, learns entirely from the consequences of its own actions in a constructed environment.
Silver's track record makes this bet harder to dismiss than it sounds. At DeepMind, he led AlphaGo, which defeated Lee Sedol in March 2016, and then AlphaGo Zero, which removed human pre-training entirely and pushed the system's Elo rating from roughly 3,700 to over 5,000. Go has approximately 10^170 legal board positions, more than the number of atoms in the observable universe. Self-play mastered it. The original piece walks through exactly how that leap happened and why Silver believes the same approach scales beyond games.
The contrarian case here is specific: LLM-based systems trained on human data may have hard ceilings baked in by that data. Ineffable is testing whether an agent bootstrapped from zero can rediscover language, mathematics, and physics from first principles, and then go further. The timeline is uncertain, and the thesis is unproven at scale. The full piece is worth reading to understand that tension.