The government-sponsored Estonian Language Institute (ELI) has published a benchmark that ranks dozens of LLMs on their resistance to Russian strategic narratives. It covers 14 propaganda categories, including Crimea's status, justifications for the war in Ukraine, NATO history, and the Soviet annexation of the Baltic states during WWII.

The methodology is the notable part. ELI and the volunteer defense collective Propastop tested models with three question types: neutral, false-assumption-laden, and explicitly manipulative. Models were queried in English, Estonian, and Russian, with no web search or external tools allowed. A separate AI model, calibrated against Propastop's human experts, then scored the responses.
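To make the shape of that test design concrete, here is a minimal Python sketch of the evaluation grid. It assumes every narrative category is crossed with every question type and every language, which the summary does not confirm. The `EvalCase` class, the `build_grid` helper, and the category identifiers are hypothetical names chosen for illustration; only the four categories named above are listed, and the number of prompts per cell is not known.

```python
from dataclasses import dataclass
from itertools import product

# From the summary: three question framings and three prompt languages.
QUESTION_TYPES = ("neutral", "false_assumption", "manipulative")
LANGUAGES = ("en", "et", "ru")  # English, Estonian, Russian


@dataclass(frozen=True)
class EvalCase:
    """One cell of the hypothetical evaluation grid."""
    category: str
    question_type: str
    language: str


def build_grid(categories):
    """Return one EvalCase per (category, question type, language) combination.

    Assumption: the benchmark is fully crossed. The real design may differ.
    """
    return [
        EvalCase(category, question_type, language)
        for category, question_type, language in product(
            categories, QUESTION_TYPES, LANGUAGES
        )
    ]


# Only the four categories the summary names; the benchmark has 14 in total.
# These identifiers are illustrative, not the benchmark's own labels.
named_categories = [
    "crimea_status",
    "ukraine_war_justifications",
    "nato_history",
    "baltic_annexation",
]

grid = build_grid(named_categories)
print(len(grid))  # 4 categories * 3 question types * 3 languages = 36
```

Under the same fully crossed assumption, all 14 categories would give 126 cells, each needing at least one prompt and one judged response per model.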

The full benchmark is public, and the category breakdown is worth examining closely. The 14 narrative categories indicate which geopolitical pressure points concern Estonia most, and the trilingual testing shows whether a model behaves differently depending on the language of the prompt, a gap that many LLM safety evaluations overlook.
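One way to look for a language effect is to average a model's scores per prompt language and compare the best and worst averages. The sketch below is an assumption about how a reader might analyze the published scores, not the benchmark's own method. The function names, the `(language, score)` record layout, the 0 to 10 scale, and the toy values are all hypothetical, and the placeholder language labels are deliberately not tied to real languages.

```python
from collections import defaultdict
from statistics import mean


def per_language_means(records):
    """Map each language label to the mean of its scores.

    `records` is an iterable of (language, score) pairs (hypothetical layout).
    """
    scores_by_language = defaultdict(list)
    for language, score in records:
        scores_by_language[language].append(score)
    return {lang: mean(scores) for lang, scores in scores_by_language.items()}


def language_gap(records):
    """Difference between the highest and lowest per-language mean score."""
    means = per_language_means(records)
    return max(means.values()) - min(means.values())


# Made-up scores on an assumed 0-10 scale, for illustration only.
# These are NOT benchmark results.
toy_records = [
    ("lang_a", 9), ("lang_a", 7),
    ("lang_b", 8), ("lang_b", 6),
    ("lang_c", 5), ("lang_c", 3),
]

print(per_language_means(toy_records))
print(language_gap(toy_records))
```

A gap near zero would suggest consistent behavior across prompt languages, while a large gap would flag a model whose resistance depends on the language it is asked in.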
