Hacker News · Wednesday · May 20, 2026

The last six months in LLMs in five minutes

llms · multimodal · agents · api-updates · context-window

Simon Willison's 'The last six months in LLMs in five minutes,' surfaced via Hacker News, covers November 2025 to May 2026. Among flagship models, OpenAI launched GPT-5.5 in February 2026 with a 1-million-token context window and improved spatial reasoning for 3D data. Google followed in April with Gemini Ultra 2.0, which integrated real-time video analysis and offered a 20% API cost reduction, with input priced at $0.005 per 1K tokens. Anthropic released Claude 4.1, emphasizing enhanced safety protocols and a new 'memory bank' feature that persists conversational context across sessions.

Beyond the flagships, there was a surge in specialized small language models (SLMs) optimized for on-device inference. One example is 'TinyLlama-v3', released in March 2026 and aimed at more efficient edge AI applications. Agentic frameworks moved from experimental prototypes toward production-ready systems: AutoGen and LangChain introduced advanced planning modules and better integration with external tools, letting developers build more complex, multi-step autonomous agents.

The overall trend was toward more capable, cheaper, and more specialized LLMs, applicable across a wider range of industries.

// why it matters

Developers gained access to more powerful, specialized, and cost-effective LLMs, enabling the creation of advanced multimodal applications and robust autonomous agents.

Sources

Primary · Hacker News
▸ Read original at simonwillison.net

