Why AWS scrapped OpenSearch’s architecture to chase agent workloads
AWS on Thursday announced a significant architectural overhaul of OpenSearch Serverless, moving from its original shared storage model to a design that decouples compute from storage. The rebuild targets agent workloads, particularly vector search and retrieval-augmented generation (RAG) pipelines, which need low-latency queries and high-throughput indexing.
The new architecture separates indexing compute and query compute from storage, so each can scale independently and AI-driven applications get better performance. AWS claims the change cuts query latency by up to 40% and doubles indexing throughput.
The update is available immediately in all AWS regions where OpenSearch Serverless is offered. AWS is not charging extra for the architectural change itself, though customers may see pricing change based on their compute and storage usage. The move positions OpenSearch to compete more directly with specialized vector databases such as Pinecone and Weaviate as enterprises increasingly adopt agent-based architectures.
Developers building AI agents get faster, more scalable vector search without managing infrastructure.