Day 7 - Dense Embedding - RAG
The article discusses dense embeddings: continuous numeric vectors (e.g., [0.3455566, 0.6777779]) that represent text chunks in a latent space. Sparse embeddings, by contrast, are mostly zeros, with the non-zero entries tied to specific words and how often they occur. Models for generating dense embeddings include dedicated embedding models (e.g., Nomic embed, BGE) and transformer encoders, available on Hugging Face and Ollama. Generating embeddings with a general-purpose LLM is costly by comparison.

The article also covers evaluating RAG system performance. For a given query, the system returns a set of documents, and that set is compared with the documents expected for the query. If the expected documents are a, b, c, d, e and the system returns all five, performance is good. If it returns only a, b, d, then c and e were missed and performance is lacking. The result depends on the quality of both the embeddings and the retrieval step.
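The a, b, c, d, e example can be turned into numbers. The sketch below is a minimal illustration, not the article's own method: it assumes evaluation is done with set-based recall and precision, and the function name `recall_and_precision` is hypothetical.

```python
def recall_and_precision(expected, returned):
    """Compare the returned documents against the expected ones.

    Assumption: documents are identified by simple labels and order is ignored.
    """
    expected_set, returned_set = set(expected), set(returned)
    hits = expected_set & returned_set
    # Recall: share of the expected documents that were actually returned.
    recall = len(hits) / len(expected_set) if expected_set else 0.0
    # Precision: share of the returned documents that were expected.
    precision = len(hits) / len(returned_set) if returned_set else 0.0
    return recall, precision


expected_docs = ["a", "b", "c", "d", "e"]

# Case 1: the system returns everything that was expected.
full_recall, full_precision = recall_and_precision(expected_docs, ["a", "b", "c", "d", "e"])

# Case 2: the system misses c and e.
partial_recall, partial_precision = recall_and_precision(expected_docs, ["a", "b", "d"])

print(f"full:    recall={full_recall:.2f} precision={full_precision:.2f}")
print(f"partial: recall={partial_recall:.2f} precision={partial_precision:.2f}")
```

In the second case recall is 3 out of 5, which is one way to quantify "performance is lacking" even though every returned document was a correct one.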
Choosing the right embedding model directly impacts RAG retrieval accuracy.
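To see why the vectors themselves matter, here is a rough sketch of retrieval over dense embeddings. It assumes chunks are ranked by cosine similarity to the query vector, which is a common choice but not something the article specifies. The three-dimensional vectors are made up for illustration and do not come from any real model, and `cosine_similarity` and `top_k` are hypothetical helper names.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def top_k(query_vec, chunk_vecs, k):
    """Return the names of the k chunks most similar to the query."""
    ranked = sorted(
        chunk_vecs.items(),
        key=lambda item: cosine_similarity(query_vec, item[1]),
        reverse=True,
    )
    return [name for name, _ in ranked[:k]]


# Made-up dense embeddings for three chunks (real models use hundreds of dimensions).
chunk_vectors = {
    "a": [0.9, 0.1, 0.0],
    "b": [0.8, 0.3, 0.1],
    "c": [0.0, 0.2, 0.9],
}
query_vector = [1.0, 0.2, 0.0]

top_two = top_k(query_vector, chunk_vectors, k=2)
print(top_two)
```

A different embedding model would place the same chunks at different points in the space, so the ranking, and therefore the returned set, can change even when the retrieval code stays the same.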