Embedding models convert text to dense vectors for semantic search, RAG retrieval, classification, and clustering. The choice of embedding model affects retrieval quality, cost, storage size, and indexing speed — and it's one of the least-discussed decisions in RAG pipeline architecture despite having significant downstream impact.
The MTEB (Massive Text Embedding Benchmark) is the standard evaluation suite for embedding models. The original benchmark defines 8 task types, and the English leaderboard averages scores over 56 datasets. As of mid-2026, the top performers on the English MTEB leaderboard are proprietary models: Voyage AI's voyage-3-large (67.8 average score), OpenAI's text-embedding-3-large (64.6), and Cohere's embed-v4.0 (64.3). Among open-source models, GTE-Qwen2-7B-instruct (67.4) and mxbai-embed-large-v1 (64.7) are competitive with the proprietary options at zero API cost.
OpenAI's text-embedding-3-large produces 3072-dimensional vectors at $0.13/million tokens. text-embedding-3-small produces 1536-dimensional vectors at $0.02/million tokens with roughly 80% of the quality. For most RAG applications, the small model is sufficient: the end-to-end quality gap narrows significantly once you've addressed chunking strategy and retrieval logic.
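To make the price difference concrete, here is a minimal sketch of a cost estimate. The per-token prices are the ones quoted above; the corpus size (1 million chunks of about 500 tokens each) and the function name `embedding_cost_usd` are assumptions chosen for illustration.

```python
# Prices in USD per million tokens, as quoted in the text above.
PRICE_PER_MILLION_TOKENS = {
    "text-embedding-3-large": 0.13,
    "text-embedding-3-small": 0.02,
}


def embedding_cost_usd(model: str, num_chunks: int, avg_tokens_per_chunk: int) -> float:
    """Estimate the one-time cost of embedding a corpus with the given model."""
    total_tokens = num_chunks * avg_tokens_per_chunk
    return total_tokens / 1_000_000 * PRICE_PER_MILLION_TOKENS[model]


# Hypothetical corpus: 1 million chunks of about 500 tokens each.
for model in PRICE_PER_MILLION_TOKENS:
    cost = embedding_cost_usd(model, num_chunks=1_000_000, avg_tokens_per_chunk=500)
    print(f"{model}: ${cost:,.2f}")
```

Under these assumptions the full corpus costs about $65 with the large model and about $10 with the small one. Re-embedding after a chunking change incurs the same cost again, which is one more reason to prototype on the cheaper model.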
Voyage AI offers voyage-3, a general-purpose model, and voyage-code-3, which is purpose-built for code retrieval. voyage-code-3 outperforms general embedding models on code search by a significant margin, which is relevant for tools like GitIntel that work with source code. voyage-3 costs $0.06/million tokens.
For open-source deployment, mxbai-embed-large-v1 (335M parameters, 1024 dimensions) runs on a CPU-only server and delivers performance comparable to text-embedding-3-small. Download it once from Hugging Face and run it locally indefinitely with zero API cost. This makes it suitable for privacy-sensitive workloads or high-volume applications where API costs become prohibitive.
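A rough sketch of what local deployment can look like follows. It assumes the third-party `sentence-transformers` package is installed and that the model is published on Hugging Face as `mixedbread-ai/mxbai-embed-large-v1`. The helper names `embed_locally` and `rank_documents` are hypothetical, and the ranking is a brute-force cosine comparison rather than a real vector index.

```python
import math
from typing import List, Sequence


def embed_locally(
    texts: List[str],
    model_name: str = "mixedbread-ai/mxbai-embed-large-v1",  # assumed model ID
) -> List[List[float]]:
    """Embed texts on the local CPU.

    The first call downloads the weights from Hugging Face; later calls
    reuse the local cache, so no API is involved after that.
    """
    # Imported lazily so the rest of this module works without the package.
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer(model_name, device="cpu")
    # normalize_embeddings=True returns unit-length vectors.
    return model.encode(texts, normalize_embeddings=True).tolist()


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_documents(
    query_vec: Sequence[float], doc_vecs: Sequence[Sequence[float]]
) -> List[int]:
    """Return document indices sorted from most to least similar to the query."""
    scores = [cosine_similarity(query_vec, d) for d in doc_vecs]
    return sorted(range(len(doc_vecs)), key=lambda i: scores[i], reverse=True)
```

A typical flow would be to call `embed_locally` once over the document chunks, store the vectors, then embed each incoming query and pass both to `rank_documents`. For more than a few thousand vectors, a proper vector index is likely a better fit than this linear scan.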
Vector dimension matters for storage. A 1536-dimensional float32 vector takes 1536 × 4 bytes, or about 6 KB per embedding, so 1 million embeddings (one per document or chunk) need about 6 GB of vector storage. 256-dimensional Matryoshka embeddings (supported by the text-embedding-3-* models) reduce this 6x with modest quality loss, which is useful when storage cost is a constraint.
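The storage arithmetic above can be checked with a short sketch. The function names are hypothetical, and the figures cover raw float32 vectors only, not index overhead or metadata. The second helper shows one common way to shorten a Matryoshka-style embedding by hand: keep the leading components and rescale to unit length. With the OpenAI API, the same effect is available by passing the `dimensions` parameter when requesting the embedding.

```python
import math
from typing import List

BYTES_PER_FLOAT32 = 4


def vector_storage_bytes(num_vectors: int, dims: int) -> int:
    """Raw float32 storage for the vectors alone, excluding index overhead."""
    return num_vectors * dims * BYTES_PER_FLOAT32


def truncate_and_renormalize(vec: List[float], dims: int) -> List[float]:
    """Keep the first `dims` components and rescale the result to unit length."""
    head = vec[:dims]
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        return head  # nothing to rescale
    return [x / norm for x in head]


full = vector_storage_bytes(1_000_000, 1536)
small = vector_storage_bytes(1_000_000, 256)
print(f"1536 dims: {full / 1e9:.2f} GB")   # 6.14 GB
print(f"256 dims:  {small / 1e9:.2f} GB")  # 1.02 GB
print(f"reduction: {full / small:.0f}x")   # 6x
```

Truncation only preserves quality for models trained with a Matryoshka-style objective; applying it to an arbitrary embedding model will usually degrade retrieval much more.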
Practical recommendation: start with text-embedding-3-small for prototyping (cheap, fast, widely supported). Move to voyage-3 or a local model if retrieval quality is insufficient. Use voyage-code-3 or mxbai-embed-large-v1 for code retrieval tasks.