Embedding
A fixed-length vector of floating-point numbers that represents the semantic meaning of a piece of text, an image, or another input. Embeddings are used for similarity search, clustering, and retrieval. Common embedding models output vectors with 768 to 3072 dimensions.
Background
Embeddings map input data to points in a high-dimensional vector space in which semantically similar items lie close together. The cosine similarity between two embeddings (the cosine of the angle between the vectors) approximates how related the underlying texts are. In coding tools, embeddings power semantic codebase search: a query such as "where is the rate limiter?" can retrieve the relevant files even if they don't contain those literal words. OpenAI's text-embedding-3, Cohere's embed-v3, and Voyage's voyage-3 are among the most commonly used embedding models in 2026.
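The retrieval step described above can be sketched in a few lines of Python. This is a minimal illustration, not how any particular tool implements it: the file names and the 4-dimensional vectors are hypothetical and hand-made, whereas real embeddings would come from an embedding model and have hundreds or thousands of dimensions.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # A zero vector has no direction; treat it as unrelated.
        return 0.0
    return dot / (norm_a * norm_b)


def rank_by_similarity(query_vec, corpus):
    """Sort (name, score) pairs from most to least similar to the query."""
    scored = [(name, cosine_similarity(query_vec, vec))
              for name, vec in corpus.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical embeddings, made up for illustration only.
corpus = {
    "middleware/throttle.py": [0.9, 0.1, 0.0, 0.2],
    "auth/login.py": [0.1, 0.8, 0.3, 0.0],
    "docs/changelog.md": [0.0, 0.1, 0.9, 0.1],
}

# Stand-in for the embedding of the query "where is the rate limiter?"
query = [0.8, 0.2, 0.1, 0.1]

ranking = rank_by_similarity(query, corpus)
for name, score in ranking:
    print(f"{score:.3f}  {name}")
```

With these made-up vectors, the throttling file ranks first even though its name shares no words with the query. Production systems usually replace the linear scan with an approximate nearest-neighbor index, but the scoring idea is the same.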
Tools that use it
- Sourcegraph Cody
Code-search and AI assistant by Sourcegraph, offering semantic search across enterprise codebases plus inline coding help.
- Continue
Open-source AI code assistant for VS Code and JetBrains, with full bring-your-own-key (BYOK) and local-model support.
- Cursor
AI code editor forked from VS Code, with a built-in agent, multi-file edits, and tab completion.