Embedding Cost Calculator

Calculate costs for generating text embeddings with OpenAI, Cohere, and other providers.


Complete Guide to Text Embeddings and Vector Databases

What are Text Embeddings?

Text embeddings are numerical representations of text that capture semantic meaning in a way that computers can process. Instead of treating words as discrete symbols, embedding models convert text into dense vectors (arrays of numbers) where similar meanings are positioned closer together in the vector space.

For example, the phrases "How do I reset my password?" and "I forgot my login credentials" would have similar embedding vectors because they express related concepts, even though they use different words. This semantic understanding makes embeddings fundamental to modern search, recommendation systems, and RAG (Retrieval-Augmented Generation) applications.

Key Embedding Concepts

  • Dimensions: The size of the vector (e.g., 1536 dimensions). More dimensions can capture more nuance but require more storage.
  • Semantic Similarity: Embeddings enable measuring how similar two pieces of text are using cosine similarity or dot product.
  • Context Length: Maximum tokens that can be embedded in a single request (varies by model).
  • Vector Database: Specialized databases (Pinecone, Weaviate, Qdrant) optimized for storing and querying embeddings.
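
The semantic-similarity idea above can be sketched with plain Python: cosine similarity compares the direction of two vectors, so related texts score near 1.0. The tiny 3-dimensional vectors here are toy stand-ins for real embeddings, which typically have hundreds or thousands of dimensions.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy "embeddings": v1 and v2 point in similar directions, v3 does not
v1 = [0.2, 0.8, 0.1]
v2 = [0.25, 0.75, 0.05]
v3 = [-0.9, 0.1, 0.4]

print(cosine_similarity(v1, v2))  # close to 1.0 for similar vectors
print(cosine_similarity(v1, v3))  # lower for dissimilar vectors
```

In practice you would compute this over model-generated embeddings; vector databases apply the same comparison (or dot product) at scale.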

Understanding Embedding Costs

Embedding costs are calculated from the number of tokens processed. Unlike LLM inference, which prices input and output tokens separately, embedding models charge a single rate per token. The formula is straightforward:

Cost = (Total Tokens / 1,000,000) × Price per Million

When planning embedding costs, consider both the initial embedding generation and any re-embedding needs when documents change. Many applications also need to embed user queries at search time, though query volumes are typically much lower than document volumes.
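
As a quick sketch, the formula above in code, using text-embedding-3-small's $0.02 per million tokens:

```python
def embedding_cost(total_tokens, price_per_million):
    """Cost = (Total Tokens / 1,000,000) x Price per Million."""
    return total_tokens / 1_000_000 * price_per_million

# 5,000,000 tokens with text-embedding-3-small at $0.02/M
print(f"${embedding_cost(5_000_000, 0.02):.2f}")  # $0.10
```

Swap in the per-million rate of whichever model you use; the calculation is identical across providers.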

Embedding Model Comparison

Model | Dimensions | Max Tokens | Price/M
text-embedding-3-small | 1,536 | 8,191 | $0.02
text-embedding-3-large | 3,072 | 8,191 | $0.13
text-embedding-ada-002 | 1,536 | 8,191 | $0.10
Cohere embed-v3 | 1,024 | 512 | $0.10
Voyage AI voyage-large-2 | 1,536 | 16,000 | $0.12

Common Use Cases and Cost Estimates

Knowledge Base (10K documents)

  • Documents: 10,000
  • Avg tokens/doc: 500
  • Total tokens: 5,000,000
  • Cost (3-small): ~$0.10

E-commerce Catalog (100K products)

  • Products: 100,000
  • Avg tokens/product: 200
  • Total tokens: 20,000,000
  • Cost (3-small): ~$0.40

Legal Documents (1K contracts)

  • Documents: 1,000
  • Avg tokens/doc: 5,000
  • Total tokens: 5,000,000
  • Cost (3-small): ~$0.10

Support Tickets (50K/month)

  • Tickets: 50,000
  • Avg tokens/ticket: 150
  • Total tokens: 7,500,000
  • Cost (3-small): ~$0.15/month
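
The four estimates above all follow the same pattern, which can be sketched as a small helper (scenario figures taken from the lists above, priced at text-embedding-3-small's $0.02/M):

```python
def scenario_cost(items, avg_tokens, price_per_million=0.02):
    """Return (total tokens, cost in USD) for a batch of items."""
    total_tokens = items * avg_tokens
    return total_tokens, total_tokens / 1_000_000 * price_per_million

scenarios = {
    "Knowledge base":     (10_000, 500),
    "E-commerce catalog": (100_000, 200),
    "Legal documents":    (1_000, 5_000),
    "Support tickets":    (50_000, 150),
}

for name, (items, avg) in scenarios.items():
    tokens, cost = scenario_cost(items, avg)
    print(f"{name}: {tokens:,} tokens -> ${cost:.2f}")
```

Adjust `avg_tokens` to your own content; average document length is usually the biggest source of estimation error.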

Storage Considerations

Beyond API costs, you need to consider vector storage. Each embedding dimension is typically stored as a 32-bit float (4 bytes). The storage formula is:

Storage (MB) = (Dimensions × 4 bytes × Vector Count) / 1,048,576

Dimensions | 1K Vectors | 100K Vectors | 1M Vectors
768 | 2.9 MB | 292 MB | 2.9 GB
1,536 | 5.9 MB | 585 MB | 5.9 GB
3,072 | 11.7 MB | 1.2 GB | 11.7 GB
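
The storage formula above translates directly to code (assuming float32, i.e. 4 bytes per value; many vector databases also support smaller quantized formats):

```python
def storage_mb(dimensions, vector_count, bytes_per_value=4):
    """Storage (MB) = Dimensions x bytes per value x Vector Count / 1,048,576."""
    return dimensions * bytes_per_value * vector_count / 1_048_576

# 1,536-dimensional float32 vectors, 1,000 of them
print(round(storage_mb(1536, 1_000), 2))  # 5.86 MB
```

Note this covers raw vectors only; index structures and metadata in a vector database add overhead on top.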

Best Practices for Cost Optimization

Optimization Strategies

  • Choose the Right Model: text-embedding-3-small is 5x cheaper than ada-002 with comparable quality for most use cases.
  • Batch Requests: Send multiple texts in a single API call to reduce overhead and potentially qualify for batch pricing.
  • Preprocess Text: Remove unnecessary whitespace, boilerplate, and formatting to reduce token count.
  • Cache Embeddings: Store embeddings to avoid re-computing for unchanged content.
  • Use Dimensionality Reduction: Some models support truncating dimensions for reduced storage with minimal quality loss.
  • Smart Chunking: Split documents intelligently at semantic boundaries rather than arbitrary character counts.
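
The caching and batching strategies above can be combined in one small sketch: hash each chunk's content, embed only the chunks whose hash is not yet cached, and send them in a single batched call. The `embed_fn` here is a hypothetical placeholder for your provider's batch embedding call, not a real API.

```python
import hashlib

def content_key(text: str) -> str:
    """Stable key for a chunk of text; a changed chunk gets a new hash."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

def embed_with_cache(texts, embed_fn, cache):
    """Embed texts, skipping any whose content hash is already cached."""
    missing = [t for t in texts if content_key(t) not in cache]
    if missing:
        # One batched request covers all uncached texts
        for text, vector in zip(missing, embed_fn(missing)):
            cache[content_key(text)] = vector
    return [cache[content_key(t)] for t in texts]

# Demo with a fake embedder that records how many texts each call received
calls = []
def fake_embed(batch):
    calls.append(len(batch))
    return [[float(len(t))] for t in batch]

cache = {}
embed_with_cache(["hello", "world"], fake_embed, cache)
embed_with_cache(["hello", "again"], fake_embed, cache)  # only "again" is re-embedded
print(calls)  # [2, 1]
```

In production the cache would live in a database or key-value store alongside the vectors, so unchanged documents never incur API cost twice.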

When to Use Different Models

Budget-Focused

Use text-embedding-3-small for general search, FAQs, and content where semantic matching is sufficient. Best for high-volume applications where cost is a primary concern.

Quality-Focused

Use text-embedding-3-large for complex semantic search, legal/medical content, or when subtle meaning differences matter. Worth the extra cost for precision-critical applications.

Code Search

Use specialized models like Voyage Code for code search and understanding. These models are trained on programming content and handle code syntax better than general-purpose models.

Important Considerations

  • Model Consistency: Always use the same embedding model for documents and queries. Embeddings from different models are not compatible.
  • Re-embedding Costs: Switching models or upgrading versions requires re-embedding all existing content.
  • Rate Limits: Check provider rate limits; high-volume batch jobs may require planning around API quotas.