Cosine Similarity Calculator
Calculate and visualize cosine similarity between vectors
Vector Inputs
Vector A (Dimensions: 3)
Vector B (Dimensions: 3)
Result
0.7071
Similar
Strong positive similarity
Angle between vectors: 45.0°
Calculation Steps
Dot Product (A · B):
0.7000
Magnitude of A (||A||):
1.0000
Magnitude of B (||B||):
0.9899
Cosine Similarity = (A · B) / (||A|| × ||B||):
0.7000 / (1.0000 × 0.9899) = 0.7071
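The steps above can be reproduced in a few lines of plain Python. The input vectors are not shown on this page, so the pair below is a hypothetical example chosen to match the displayed numbers (dot product 0.7000, ||A|| = 1.0000, ||B|| = 0.9899):

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity: (A · B) / (||A|| × ||B||)."""
    dot = sum(x * y for x, y in zip(a, b))
    mag_a = math.sqrt(sum(x * x for x in a))
    mag_b = math.sqrt(sum(x * x for x in b))
    return dot / (mag_a * mag_b)

# Hypothetical 3-d inputs that reproduce the numbers shown above.
a = [1.0, 0.0, 0.0]
b = [0.7, 0.7, 0.0]

sim = cosine_similarity(a, b)
angle = math.degrees(math.acos(sim))
print(f"similarity = {sim:.4f}")  # 0.7071
print(f"angle = {angle:.1f} deg")  # 45.0
```

The angle follows from the similarity via arccos: cos(45°) ≈ 0.7071.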
Related Tools
Embedding 3D Visualizer
Visualize high-dimensional embeddings in 2D/3D using PCA/t-SNE simulation
Perplexity Explainer
Calculate and understand perplexity from probability distributions
ROUGE Score Calculator
Calculate ROUGE-N and ROUGE-L scores for summarization tasks
Temperature Visualizer
Visualize how temperature and top-p sampling affect next-token probabilities
Tokenization Visualizer
See how text is broken down into tokens by different tokenizers (BPE, WordPiece)
Nucleus Sampling (Top-p) Demo
Interactive demo explaining how Nucleus Sampling filters token selection
What is Cosine Similarity?
Cosine similarity measures how similar two vectors are by calculating the cosine of the angle between them. It ranges from -1 (opposite) through 0 (orthogonal) to +1 (identical direction).
Unlike Euclidean distance, cosine similarity ignores magnitude and focuses only on direction. This makes it ideal for comparing text embeddings where vector length varies.
The Formula
cos(θ) = (A · B) / (||A|| × ||B||)
A · B = dot product (sum of element-wise products)
||A|| = magnitude (square root of sum of squares)
Use Cases in AI
🔍 Semantic Search
Find documents similar to a query by comparing embedding vectors.
📚 RAG Systems
Retrieve relevant context for LLM prompts based on similarity.
🎯 Recommendations
Find similar items based on feature vectors or user preferences.
📊 Clustering
Group similar documents or data points together.
Pro Tip: Threshold Selection
For semantic search, a similarity threshold of 0.7-0.8 often works well. Adjust based on your use case: higher for precision, lower for recall.
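A threshold like this is just a filter on ranked similarity scores. Here is a minimal sketch, using toy 3-dimensional vectors as stand-in embeddings (real embedding models produce hundreds or thousands of dimensions) and a hypothetical `search` helper:

```python
import math

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    mag_a = math.sqrt(sum(x * x for x in a))
    mag_b = math.sqrt(sum(x * x for x in b))
    return dot / (mag_a * mag_b)

def search(query_vec, corpus, threshold=0.75):
    """Rank corpus entries by similarity to the query, keeping only
    those at or above the threshold. corpus: list of (doc_id, vector)."""
    scored = [(doc_id, cosine_similarity(query_vec, vec)) for doc_id, vec in corpus]
    hits = [(doc_id, score) for doc_id, score in scored if score >= threshold]
    return sorted(hits, key=lambda pair: pair[1], reverse=True)

corpus = [
    ("doc_a", [0.9, 0.1, 0.0]),
    ("doc_b", [0.0, 1.0, 0.0]),  # orthogonal to the query: filtered out
    ("doc_c", [0.8, 0.2, 0.1]),
]
print(search([1.0, 0.0, 0.0], corpus, threshold=0.75))
```

Raising the threshold returns fewer, more precise matches; lowering it trades precision for recall, exactly as the tip describes.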
Frequently Asked Questions
Why use cosine similarity instead of Euclidean distance?
Cosine similarity is insensitive to vector magnitude, focusing only on direction. This is important for text embeddings where document length can vary but meaning should be compared.
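This magnitude insensitivity is easy to verify: scale a vector by any positive constant and its cosine similarity to the original stays exactly 1, while the Euclidean distance grows. A minimal demonstration:

```python
import math

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    mag_a = math.sqrt(sum(x * x for x in a))
    mag_b = math.sqrt(sum(x * x for x in b))
    return dot / (mag_a * mag_b)

def euclidean_distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

a = [1.0, 2.0, 3.0]
b = [10.0, 20.0, 30.0]  # same direction, 10x the magnitude

print(cosine_similarity(a, b))   # 1.0: identical direction
print(euclidean_distance(a, b))  # large: magnitudes differ
```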
What dimensions do embedding models use?
Common dimensions: OpenAI text-embedding-3-small uses 1536, large uses 3072. Smaller models might use 384 or 768 dimensions.
Related Tools
Embedding Visualizer
Visualize high-dimensional embeddings.
Vector Dimensions
Compare embedding model dimensions.
