Confusion Matrix
Build and analyze confusion matrices for classification
Related Tools
- Evaluation Harness Config: Generate configuration files for LM Evaluation Harness
- Human Eval Form: Create grading rubrics and forms for human evaluation of LLM outputs
- Latency Benchmark Recorder: Record and visualize latency metrics from your own API tests
- Model A/B Test Evaluator: Analyze results from model A/B tests for statistical significance
- BLEU & ROUGE Calculator: Calculate standard text generation metrics between reference and hypothesis
What is a Confusion Matrix?
A confusion matrix is a table that visualizes the performance of a classification model. It shows where the model gets confused—predicting one class when the actual class was different—hence the name.
This tool helps you build a confusion matrix interactively and automatically calculates key metrics like accuracy, precision, recall, and F1 score.
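If you want to sanity-check the tool's numbers in code, the sketch below builds the same 2x2 matrix from paired actual/predicted labels. It assumes you have scikit-learn installed; the label lists are made-up example data, not anything produced by this tool.

```python
# Minimal sketch: reproduce a binary confusion matrix in code.
# Assumes scikit-learn is available; y_true / y_pred are made-up examples.
from sklearn.metrics import confusion_matrix

y_true = [1, 0, 1, 1, 0, 0, 1, 0]  # actual classes (1 = positive)
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]  # model predictions

# With labels=[0, 1], rows are actual [neg, pos] and columns are predicted
# [neg, pos], so the returned layout is [[TN, FP], [FN, TP]].
(tn, fp), (fn, tp) = confusion_matrix(y_true, y_pred, labels=[0, 1])
print(f"TP={tp} TN={tn} FP={fp} FN={fn}")  # TP=3 TN=3 FP=1 FN=1
```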
Matrix Components
True Positive (TP)
Actually positive, predicted positive. Model correctly identified the positive class.
True Negative (TN)
Actually negative, predicted negative. Model correctly identified the negative class.
False Positive (FP)
Actually negative, predicted positive. Model raised a false alarm.
False Negative (FN)
Actually positive, predicted negative. Model missed a positive case.
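As a plain-Python illustration of the four cells (no libraries; the `tally` helper and its example inputs are made up for this sketch, assuming binary 0/1 labels), every prediction falls into exactly one cell:

```python
# Tally the four confusion-matrix cells by hand; assumes labels are 0/1 with 1 = positive.
def tally(y_true, y_pred):
    counts = {"TP": 0, "TN": 0, "FP": 0, "FN": 0}
    for actual, predicted in zip(y_true, y_pred):
        if actual == 1 and predicted == 1:
            counts["TP"] += 1   # correctly flagged positive
        elif actual == 0 and predicted == 0:
            counts["TN"] += 1   # correctly flagged negative
        elif actual == 0 and predicted == 1:
            counts["FP"] += 1   # false alarm
        else:
            counts["FN"] += 1   # missed positive
    return counts

print(tally([1, 0, 1, 1, 0, 0, 1, 0], [1, 0, 0, 1, 0, 1, 1, 0]))
# {'TP': 3, 'TN': 3, 'FP': 1, 'FN': 1}
```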
Metric Formulas
| Metric | Formula | Meaning |
|---|---|---|
| Accuracy | (TP+TN)/(TP+TN+FP+FN) | Overall correctness |
| Precision | TP/(TP+FP) | Positive prediction quality |
| Recall | TP/(TP+FN) | Positive detection rate |
| F1 | 2*(P*R)/(P+R) | Harmonic mean of precision (P) and recall (R) |
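The same formulas translated directly into code, as a minimal sketch that takes the four counts as inputs (the `metrics` helper is a hypothetical name; it returns 0.0 when a denominator would be zero):

```python
def metrics(tp, tn, fp, fn):
    # Direct translations of the formulas in the table above.
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}

print(metrics(tp=3, tn=3, fp=1, fn=1))
# {'accuracy': 0.75, 'precision': 0.75, 'recall': 0.75, 'f1': 0.75}
```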
FAQ
When is accuracy misleading?
With imbalanced classes (e.g., 95% negative), predicting all negative gives 95% accuracy but 0% recall for positives.
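A quick worked example of that failure mode, using hypothetical counts for an "always predict negative" model on a 95%-negative dataset:

```python
# 100 examples: 5 positive, 95 negative; the model predicts negative every time,
# so TP=0, FN=5, TN=95, FP=0.
tp, tn, fp, fn = 0, 95, 0, 5
accuracy = (tp + tn) / (tp + tn + fp + fn)  # 0.95 -- looks great
recall = tp / (tp + fn)                     # 0.00 -- every positive was missed
print(f"accuracy={accuracy:.2f}, recall={recall:.2f}")
```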
What is the precision vs. recall tradeoff?
High precision means fewer false alarms; high recall means fewer missed cases. F1 balances the two. Choose based on whether false positives or false negatives are more costly for your use case.
