Annotation Converter
Convert NER annotations between IOB, spaCy, Label Studio, and CoNLL formats
What is NER Annotation Conversion?
Named Entity Recognition (NER) is a fundamental NLP task that identifies and classifies named entities in text—such as people, organizations, locations, and dates. Different tools and frameworks use different annotation formats, making it challenging to share datasets or switch between platforms.
This annotation converter transforms NER training data between popular formats: IOB/BIO (the standard token-level format), spaCy's JSON format, Label Studio's output format, and CoNLL. Convert your annotations without manual reformatting.
Supported Annotation Formats
IOB/BIO Format
Token-per-line format with B- (beginning), I- (inside), and O (outside) tags. Standard for most NER research and training.
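As a rough illustration of the token-per-line layout, the sketch below reads IOB text into sentences of (token, tag) pairs. It assumes blank lines separate sentences and that the token is the first whitespace-separated column and the tag the last; the function name `parse_iob` and the sample text (including the organization "Acme") are hypothetical and not part of this tool.

```python
def parse_iob(text):
    """Parse token-per-line IOB text into a list of sentences.

    Each sentence is a list of (token, tag) pairs.
    Assumption: blank lines separate sentences.
    """
    sentences, current = [], []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            # Blank line: close the current sentence, if there is one.
            if current:
                sentences.append(current)
                current = []
            continue
        # Assumption: token is the first column, tag is the last column.
        parts = line.split()
        current.append((parts[0], parts[-1]))
    if current:
        sentences.append(current)
    return sentences


# Hypothetical sample data for illustration only.
sample = """John B-PER
Smith I-PER
works O
at O
Acme B-ORG

Paris B-LOC
"""

for sentence in parse_iob(sample):
    print(sentence)
```

Taking the last column as the tag means files with extra columns between token and tag still parse, although real files may need stricter handling.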
spaCy Format
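JSON with the full text and entity spans (start, end, label) given as character offsets. Commonly used to prepare spaCy NER training data; spaCy v3 expects it to be serialized to its binary .spacy format before training.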
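To make the span-based layout concrete, here is a minimal sketch that turns IOB2 tokens and tags into character spans. It assumes tokens are rejoined with a single space, because IOB files do not record the original whitespace, and it emits the `(text, {"entities": [...]})` tuple layout often seen in spaCy examples. The function name `iob_to_spans` is hypothetical, and this is not necessarily how this tool implements the conversion.

```python
def iob_to_spans(tokens, tags):
    """Convert parallel token/tag lists into (text, {"entities": [...]}).

    Assumption: tokens are joined with exactly one space.
    """
    text = " ".join(tokens)
    entities = []
    current = None  # [start, end, label] of the entity being built
    offset = 0
    for token, tag in zip(tokens, tags):
        start, end = offset, offset + len(token)
        prefix, _, label = tag.partition("-")
        if prefix == "I" and current is not None and current[2] == label:
            # Continuation of the open entity: extend its end offset.
            current[1] = end
        else:
            # Anything else closes the open entity, if there is one.
            if current is not None:
                entities.append(tuple(current))
                current = None
            # B- opens a new entity; a stray I- is treated leniently as B-.
            if prefix in ("B", "I"):
                current = [start, end, label]
        offset = end + 1  # skip the single joining space
    if current is not None:
        entities.append(tuple(current))
    return text, {"entities": entities}


tokens = ["John", "Smith", "works", "at", "the", "office"]
tags = ["B-PER", "I-PER", "O", "O", "O", "O"]
print(iob_to_spans(tokens, tags))
# → ('John Smith works at the office', {'entities': [(0, 10, 'PER')]})
```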
Label Studio Format
JSON with value objects containing start, end, text, and labels. This is the export format of the Label Studio annotation tool.
IOB Tag Meanings
| Tag | Meaning | Example |
|---|---|---|
| B-PER | Beginning of person name | John |
| I-PER | Inside (continuation) of person | Smith (after John) |
| B-ORG | Beginning of organization | |
| O | Outside (not an entity) | works, at, the |
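The tag meanings in the table imply a simple consistency rule for IOB2: an I- tag should only follow a B- or I- tag of the same entity type. The sketch below checks a tag sequence against that rule; `find_invalid_tags` is a hypothetical helper name, not a function of this tool.

```python
def find_invalid_tags(tags):
    """Return indices of I- tags that do not continue an entity of the same type."""
    invalid = []
    previous = "O"
    for i, tag in enumerate(tags):
        if tag.startswith("I-"):
            label = tag[2:]
            # Valid only if the previous tag opened or continued the same type.
            if previous not in (f"B-{label}", f"I-{label}"):
                invalid.append(i)
        previous = tag
    return invalid


print(find_invalid_tags(["B-PER", "I-PER", "O", "O"]))      # → []
print(find_invalid_tags(["B-PER", "I-PER", "O", "I-ORG"]))  # → [3]
```

Running a check like this before conversion can surface annotation errors that would otherwise produce wrong or missing spans.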
Common Entity Types
- PER — Person names
- ORG — Organizations
- LOC — Locations
- DATE — Dates and times
- MONEY — Monetary values
- PRODUCT — Product names
Frequently Asked Questions
What's the difference between IOB and IOB2?
IOB2 (used here) always uses B- for the first token of an entity. The original IOB scheme (sometimes called IOB1) uses B- only when an entity immediately follows another entity of the same type; in all other cases an entity starts with I-.
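The difference can be sketched as a small conversion from original IOB to IOB2: any I- tag that does not continue an entity of the same type is rewritten to B-. The name `iob1_to_iob2` is hypothetical, and the snippet is an illustration under that rule rather than this tool's actual code.

```python
def iob1_to_iob2(tags):
    """Rewrite original-IOB tags so every entity starts with B-."""
    converted = []
    previous = "O"
    for tag in tags:
        if tag.startswith("I-"):
            label = tag[2:]
            # An I- tag opens a new entity unless the previous tag
            # belongs to an entity of the same type.
            if previous == "O" or previous[2:] != label:
                tag = "B-" + label
        converted.append(tag)
        previous = tag
    return converted


print(iob1_to_iob2(["I-PER", "I-PER", "O", "I-LOC", "B-LOC", "I-ORG"]))
# → ['B-PER', 'I-PER', 'O', 'B-LOC', 'B-LOC', 'B-ORG']
```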
Why use spaCy format over IOB?
spaCy's span-based format preserves the original text and whitespace exactly. IOB files store only tokens and tags, so details such as the spacing between tokens and how punctuation attaches to words can be lost.
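A small sketch shows where that information goes missing. It tags whitespace-delimited tokens from character spans, assuming each span starts and ends on a token boundary; `spans_to_iob` is a hypothetical name, and a real converter would likely use a proper tokenizer instead of splitting on whitespace.

```python
import re


def spans_to_iob(text, entities):
    """Tag whitespace-delimited tokens using (start, end, label) character spans.

    Assumption: every span starts and ends on a token boundary.
    """
    tagged = []
    for match in re.finditer(r"\S+", text):
        tag = "O"
        for start, end, label in entities:
            # A token belongs to an entity if it lies fully inside the span.
            if match.start() >= start and match.end() <= end:
                tag = ("B-" if match.start() == start else "I-") + label
                break
        tagged.append((match.group(), tag))
    return tagged


# Two texts that differ only in spacing produce identical IOB output,
# so the original spacing cannot be recovered from the tags alone.
single = spans_to_iob("John Smith works", [(0, 10, "PER")])
double = spans_to_iob("John  Smith works", [(0, 11, "PER")])
print(single)
print(single == double)  # → True
```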
Can I convert Label Studio exports directly?
Yes. Paste the annotations array from Label Studio's export and convert it to your target format. This tool handles the standard NER task output format.
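For readers who want to script the same step, the sketch below flattens a Label Studio annotations array into character spans. The start, end, text, and labels fields inside each value object come from the format description above; the enclosing "result" list is an assumption about the export layout that you should check against your own file. The function name `label_studio_to_spans` and the sample data are hypothetical.

```python
def label_studio_to_spans(text, annotations):
    """Flatten Label Studio annotations into (text, {"entities": [...]}).

    Assumption: each annotation holds a "result" list whose items carry a
    "value" object with start, end, text, and labels.
    """
    entities = []
    for annotation in annotations:
        for item in annotation.get("result", []):
            value = item.get("value", {})
            # Skip results that are not labeled text spans.
            if "start" not in value or "end" not in value or not value.get("labels"):
                continue
            for label in value["labels"]:
                entities.append((value["start"], value["end"], label))
    entities.sort()
    return text, {"entities": entities}


# Hypothetical sample data for illustration only.
text = "John Smith works at Acme"
annotations = [
    {
        "result": [
            {"value": {"start": 20, "end": 24, "text": "Acme", "labels": ["ORG"]}},
            {"value": {"start": 0, "end": 10, "text": "John Smith", "labels": ["PER"]}},
        ]
    }
]
print(label_studio_to_spans(text, annotations))
# → ('John Smith works at Acme', {'entities': [(0, 10, 'PER'), (20, 24, 'ORG')]})
```

Sorting the spans by start offset gives a stable order regardless of the order in which entities were annotated.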
