MediaWiki to Protocol Buffers Converter

Transform MediaWiki tables into Protocol Buffers format with automatic type detection

About MediaWiki to Protocol Buffers Converter

Convert MediaWiki table markup into a Protocol Buffers (protobuf) schema with sample data. Useful for data serialization, API development, and communication between microservices.

Key Features

  • Automatic Type Detection: Infers field types (string, bool, int32, int64, double)
  • Schema Generation: Creates proto3 syntax schema definitions
  • Sample Data: Includes textproto format sample data (first 100 records)
  • Field Sanitization: Converts MediaWiki headers to valid protobuf field names
  • MediaWiki Parser: Accurately parses MediaWiki table syntax
  • File Upload: Upload .wiki files directly
  • Copy & Download: Easy export as .proto file

How to Use

  1. Input MediaWiki Table: Paste your MediaWiki table markup or upload a .wiki file
  2. Review Output: The Protocol Buffers schema and data generate automatically
  3. Copy or Download: Use the Copy or Download button to save your .proto file

Type Detection

The converter automatically detects field types based on data values:

  • bool: Values matching "true" or "false" (case-insensitive)
  • int32: Integer values within -2,147,483,648 to 2,147,483,647
  • int64: Integer values outside int32 range
  • double: Floating-point numbers with decimal points or scientific notation
  • string: All other values (default type)
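The rules above can be sketched in Python as follows. This is a minimal illustration, not the converter's actual code: the helper names (infer_value_type, infer_column_type), the regular expressions, and the way mixed columns are promoted (int32 to int64 to double, anything else to string) are assumptions. Falling back to string for integers beyond the int64 range is also an assumption.

```python
import re

INT32_MIN, INT32_MAX = -2**31, 2**31 - 1
INT64_MIN, INT64_MAX = -2**63, 2**63 - 1

# Assumed patterns: optional sign, digits, optional fraction and exponent.
INT_RE = re.compile(r"[+-]?\d+")
FLOAT_RE = re.compile(r"[+-]?(\d+\.\d*|\.\d+|\d+)([eE][+-]?\d+)?")


def infer_value_type(value: str) -> str:
    """Classify one cell value (hypothetical helper)."""
    text = value.strip()
    if text.lower() in ("true", "false"):
        return "bool"
    if INT_RE.fullmatch(text):
        number = int(text)
        if INT32_MIN <= number <= INT32_MAX:
            return "int32"
        if INT64_MIN <= number <= INT64_MAX:
            return "int64"
        return "string"  # assumption: too large for int64, keep as text
    if FLOAT_RE.fullmatch(text):
        return "double"
    return "string"


def infer_column_type(values) -> str:
    """Pick one type for a column from its non-empty values (assumed rule)."""
    kinds = {infer_value_type(v) for v in values if v.strip()}
    if not kinds:
        return "string"  # empty column: default type
    if kinds == {"bool"}:
        return "bool"
    if kinds <= {"int32"}:
        return "int32"
    if kinds <= {"int32", "int64"}:
        return "int64"
    if kinds <= {"int32", "int64", "double"}:
        return "double"
    return "string"
```

With the example table on this page, infer_column_type(["28", "34"]) yields int32 for the Age column, while a column such as ["John Doe", "Jane Smith"] falls back to string.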

Example Conversion

MediaWiki Input:

{| class="wikitable" border="1"
! Name !! Age !! City !! Department
|-
| John Doe || 28 || New York || Engineering
|-
| Jane Smith || 34 || London || Marketing
|}

Protocol Buffers Output:

syntax = "proto3";

package mediawiki_converter;

message MediaWikiRecord {
  string name = 1; // Name
  int32 age = 2; // Age
  string city = 3; // City
  string department = 4; // Department
}

message MediaWikiData {
  repeated MediaWikiRecord records = 1;
}

// Sample data in textproto format
MediaWikiData {
  records {
    name: "John Doe"
    age: 28
    city: "New York"
    department: "Engineering"
  }
  records {
    name: "Jane Smith"
    age: 34
    city: "London"
    department: "Marketing"
  }
}

Common Use Cases

  • API Development: Define data schemas for gRPC services
  • Microservices: Create message formats for service communication
  • Data Serialization: Efficient binary data encoding
  • Cross-Platform: Language-agnostic data structures
  • Documentation: Generate protobuf schemas from wiki tables
  • Data Migration: Convert wiki data to protobuf format

Protocol Buffers Benefits

  • Compact: Binary encoding is typically smaller than equivalent JSON or XML
  • Fast: Efficient serialization and deserialization
  • Typed: Declared field types catch type mismatches early
  • Versioned: Fields are identified by number, so schemas can evolve while staying forward and backward compatible
  • Multi-Language: Code generation for 20+ languages
  • Validated: Schema-based validation

Field Name Sanitization

MediaWiki headers are automatically converted to valid protobuf field names:

  • Converts to lowercase with underscores
  • Removes special characters and accents
  • Ensures names start with letters
  • Handles duplicate names with numeric suffixes
  • Generates default names for empty headers
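A rough Python sketch of these rules is shown below. The function names, the field_ prefix for names that start with a digit, and the _2, _3 suffix format for duplicates are assumptions made for illustration. Only the column_N default comes from this page, and the converter's exact output may differ.

```python
import re
import unicodedata


def sanitize_field_name(header: str, index: int) -> str:
    """Turn one header into a protobuf-style field name (hypothetical helper)."""
    # Strip accents: decompose characters, then drop the combining marks.
    decomposed = unicodedata.normalize("NFKD", header)
    plain = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Lowercase, then collapse every run of other characters into one underscore.
    name = re.sub(r"[^a-z0-9]+", "_", plain.lower()).strip("_")
    if not name:
        return f"column_{index}"  # default name for an empty header
    if not name[0].isalpha():
        name = f"field_{name}"  # assumed prefix so the name starts with a letter
    return name


def sanitize_headers(headers):
    """Sanitize all headers and resolve duplicates with numeric suffixes."""
    used = set()
    result = []
    for index, header in enumerate(headers, start=1):
        base = sanitize_field_name(header, index)
        name, suffix = base, 1
        while name in used:  # keep counting until the name is free
            suffix += 1
            name = f"{base}_{suffix}"
        used.add(name)
        result.append(name)
    return result
```

For example, sanitize_headers(["Name", "Café Visits", "2024 Total", "", "Name"]) returns ["name", "cafe_visits", "field_2024_total", "column_4", "name_2"] under these assumptions.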

MediaWiki Table Syntax

Supports standard MediaWiki table markup:

  • {|: Table opening with optional attributes
  • ! Header: Exclamation mark for header cells (separated by !!)
  • |-: Row separator
  • | Data: Pipe for data cells (separated by ||)
  • |}: Table closing
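To show how these five markers fit together, here is a minimal line-based parser sketch in Python. It is an assumption-laden illustration rather than the converter's parser: the function name parse_mediawiki_table is hypothetical, and the sketch ignores captions, per-cell attributes, nested tables, and cells that span several lines.

```python
def parse_mediawiki_table(markup: str):
    """Return (headers, rows) from simple MediaWiki table markup."""
    headers = []
    rows = []
    current = None  # cells of the row being collected
    for raw in markup.splitlines():
        line = raw.strip()
        if not line or line.startswith("{|"):
            continue  # blank line or table opening (attributes ignored)
        if line.startswith("|}"):
            break  # table closing
        if line.startswith("|-"):
            if current:
                rows.append(current)  # finish the previous row
            current = []
            continue
        if line.startswith("!"):
            # Header cells on one line are separated by "!!".
            headers.extend(cell.strip() for cell in line[1:].split("!!"))
        elif line.startswith("|"):
            if current is None:
                current = []
            # Data cells on one line are separated by "||".
            current.extend(cell.strip() for cell in line[1:].split("||"))
    if current:
        rows.append(current)
    return headers, rows
```

The checks for |} and |- come before the check for a plain | because all three start with a pipe character.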

Output Format

The converter generates three sections:

  • Schema Definition: proto3 syntax with message types
  • Statistics Comment: Number of rows processed
  • Sample Data: First 100 records in textproto format
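The schema and sample-data sections could be assembled roughly as in the Python sketch below. The function names and the (name, type, header) tuple layout are assumptions for illustration. The message name, package name, trailing header comments, and 100-record limit follow what this page describes, and the statistics comment is left out because its exact wording is not shown here.

```python
def render_schema(fields, message="MediaWikiRecord", package="mediawiki_converter"):
    """Render a proto3 message from (name, proto_type, original_header) tuples."""
    lines = ['syntax = "proto3";', "", f"package {package};", "", f"message {message} {{"]
    for number, (name, proto_type, header) in enumerate(fields, start=1):
        # Field numbers start at 1; the original header is kept as a comment.
        lines.append(f"  {proto_type} {name} = {number}; // {header}")
    lines.append("}")
    return "\n".join(lines)


def render_textproto_record(fields, row):
    """Render one table row as a textproto 'records' entry."""
    lines = ["records {"]
    for (name, proto_type, _), value in zip(fields, row):
        if value == "":
            continue  # assumption: empty cells are simply omitted
        if proto_type == "string":
            escaped = value.replace("\\", "\\\\").replace('"', '\\"')
            literal = f'"{escaped}"'
        elif proto_type == "bool":
            literal = value.lower()  # textproto booleans are lowercase
        else:
            literal = value
        lines.append(f"  {name}: {literal}")
    lines.append("}")
    return "\n".join(lines)


def render_sample(fields, rows, limit=100):
    """Render at most `limit` rows as sample data."""
    return "\n".join(render_textproto_record(fields, row) for row in rows[:limit])
```

Calling render_schema([("name", "string", "Name"), ("age", "int32", "Age")]) produces a message with string name = 1; and int32 age = 2;, in the same layout as the example output on this page.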

FAQ

What types of MediaWiki tables are supported?

This tool supports standard MediaWiki table markup that starts with {| and ends with |}, using ! for header cells, |- for row separators, and | or || for data cells.

How are field types chosen for each column?

The converter inspects the non-empty values in each column and classifies them as booleans, 32-bit integers, 64-bit integers, doubles, or strings. If every value in a column matches a boolean or numeric pattern, the corresponding protobuf type is used; otherwise the column falls back to string.

What happens if I have more columns than headers?

Columns without a header get generic names such as column_1, column_2, and so on. Like all field names, these are sanitized into valid protobuf identifiers, and the original header text, when there is one, is kept as a comment.

Is the generated schema ready to use with protoc?

Yes. The schema section at the top of the output is a valid .proto file that declares syntax = "proto3"; and the mediawiki_converter package. Save that section as a .proto file and compile it with protoc, the standard Protocol Buffers compiler.

Is any of my table data uploaded or stored?

No. All parsing and generation happen locally in your browser. Your MediaWiki table content and the resulting schema are never sent to any server, ensuring your data remains private.

Privacy & Security

All conversions run locally in your browser. Your MediaWiki data is never uploaded to a server, so it stays on your device.