MediaWiki to Pandas DataFrame Converter

Transform MediaWiki tables into Python Pandas DataFrame code for data analysis

About MediaWiki to Pandas DataFrame Converter

Convert MediaWiki table markup to Python Pandas DataFrame code with multiple format options. Perfect for data analysis, machine learning, and scientific computing with Python.

Key Features

  • Multiple Formats: Dictionary of lists, list of lists, or list of dictionaries
  • Type Detection: Automatically detects integers, floats, booleans, and strings
  • Column Sanitization: Converts headers to valid Python identifiers
  • Custom Variable Names: Choose your DataFrame variable name
  • Import Statement: Optional pandas import statement
  • None Handling: Converts empty cells to None
  • Copy & Download: Easy integration into Python scripts

How to Use

  1. Input MediaWiki Table: Paste your MediaWiki table markup or upload a .wiki file
  2. Choose Format: Select your preferred DataFrame creation format
  3. Configure Options: Set variable name and import preferences
  4. Copy or Download: Use the Python code in your data analysis scripts

DataFrame Formats

  • Dictionary of Lists: Column-oriented format; each dictionary key is a column name mapped to a list of that column's values
  • List of Lists: Row-oriented format; the header row comes first, followed by one list per data row
  • List of Dictionaries: Records format; each row is a dictionary keyed by column name
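
As a rough illustration of how the three formats differ, the sketch below builds the same two-column DataFrame from each one. The variable names are made up for this example, and the converter's exact output for each format may be laid out differently.

```python
import pandas as pd

# Dictionary of lists: keys are column names, values are column data.
dict_of_lists = {
    'Name': ['John Doe', 'Jane Smith'],
    'Age': [28, 34],
}
df_a = pd.DataFrame(dict_of_lists)

# List of lists: the first row holds the headers, the rest hold data,
# so the header row is passed separately as `columns`.
list_of_lists = [
    ['Name', 'Age'],
    ['John Doe', 28],
    ['Jane Smith', 34],
]
df_b = pd.DataFrame(list_of_lists[1:], columns=list_of_lists[0])

# List of dictionaries: one dictionary per row (records format).
list_of_dicts = [
    {'Name': 'John Doe', 'Age': 28},
    {'Name': 'Jane Smith', 'Age': 34},
]
df_c = pd.DataFrame(list_of_dicts)
```

All three calls produce a DataFrame with the same columns and values, so the choice is mostly about which layout is easier to read and edit by hand.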

Example Conversion

MediaWiki Input:

{| class="wikitable" border="1"
! Name !! Age !! City !! Salary
|-
| John Doe || 28 || New York || 75000
|-
| Jane Smith || 34 || London || 82000
|}

Python Output (Dictionary of Lists):

import pandas as pd

df = pd.DataFrame({
    'Name': ['John Doe', 'Jane Smith'],
    'Age': [28, 34],
    'City': ['New York', 'London'],
    'Salary': [75000, 82000]
})
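
The converter's internal parsing is not described on this page. The following is only a minimal sketch of the idea, assuming a simple table with one header line, `!!` between header cells, `||` between data cells, and `|-` between rows. The `parse_wikitable` helper is hypothetical, and it ignores cell attributes, multi-line cells, and nested tables.

```python
def parse_wikitable(markup: str) -> dict:
    """Parse a simple MediaWiki table into a dictionary of lists (all strings)."""
    headers = []
    rows = []
    current = None  # the row currently being filled, if any
    for raw in markup.splitlines():
        line = raw.strip()
        if not line or line.startswith('{|') or line.startswith('|+'):
            continue  # skip blanks, the table opener, and captions
        if line.startswith('|}'):
            break  # end of table
        if line.startswith('|-'):
            current = None  # the next cell line starts a new row
            continue
        if line.startswith('!'):
            headers.extend(h.strip() for h in line[1:].split('!!'))
        elif line.startswith('|'):
            if current is None:
                current = []
                rows.append(current)
            current.extend(c.strip() for c in line[1:].split('||'))
    # Pivot rows into columns; short rows are padded with None.
    return {
        h: [row[i] if i < len(row) else None for row in rows]
        for i, h in enumerate(headers)
    }


example = """{| class="wikitable" border="1"
! Name !! Age !! City !! Salary
|-
| John Doe || 28 || New York || 75000
|-
| Jane Smith || 34 || London || 82000
|}"""

table = parse_wikitable(example)
```

In this sketch every cell is still a string (for example `'28'`); converting values to integers, floats, or booleans is a separate step.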

Common Use Cases

  • Data Analysis: Import Wikipedia tables for analysis in Jupyter notebooks
  • Machine Learning: Prepare wiki data for ML models
  • Scientific Computing: Use wiki tables in scientific Python workflows
  • Data Visualization: Create plots from wiki table data
  • Statistical Analysis: Perform statistical operations on wiki data
  • ETL Pipelines: Extract wiki data for data pipelines

Type Detection

Automatically detects and converts data types:

  • Integers: Whole numbers (e.g., 42, -10)
  • Floats: Decimal numbers (e.g., 3.14, -0.5)
  • Booleans: True/False values
  • Strings: Text values with proper escaping
  • None: Empty cells or null/none values
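
The rules above could be approximated as follows. This is a sketch under assumptions, not the converter's actual logic: the `detect_type` function is hypothetical, matching of `true`/`false` and `null`/`none` is assumed to be case-insensitive, and anything that does not look like a plain integer or decimal number is kept as a string.

```python
import re

INT_RE = re.compile(r'[+-]?\d+')
FLOAT_RE = re.compile(r'[+-]?(?:\d+\.\d*|\.\d+)')


def detect_type(cell: str):
    """Convert a raw cell string to None, bool, int, float, or str."""
    text = cell.strip()
    lowered = text.lower()
    if text == '' or lowered in ('null', 'none'):
        return None
    if lowered in ('true', 'false'):
        return lowered == 'true'
    if INT_RE.fullmatch(text):
        return int(text)
    if FLOAT_RE.fullmatch(text):
        return float(text)
    return text  # fall back to the trimmed string


values = [detect_type(c) for c in ['28', '3.14', 'true', '', 'London']]
# values == [28, 3.14, True, None, 'London']
```

Note that when such values are loaded into a DataFrame, pandas stores a None in an otherwise numeric column as NaN, so an integer column with empty cells ends up with a float dtype.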

Column Name Sanitization

Converts MediaWiki headers to valid Python identifiers:

  • Replaces special characters with underscores
  • Ensures names don't start with numbers
  • Collapses consecutive underscores into a single underscore
  • Provides fallback names if needed
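
A sanitizer following these four rules might look like the sketch below. The `sanitize_column_name` function, the `column_N` fallback pattern, and the trimming of leading and trailing underscores are assumptions made for illustration; the converter may choose different names.

```python
import re


def sanitize_column_name(header: str, index: int = 0) -> str:
    """Turn a table header into a valid Python identifier."""
    # Replace every character outside [0-9A-Za-z_] with an underscore.
    name = re.sub(r'[^0-9A-Za-z_]', '_', header.strip())
    # Collapse runs of underscores and trim them from both ends.
    name = re.sub(r'_+', '_', name).strip('_')
    if not name:
        return f'column_{index}'  # fallback when nothing usable remains
    if name[0].isdigit():
        name = '_' + name  # identifiers cannot start with a digit
    return name


print(sanitize_column_name('Unit Price ($)'))  # Unit_Price
print(sanitize_column_name('2024 Sales'))      # _2024_Sales
print(sanitize_column_name('!!!', index=3))    # column_3
```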

Pandas DataFrame Operations

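Once the generated code has run, df is a regular DataFrame and supports the usual Pandas operations. The examples below use the df from the example conversion above: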

# Display DataFrame
print(df)

# Basic statistics
print(df.describe())

# Filter rows
filtered = df[df['Age'] > 30]

# Sort by column
sorted_df = df.sort_values('Salary', ascending=False)

# Group by and aggregate
grouped = df.groupby('City')['Salary'].mean()

Privacy & Security

All conversions happen locally in your browser. Your MediaWiki data is never uploaded to any server.