MediaWiki to Pandas DataFrame Converter

Transform MediaWiki tables into Python Pandas DataFrame code for data analysis

About MediaWiki to Pandas DataFrame Converter

Convert MediaWiki table markup to Python Pandas DataFrame code with multiple format options. Perfect for data analysis, machine learning, and scientific computing with Python.

Key Features

  • Multiple Formats: Dictionary of lists, list of lists, or list of dictionaries
  • Type Detection: Automatically detects integers, floats, booleans, and strings
  • Column Sanitization: Converts headers to valid Python identifiers
  • Custom Variable Names: Choose your DataFrame variable name
  • Import Statement: Optional pandas import statement
  • None Handling: Converts empty cells and null/none values to None
  • Copy & Download: Easy integration into Python scripts

How to Use

  1. Input MediaWiki Table: Paste your MediaWiki table markup or upload a .wiki file
  2. Choose Format: Select your preferred DataFrame creation format
  3. Configure Options: Set variable name and import preferences
  4. Copy or Download: Use the Python code in your data analysis scripts

DataFrame Formats

  • Dictionary of Lists: Column-oriented and usually the most compact format; dictionary keys are column names and each value is the list of that column's cells
  • List of Lists: Traditional format with headers as first row
  • List of Dictionaries: Records format, each row as a dictionary

Example Conversion

MediaWiki Input:

{| class="wikitable" border="1"
! Name !! Age !! City !! Salary
|-
| John Doe || 28 || New York || 75000
|-
| Jane Smith || 34 || London || 82000
|}

Python Output (Dictionary of Lists):

import pandas as pd

df = pd.DataFrame({
    'Name': ['John Doe', 'Jane Smith'],
    'Age': [28, 34],
    'City': ['New York', 'London'],
    'Salary': [75000, 82000]
})
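
For comparison, here is a rough sketch of how the same table could be written in the other two formats. The exact code the converter emits may differ; the variable names table, records, df_rows, and df_records are illustrative only.

```python
import pandas as pd

# List of lists: header row first, data rows after it.
table = [
    ['Name', 'Age', 'City', 'Salary'],
    ['John Doe', 28, 'New York', 75000],
    ['Jane Smith', 34, 'London', 82000],
]
df_rows = pd.DataFrame(table[1:], columns=table[0])

# List of dictionaries: one dictionary per row (records format).
records = [
    {'Name': 'John Doe', 'Age': 28, 'City': 'New York', 'Salary': 75000},
    {'Name': 'Jane Smith', 'Age': 34, 'City': 'London', 'Salary': 82000},
]
df_records = pd.DataFrame(records)

print(df_rows)
print(df_records)
```

Both constructions should yield a DataFrame with the same columns and values as the dictionary-of-lists version above.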

Common Use Cases

  • Data Analysis: Import Wikipedia tables for analysis in Jupyter notebooks
  • Machine Learning: Prepare wiki data for ML models
  • Scientific Computing: Use wiki tables in scientific Python workflows
  • Data Visualization: Create plots from wiki table data
  • Statistical Analysis: Perform statistical operations on wiki data
  • ETL Pipelines: Extract wiki data for data pipelines

Type Detection

Automatically detects and converts data types:

  • Integers: Whole numbers (e.g., 42, -10)
  • Floats: Decimal numbers (e.g., 3.14, -0.5)
  • Booleans: True/False values
  • Strings: Text values with proper escaping
  • None: Empty cells or null/none values
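
The rules above can be approximated in a few lines of Python. This is a minimal sketch, not the converter's actual implementation: the helper name infer_cell, the regular expressions, and the case-insensitive matching of true/false/null/none are all assumptions made for illustration.

```python
import re

# Assumed patterns: optional sign, then digits (integer) or digits with a dot (float).
_INT_RE = re.compile(r'[+-]?\d+')
_FLOAT_RE = re.compile(r'[+-]?(\d+\.\d*|\.\d+)')

def infer_cell(text):
    """Convert one raw cell string to a Python value (hypothetical helper)."""
    value = text.strip()
    lowered = value.lower()
    if lowered in ('', 'null', 'none'):
        return None                  # empty or null-like cell
    if lowered in ('true', 'false'):
        return lowered == 'true'     # boolean
    if _INT_RE.fullmatch(value):
        return int(value)            # whole number
    if _FLOAT_RE.fullmatch(value):
        return float(value)          # decimal number
    return value                     # anything else stays a string

cells = ['42', '-10', '3.14', '-0.5', 'True', '', 'null', 'New York']
print([infer_cell(c) for c in cells])
```

Checking for null-like and boolean values before numbers keeps the order of the tests unambiguous, and anything that matches no rule falls through as a string.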

Column Name Sanitization

Converts MediaWiki headers to valid Python identifiers:

  • Replaces special characters with underscores
  • Ensures names don't start with numbers
  • Removes duplicate underscores
  • Provides fallback names if needed
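
As an illustration of these steps, a sanitizer along the following lines would behave as described. It is a sketch under assumptions: the function name sanitize_column, the column_N fallback pattern, the underscore prefix for names starting with a digit, and the trimming of leading and trailing underscores are not taken from the tool itself.

```python
import re

def sanitize_column(header, index=0):
    """Turn a table header into a valid Python identifier (hypothetical helper)."""
    # Replace every character that is not an ASCII letter, digit, or underscore.
    name = re.sub(r'[^0-9a-zA-Z_]', '_', header.strip())
    # Collapse runs of underscores and trim them from both ends (assumed behavior).
    name = re.sub(r'_+', '_', name).strip('_')
    if not name:
        # Nothing usable left: fall back to a positional name.
        return f'column_{index + 1}'
    if name[0].isdigit():
        # Identifiers cannot start with a digit.
        name = '_' + name
    return name

for i, header in enumerate(['Unit Price ($)', '2024 Sales', '%%%']):
    print(repr(header), '->', sanitize_column(header, i))
```

This sketch does not guard against Python keywords such as class, or against two headers that sanitize to the same name; a real implementation would likely need both checks.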

Pandas DataFrame Operations

Once converted, you can use all Pandas operations:

# Display DataFrame
print(df)

# Basic statistics
print(df.describe())

# Filter rows
filtered = df[df['Age'] > 30]

# Sort by column
sorted_df = df.sort_values('Salary', ascending=False)

# Group by and aggregate
grouped = df.groupby('City')['Salary'].mean()

FAQ

What types of MediaWiki tables are supported?

This tool supports standard MediaWiki table markup: a table opens with {| and closes with |}, header cells start with ! (separated by !! when on one line), rows are separated by |-, and data cells start with | (separated by || when on one line).
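
To make the markup rules concrete, here is a minimal sketch of a parser for this subset of the syntax. It is not the tool's own parser: the function name parse_wikitable is hypothetical, and the sketch deliberately ignores cell attributes, multi-line cells, and rowspan/colspan.

```python
def parse_wikitable(markup):
    """Return (headers, rows) from simple MediaWiki table markup (illustrative only)."""
    headers, rows, current = [], [], []
    for raw in markup.splitlines():
        line = raw.strip()
        if not line or line.startswith('{|') or line.startswith('|+'):
            continue  # blank line, table opening, or caption
        if line.startswith('|-') or line.startswith('|}'):
            # Row separator or table end: store the row collected so far.
            if current:
                rows.append(current)
                current = []
            continue
        if line.startswith('!'):
            headers.extend(h.strip() for h in line[1:].split('!!'))
        elif line.startswith('|'):
            current.extend(c.strip() for c in line[1:].split('||'))
    if current:
        rows.append(current)
    return headers, rows

markup = '''{| class="wikitable" border="1"
! Name !! Age !! City !! Salary
|-
| John Doe || 28 || New York || 75000
|-
| Jane Smith || 34 || London || 82000
|}'''

headers, rows = parse_wikitable(markup)
print(headers)
print(rows)
```

All cells come back as strings at this stage; type detection would be applied as a separate step.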

How are data types inferred for each cell?

The converter looks at each cell value and classifies it as integer, float, boolean, string, or None-like (empty, "null", or "none"). It then formats values appropriately for Python so that Pandas can interpret them correctly when building the DataFrame.

What happens if some rows have missing values?

Empty cells are converted to None in the generated Python code. When the DataFrame is created, Pandas will treat these as missing values (NaN for numeric columns), allowing you to use standard missing-data handling functions.
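
A short example of what this looks like in practice, using made-up data with one empty Age cell (the names and values are purely illustrative):

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['John Doe', 'Jane Smith', 'Alex Lee'],
    'Age': [28, None, 41],   # None stands for an empty table cell
})

# The numeric column is stored as float so it can hold NaN.
print(df['Age'].dtype)

# Count the missing values in the column.
print(df['Age'].isna().sum())

# Either fill the gap (here with the column mean) or drop incomplete rows.
filled = df.fillna({'Age': df['Age'].mean()})
complete = df.dropna(subset=['Age'])
print(filled)
print(complete)
```

Note that a column of whole numbers becomes floating point once it contains a missing value; if integer output matters, pandas' nullable Int64 dtype is one option.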

Can I change the DataFrame variable name?

Yes. You can set a custom variable name in the options panel. The generated code will assign the DataFrame to that variable so you can integrate it directly into your existing scripts or notebooks.

Is any of my table data uploaded to a server?

No. All parsing and conversion happen locally in your browser. Your MediaWiki table content is never sent to any server, ensuring your data remains private.

Privacy & Security

All conversions happen locally in your browser. Your MediaWiki data is never uploaded to any server, ensuring complete privacy and security.