🦀 Parliamentary Polarization Map

Elite polarization, measured from parliamentary speech

An interactive map of how legislators talk about their own and rival parties, built from millions of political mentions extracted across national parliaments. It includes the world map, country trajectories, cross-national comparisons and the underlying data.

How the map is built

This project aggregates parliamentary data worldwide to measure elite polarization by analyzing what politicians say about one another inside parliaments. It uses an AI-agent-driven pipeline for parliamentary data collection, harmonization, and analysis, orchestrated through the OpenClaw framework.

The system collects transcripts from academic archives across many countries. Where a corpus exists only on a parliamentary website, it is acquired by web scraping. Because many parliaments publish proceedings as PDFs, the pipeline separates born-digital PDFs, where text is extracted directly, from scanned-image PDFs, where optical character recognition is applied. For video-only records, it extracts official captions or subtitles; where none exist, it tests automated speech recognition on a sample before transcribing the full record.
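The split between born-digital and scanned-image PDFs can be illustrated with a small routing function. This is a minimal sketch rather than the project's actual logic: it assumes the embedded text layer of each page has already been extracted by some PDF library, and the names `Route` and `route_pdf` and both thresholds are hypothetical.

```python
from enum import Enum


class Route(Enum):
    DIRECT_TEXT = "direct_text"  # born-digital PDF: use the embedded text layer
    OCR = "ocr"                  # scanned-image PDF: run optical character recognition


def route_pdf(page_texts: list[str],
              min_chars_per_page: int = 200,
              min_text_share: float = 0.8) -> Route:
    """Choose a route from the text layer already extracted for each page.

    Both thresholds are illustrative assumptions, not values from the project.
    """
    if not page_texts:
        # No pages with a text layer at all: treat the file as a scan.
        return Route.OCR
    pages_with_text = sum(
        1 for text in page_texts if len(text.strip()) >= min_chars_per_page
    )
    share = pages_with_text / len(page_texts)
    return Route.DIRECT_TEXT if share >= min_text_share else Route.OCR
```

A file whose pages mostly carry a usable text layer goes to direct extraction; anything else is sent to OCR. A per-page decision would be a reasonable variant for mixed documents.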

OpenClaw coordinates the workflow across collection, normalization, affiliation recovery, extraction, validation, and escalation. Collected transcripts are normalized into a unified format in which each row is one speaker’s turn, with country, date, speaker name, party, and speech text. Since speaker–party affiliation is often missing, a recovery algorithm uses registry and metadata joins, deterministic date-window rules for known affiliations over time, and fuzzy matching against biographical and parliamentary records. Unresolved cases then fall back to language-model lookup.
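The unified row format and the first steps of affiliation recovery might look like the sketch below. It is an illustration under stated assumptions, not the project's code: the `Turn` and `Spell` classes, the `recover_party` function, and the 0.85 similarity cutoff are hypothetical, and fuzzy matching is done here with the standard library's `difflib` rather than whatever matcher the pipeline uses. The language-model fallback is left out; a `None` result marks a case that would be escalated.

```python
import datetime as dt
from dataclasses import dataclass
from difflib import get_close_matches
from typing import Optional


@dataclass
class Turn:
    """One speaker's turn in the normalized format."""
    country: str
    date: dt.date
    speaker: str
    party: Optional[str]  # None when the source lacks an affiliation
    text: str


@dataclass
class Spell:
    """A known affiliation of a speaker over a date window (hypothetical registry row)."""
    speaker: str
    party: str
    start: dt.date
    end: dt.date


def recover_party(turn: Turn, spells: list[Spell], cutoff: float = 0.85) -> Optional[str]:
    """Return the party for a turn, or None if the case stays unresolved."""
    if turn.party:
        return turn.party
    names = sorted({s.speaker for s in spells})
    if turn.speaker in names:
        name = turn.speaker  # exact registry join
    else:
        # Fuzzy match against registry names; cutoff is an illustrative choice.
        close = get_close_matches(turn.speaker, names, n=1, cutoff=cutoff)
        if not close:
            return None
        name = close[0]
    # Deterministic date-window rule: the spell must cover the speech date.
    for s in spells:
        if s.speaker == name and s.start <= turn.date <= s.end:
            return s.party
    return None
```

For example, a turn by "Jane Smyth" dated 2016 would resolve to the party recorded for "Jane Smith" in a spell covering 2015 to 2019, while a speaker with no close registry match stays unresolved.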

The extraction stage uses the Google Gemma 4 26B LLM to identify passages where a speaker evaluates another named politician, distinguishing attacks or praise from neutral and procedural references. Each mention is coded as in-group or out-group using recovered coalition affiliation. The output is a cross-national dataset of elite-hostility relationships, aggregated by party-pair and time period, covering over 42 million speeches across 82 countries.
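To make the coding and aggregation step concrete, the sketch below takes mentions that have already been labelled by the extraction model and tallies them by party pair and year. The field names, the `coalition_of` lookup, and the choice of year as the time period are assumptions made for illustration; the project's actual schema and coalition data (which may vary over time) are not shown in this text.

```python
from collections import defaultdict

EVALUATIVE = {"attack", "praise"}  # neutral and procedural mentions are dropped


def code_group(speaker_party, target_party, coalition_of):
    """Return 'in-group' if both parties sit in the same coalition, else 'out-group'.

    A party missing from coalition_of is treated as its own bloc (an assumption).
    """
    speaker_bloc = coalition_of.get(speaker_party, speaker_party)
    target_bloc = coalition_of.get(target_party, target_party)
    return "in-group" if speaker_bloc == target_bloc else "out-group"


def aggregate(mentions, coalition_of):
    """Count attacks and praise per (speaker party, target party, year)."""
    counts = defaultdict(lambda: {"attack": 0, "praise": 0})
    for m in mentions:
        if m["stance"] not in EVALUATIVE:
            continue
        key = (m["speaker_party"], m["target_party"], m["year"])
        counts[key][m["stance"]] += 1
    return {
        key: {**tally, "group": code_group(key[0], key[1], coalition_of)}
        for key, tally in counts.items()
    }
```

Each output row is one directed party pair in one year, with its attack and praise counts and its in-group or out-group label.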

How the corpus was acquired

82 corpora · 42.0 million speeches, grouped by how each parliament’s record was obtained.

Ready-download (incl. ParlaMint releases): 56 corpora, 37.2M speeches
Web-scrape (custom site scrapers & APIs): 13 corpora, 3.7M speeches
PDF / OCR (born-digital + scanned-image text): 13 corpora, 1.0M speeches

World map

Choropleth of any indicator, pooled across all years (mention-weighted) or for a single year.
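One plausible reading of "mention-weighted" pooling is a weighted mean in which each year's indicator value counts in proportion to that year's number of mentions. The sketch below shows that reading only; the function name `pooled_value` and the input layout are hypothetical.

```python
def pooled_value(yearly):
    """Mention-weighted mean of an indicator across years.

    yearly: list of (indicator_value, n_mentions) pairs, one per year.
    Returns None when there are no mentions to weight by.
    """
    total_mentions = sum(n for _, n in yearly)
    if total_mentions == 0:
        return None
    return sum(value * n for value, n in yearly) / total_mentions
```

Under this reading, a year with 300 mentions pulls the pooled figure three times as hard as a year with 100, so sparsely documented years do not dominate the map.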

Country profile

Each party's trajectory within a country, showing out-party evaluation by default. Countries without a party breakdown show the national aggregate.

Compare indicators

Scatter two indicators against each other for a given year. Bubble size = corpus volume; colour = region.

Heat map

Country × year grid for one indicator. Grey cells have no data.

Data table

The underlying panel. Switch between country-level and party-level rows; filter, search and sort.