🚧 UNDER CONSTRUCTION 🚧

🌐 Open Politics

Open Source Political Intelligence

Open Politics Political Intelligence Vision

Onboarding:

Open Politics' vision is to democratise political intelligence.

The mission is to create an open-source data science and AI toolkit to analyse, summarise, and visualise political information.

Why Open Politics Exists
Update: SSARE Release
The Webapp
Engage! Developer Jour Fixe
Tasks
AI Models
Data Challenges
Journalistic Challenges
Quality Assurance
Architecture
Usage
Necessary API Keys/Environment Variables
Contributing
Contact
License

As a quick pitch to get you interested, this is where we are going:

Open Politics Vision

Why Open Politics Exists

Everything in politics, be it news, conflicts, or legislative procedures, is hard to keep track of. It is hard to find the time to read through all the documents and news articles necessary to gain a broad and well-informed understanding of political situations. Technology offers great possibilities to make such processes more accessible. Recently, the advent of Large Language Models has extended the capabilities of textual analysis and understanding. In particular, the ability to formulate tasks in natural language opens up new possibilities for analysing text data, potentially revolutionising the way qualitative and quantitative research can be combined.
This project aims to combine the best of natural language LLM interfacing and classical Data Science methods to build tools that provide a comprehensive overview of political topics, including summaries of news articles, information about political actors, and the relationships between them.
The goal of this project is to make politics more accessible and understandable for everyone.

Update: SSARE Release

SSARE (Semantic Search Article Recommendation Engine)

SSARE is Open Politics' data aggregation system and vector storage endpoint. It aims to create up-to-date and relevant datasets for the LLMs to work with. A microservice infrastructure continuously scrapes news sites and stores the articles in a vector store and a relational database (Postgres). Sources can be added with Python scripts that yield a dataframe with the columns URL | Headline | Paragraphs | Source. Just clone the service, add your scripts, and bring your own data endpoint into production.
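To make the source-script contract more concrete, here is a minimal sketch of what such a script could look like. It is not taken from the SSARE codebase: the names `ArticleParser`, `parse_article`, and `to_dataframe`, the lowercase column keys, and the sample HTML are all assumptions for illustration. Only the four columns themselves come from the description above.

```python
from html.parser import HTMLParser

# The four columns are from the README; the lowercase spelling is an assumption.
COLUMNS = ["url", "headline", "paragraphs", "source"]


class ArticleParser(HTMLParser):
    """Hypothetical parser: first <h1> becomes the headline, every <p> a paragraph."""

    def __init__(self):
        super().__init__()
        self.headline = ""
        self.paragraphs = []
        self._tag = None   # tag currently being captured ("h1" or "p")
        self._buffer = []  # text fragments collected inside that tag

    def handle_starttag(self, tag, attrs):
        if tag in ("h1", "p") and self._tag is None:
            self._tag = tag
            self._buffer = []

    def handle_data(self, data):
        if self._tag is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag != self._tag:
            return  # closing tag of a nested element such as <b>
        text = " ".join("".join(self._buffer).split())  # normalise whitespace
        if tag == "h1" and not self.headline:
            self.headline = text
        elif tag == "p" and text:
            self.paragraphs.append(text)
        self._tag = None


def parse_article(url, html, source):
    """Turn one already-downloaded HTML page into a record with the four columns."""
    parser = ArticleParser()
    parser.feed(html)
    return {
        "url": url,
        "headline": parser.headline,
        "paragraphs": parser.paragraphs,
        "source": source,
    }


def to_dataframe(records):
    """Build the dataframe; pandas is third-party and only needed for this step."""
    import pandas as pd
    return pd.DataFrame(records, columns=COLUMNS)


# Made-up sample page standing in for a scraped article.
SAMPLE_HTML = (
    "<html><body><h1>Budget vote passes</h1>"
    "<p>First <b>paragraph</b>.</p><p>Second paragraph.</p></body></html>"
)
record = parse_article("https://example.org/budget", SAMPLE_HTML, "example.org")
```

A real script would download the pages first and would need site-specific selectors. Keeping the parsing separate from the download step makes it easy to test each source against saved HTML.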

The Webapp (Overview):

25.06.2024: The Django stack has been replaced by a FastAPI backend and a Next.js frontend. The stack setup is largely based on Tiangolo's Fullstack Template. Security and configuration updates will be fetched from the template.

Backend: FastAPI, Postgres, Adminer
OpenAPI clients can be generated from the APIs. For the prototype, some static data is served to illustrate future use cases (like the leaders of countries, articles on the globe, etc.). The rest of the data is pulled from SSARE (https://github.com/JimVincentW/SSARE) and external services like Tavily, the Bundestag API, the OECD API, etc.

The methodic layer, which decides what information is presented and when, currently lives in the frontend and relies on the Vercel AI SDK. The plan is to dynamically choose data analysis and presentation methods and to present them as streaming generative interfaces.

Once it is clearer how FastAPI can provide the same capabilities as the Vercel AI SDK, the methodic layer will be moved to the backend.

Frontend: The frontend is now a Next.js app using shadcn as a UI library, amCharts for the globe, and axios for data fetches to the backend. The roadmap foresees that Open Politics will provide a unified interface for data resource management.

Want to engage? Look into our Developer Jour Fixe!

Interested in the project? Want to contribute? Share a thought?
Every Wednesday, 15:30 Berlin time
Discord Server: join and talk about the project, ask questions, propose ideas, or just listen in.
Currently needed:
Data Scraper Modules
Interdisciplinary collaboration on the instruction sets for the LLMs
Prompt Engineering suggestions
Frontend/UX/UI work

Tasks

MVP Elements:

Issue Area Identification
Actor Identification / Named Entity Recognition
Stance Triangulation

Including but not limited to tasks like:

Information summarization
Vector storage & retrieval
Information clustering
Entity Extraction (Named Entity Recognition)
Q&A Chatbots (for interactive information)
Providing historical context
Statement & Intention decoding
Visual representation of political data
Monitoring and alerts
Fact-checking (information triangulation)

AI Models

Open source is our friend.
Developing consistent and reliable AI methods is hard with API-based models. We thus aim to use open-source models and frameworks, e.g. Ollama and Hugging Face for model inference and LangChain for prompt engineering.
More concrete information on the actual setup is laid out in the Architecture section.
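As a rough illustration of local inference, the sketch below sends an instruction plus an article text to a locally running Ollama server through its `/api/generate` REST endpoint. It is an assumption about how such a call could look, not the project's actual setup: the helper names `build_request` and `generate`, the prompt layout, and the model name `llama3` are hypothetical.

```python
import json
import urllib.request

# Ollama's default local address; adjust if the server runs elsewhere.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(model, instruction, text):
    """Prepare a non-streaming generate request (hypothetical prompt layout)."""
    payload = {
        "model": model,
        "prompt": f"{instruction}\n\n{text}",
        "stream": False,  # ask for a single JSON object instead of a stream
    }
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(model, instruction, text, timeout=120):
    """Send the request and return the generated text (needs a running server)."""
    request = build_request(model, instruction, text)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read())["response"]


# "llama3" is only an example; use whichever model has been pulled locally.
example_request = build_request(
    "llama3",
    "Summarise the following article in two sentences.",
    "Parliament approved the budget after a long debate.",
)
```

Calling `generate(...)` with the same arguments returns the model's answer as a string once an Ollama server is running and the model is available. Pinning a local model this way keeps results reproducible across runs, which is the concern raised above about API-based models.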

Data Challenges

Addressing training data bias
Effective prompt engineering
Creating Datasets & Benchmarks
Robust and scalable data pipelines
Training and fine-tuning LLMs

Journalistic Challenges

Balanced News Sources
Fact-checking
Interdependence of news sources
Interdependence of summaries

Quality Assurance

Automatic evaluation pipelines for results are needed (conciseness, accuracy, bias weighting, etc.)

Architecture

Frontend

The Challenge

A fundamental challenge this project has to tackle is how to make information and insights accessible.

Vision

The "Open Globe" interface is designed to display articles and events interactively on a globe. It should let users browse global news in an exploratory way. Heatmaps of events, time series and timeline scrolling, and filters for issue areas are also on the roadmap.

Current UI