Install GoldenMatch from PyPI with pip. Optional extras add embeddings, LLM scoring, database sync, and more.
pip install goldenmatch
Requires Python 3.11 or later. Core dependencies: Polars, RapidFuzz, Typer, Pydantic, Textual.
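A quick preflight check can confirm the interpreter version and the core dependencies before you go further. The following is a minimal sketch, not part of GoldenMatch: the helper names are hypothetical, and the import names are assumed to match the lowercase package names listed above.

```python
import sys
from importlib.util import find_spec

# Minimum interpreter version stated in the requirements above.
MIN_PYTHON = (3, 11)

# Assumption: each core dependency is importable under its lowercase name.
CORE_MODULES = ["polars", "rapidfuzz", "typer", "pydantic", "textual"]


def python_ok(version_info=sys.version_info) -> bool:
    """Return True if the (major, minor) version meets the minimum."""
    return tuple(version_info[:2]) >= MIN_PYTHON


def missing_modules(names=CORE_MODULES) -> list[str]:
    """Return the top-level module names that cannot be found."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    print("Python version OK:", python_ok())
    print("Missing core modules:", missing_modules() or "none")
```

Run it with the same interpreter you used for `pip install`, so the check looks at the right environment.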
pip install "goldenmatch[embeddings]" # sentence-transformers + FAISS
pip install "goldenmatch[llm]" # Claude/OpenAI for LLM scoring
pip install "goldenmatch[postgres]" # PostgreSQL database sync
pip install "goldenmatch[snowflake]" # Snowflake connector
pip install "goldenmatch[bigquery]" # BigQuery connector
pip install "goldenmatch[databricks]" # Databricks connector
pip install "goldenmatch[salesforce]" # Salesforce connector
pip install "goldenmatch[duckdb]" # DuckDB out-of-core backend
pip install "goldenmatch[quality]" # GoldenCheck data quality scanning
pip install "goldenmatch[ray]" # Ray distributed backend
Install multiple extras at once:
pip install "goldenmatch[embeddings,llm,postgres]"
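Square brackets are glob characters in shells such as zsh, so an unquoted `goldenmatch[...]` argument can fail with "no matches found". If you generate install commands from a script, a small helper can build the requirement string and quote it. This is an illustrative sketch: `pip_install_command` and `KNOWN_EXTRAS` are hypothetical names, and the set simply mirrors the extras listed above.

```python
import shlex

# Extras taken from the list above; update this set if the package adds more.
KNOWN_EXTRAS = {
    "embeddings", "llm", "postgres", "snowflake", "bigquery",
    "databricks", "salesforce", "duckdb", "quality", "ray",
}


def pip_install_command(extras=()) -> str:
    """Build a shell-safe `pip install` command for the given extras."""
    extras = list(extras)
    unknown = sorted(set(extras) - KNOWN_EXTRAS)
    if unknown:
        raise ValueError(f"unknown extras: {', '.join(unknown)}")
    requirement = "goldenmatch"
    if extras:
        requirement += "[" + ",".join(extras) + "]"
    # shlex.quote adds single quotes only when the string needs them.
    return "pip install " + shlex.quote(requirement)


if __name__ == "__main__":
    print(pip_install_command(["embeddings", "llm", "postgres"]))
```

Rejecting unknown names early catches typos before pip reports a missing extra as a warning.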
docker pull ghcr.io/benzsevern/goldenmatch:latest
# Run a dedupe
docker run --rm -v "$(pwd)":/data ghcr.io/benzsevern/goldenmatch:latest \
dedupe /data/customers.csv --output-dir /data/results
# Start the REST API
docker run --rm -p 8080:8080 -v "$(pwd)":/data ghcr.io/benzsevern/goldenmatch:latest \
serve --file /data/customers.csv --port 8080
Pre-built packages for the PostgreSQL extension (separate from the Python package; the file names below target PostgreSQL 16):
# Debian/Ubuntu
sudo dpkg -i goldenmatch-pg-0.1.0-pg16-amd64.deb
sudo systemctl restart postgresql
# RHEL/Fedora
sudo rpm -i goldenmatch-pg-0.1.0-pg16.x86_64.rpm
sudo systemctl restart postgresql
Download the .deb and .rpm files from the goldenmatch-extensions releases page.
pip install goldenmatch-duckdb
import duckdb, goldenmatch_duckdb
con = duckdb.connect()
goldenmatch_duckdb.register(con)
con.sql("SELECT goldenmatch_score('John', 'Jon', 'jaro_winkler')")
pip install dbt-goldenmatch
The dbt-goldenmatch package provides macros for running entity resolution inside dbt pipelines using DuckDB.
goldenmatch --version
# goldenmatch 1.1.1
goldenmatch demo
# Runs a built-in demo with sample data
import goldenmatch as gm
print(gm.__version__) # "1.1.1"
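To check the installed version without importing the package (useful when an optional dependency fails at import time), you can read the distribution metadata from the standard library. A minimal sketch, assuming the distribution is named `goldenmatch` as in the pip command above; `installed_version` is a hypothetical helper name.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name: str = "goldenmatch"):
    """Return the installed version string, or None if it is not installed."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version()
    print(found if found is not None else "goldenmatch is not installed")
```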
| Variable | Purpose |
|---|---|
| OPENAI_API_KEY | LLM scorer and LLM boost (OpenAI) |
| ANTHROPIC_API_KEY | LLM scorer (Claude) |
| DATABASE_URL | PostgreSQL connection string for sync / watch |
| GOOGLE_APPLICATION_CREDENTIALS | Vertex AI embeddings (GCP service account) |
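Before running a feature that depends on one of the variables in the table above, it can help to see which ones are set in the current environment. The sketch below only reports presence and never prints values; `ENV_VARS` and `env_report` are hypothetical names, not part of GoldenMatch.

```python
import os

# Variable names and purposes copied from the table above.
ENV_VARS = {
    "OPENAI_API_KEY": "LLM scorer and LLM boost (OpenAI)",
    "ANTHROPIC_API_KEY": "LLM scorer (Claude)",
    "DATABASE_URL": "PostgreSQL connection string for sync / watch",
    "GOOGLE_APPLICATION_CREDENTIALS": "Vertex AI embeddings (GCP service account)",
}


def env_report(environ=os.environ) -> dict[str, bool]:
    """Map each variable name to True if it is set to a non-empty value."""
    return {name: bool(environ.get(name)) for name in ENV_VARS}


if __name__ == "__main__":
    for name, is_set in env_report().items():
        status = "set" if is_set else "missing"
        print(f"{name:32} {status:8} {ENV_VARS[name]}")
```

A missing variable matters only if you use the corresponding feature, so treat the output as a checklist rather than an error report.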
Run the interactive wizard to configure GPU mode, API keys, and database connections:
goldenmatch setup
The wizard guides you through:
~/.goldenmatch/settings.yaml