Best 18 Deduplication Open Source Projects

Borg

Deduplicating archiver with compression and authenticated encryption.

Restic

Fast, secure, efficient backup program

Borgmatic

Simple, configuration-driven backup software for servers and workstations

Rdedup

Data deduplication engine, supporting optional compression and public key encryption.

Kopia

Cross-platform backup tool for Windows, macOS & Linux with fast, incremental backups, client-side end-to-end encryption, compression and data deduplication. CLI and GUI included.

Libpostal

A C library for parsing/normalizing street addresses around the world. Powered by statistical NLP and open geo data.

Alertmanager

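Prometheus Alertmanager: handles alerts sent by client applications such as the Prometheus server, deduplicating, grouping, and routing them to the correct receiver.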

Dupeguru

Find duplicate files

Rmlint

Extremely fast tool to remove duplicates and other lint from your filesystem

Jdupes

A powerful duplicate file finder and an enhanced fork of 'fdupes'.

Talisman

Straightforward fuzzy matching, information retrieval and NLP building blocks for JavaScript.

Recordlinkage

A toolkit for record linkage and duplicate detection in Python

Data Matching Software

A list of free data matching and record linkage software.

LSH

Locality Sensitive Hashing using MinHash in Python/Cython to detect near duplicate text documents

Kvdo

A pair of kernel modules which provide pools of deduplicated and/or compressed block storage.
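
Several of the duplicate file finders listed above target exact duplicates, meaning files whose contents are byte-for-byte identical. As a rough illustration of that general idea, and not the implementation of any project on this list, the sketch below groups files by size first and then by SHA-256 digest. The names `file_digest` and `find_duplicates` are hypothetical and exist only for this example.

```python
import hashlib
import os
from collections import defaultdict


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_duplicates(root):
    """Return lists of paths under `root` whose contents are identical.

    Hypothetical helper for illustration only: files are bucketed by size
    (cheap), and only same-size files are hashed (more expensive).
    """
    by_size = defaultdict(list)
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            # Skip symlinks and anything that is not a regular file.
            if os.path.isfile(path) and not os.path.islink(path):
                by_size[os.path.getsize(path)].append(path)

    groups = []
    for paths in by_size.values():
        if len(paths) < 2:
            continue  # a unique size cannot have a duplicate
        by_hash = defaultdict(list)
        for path in paths:
            by_hash[file_digest(path)].append(path)
        groups.extend(sorted(g) for g in by_hash.values() if len(g) > 1)
    return groups
```

Calling `find_duplicates("some/directory")` returns a list of groups, each a sorted list of paths with identical content. Real tools typically add further refinements (partial hashing, hard-link detection, byte-by-byte confirmation) that this sketch leaves out.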