Locality Sensitive Hashing using MinHash in Python/Cython to detect near...
Filter, Sort & Delete Duplicate Files Recursively
Productivity improvements for Rust ecosystem: warnings are skipped until...
A kernel module which provides a pool of deduplicated and/or compressed b...
Deduplicating archiver with encryption and paranoid-level tests. Swiss a...
Userspace tools for managing VDO volumes.
Framework and command-line tools for integrating FollowTheMoney data str...
Fast block-level out-of-band BTRFS deduplication tool.
Quickly detect already witnessed data.
CLI utility to find near-duplicate images and remove all but the best copy.
Make it easier to compare and cross-reference the names of companies and...
PyTorch library for transforming entities like companies, products, etc....
Benji Backup: A block-based deduplicating backup software for Ceph RBD ...
FastCDC implementation in Rust
Spark RDD with Lucene's query and entity linkage capabilities