Similarity Uniform Fuzzy Hash Versions

Similarity algorithm with linear complexity, based on context-triggered piecewise (fuzzy) hashes. It computes the similarity between two files as a score from 0 to 1.

1.8.4

6 years ago

1.8.4 release.

Java and libraries updated.

Added ToStringUtils unescapeCsv and splitCsv methods. ToStringUtils static methods are now public.

Updated Maven and JDK.

Multiple changes in the Java classes.

- Added the methods computeHashToHashesSimilarities (computes the similarities between a hash and a map of hashes and returns them as a map) and computeAllHashesSimilarities (computes the similarities between all the hashes in a map and returns them as a map) to the class UniformFuzzyHashes.java.

- Added the methods sortSimilarities (sorts a map of similarities between a hash and a map of hashes by a type of similarity) and sortMap (sorts any map in the same order as another sorted map).

- Removed hash statistics and characteristics because they were rarely used.

- Removed the similarity cache from the class UniformFuzzyHash.java. It is no longer necessary, since caching can now be managed outside the class with the new methods computeHashToHashesSimilarities and computeAllHashesSimilarities.

- Moved the SimilarityTypes enum from the class UniformFuzzyHashes.java to the class UniformFuzzyHash.java. Added the methods similarity(hash, similarityType) (computes and returns a type of similarity between two hashes) and similarities (computes all the types of similarities between two hashes and returns them as a map) to the class UniformFuzzyHash.java.

- Changed all the methods in the class UniformFuzzyHashes.java that used a map of names (string) -> objects to use a generic map of identifiers (any type) -> objects.

- Changed the methods printHashToHashesSimilaritiesTable, saveHashToHashesSimilaritiesAsCsv, printAllHashesSimilaritiesTable and saveAllHashesSimilaritiesAsCsv in the class UniformFuzzyHashes.java to receive a map of similarities.

- Added the methods collectionToMap (builds a map of identifiers -> objects from a collection of objects, identifying them by index), mapValuesToList (builds a list from the values in a map) and mapKeysToList (builds a list from the keys in a map) to the class UniformFuzzyHashes.java.

- Removed the methods in the class UniformFuzzyHashes.java that used collections instead of maps. They are no longer necessary thanks to the new methods collectionToMap, mapValuesToList, mapKeysToList and sortMap.
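
The map helpers described in the list above can be illustrated with a minimal, self-contained sketch. The method names follow the release notes, but the class name MapHelpersSketch, the signatures, the zero-based indexing, and the helper sortByValueDescending are assumptions made for illustration, not the library's actual code.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: names mirror the release notes, but signatures
// and bodies are assumptions, not the library's implementation.
final class MapHelpersSketch {

    // Builds an insertion-ordered map of index -> element (index starts at 0, an assumption).
    static <T> Map<Integer, T> collectionToMap(Collection<T> collection) {
        Map<Integer, T> map = new LinkedHashMap<>();
        int index = 0;
        for (T element : collection) {
            map.put(index++, element);
        }
        return map;
    }

    // Returns the keys of the map as a list, in the map's iteration order.
    static <K, V> List<K> mapKeysToList(Map<K, V> map) {
        return new ArrayList<>(map.keySet());
    }

    // Returns the values of the map as a list, in the map's iteration order.
    static <K, V> List<V> mapValuesToList(Map<K, V> map) {
        return new ArrayList<>(map.values());
    }

    // Hypothetical stand-in for sorting similarities: highest score first.
    static <K> Map<K, Double> sortByValueDescending(Map<K, Double> similarities) {
        List<Map.Entry<K, Double>> entries = new ArrayList<>(similarities.entrySet());
        entries.sort(Map.Entry.<K, Double>comparingByValue().reversed());
        Map<K, Double> result = new LinkedHashMap<>();
        for (Map.Entry<K, Double> entry : entries) {
            result.put(entry.getKey(), entry.getValue());
        }
        return result;
    }

    // Returns a copy of 'map' whose entries follow the key order of 'sorted'.
    // Keys of 'map' that are missing from 'sorted' are appended at the end.
    static <K, V> Map<K, V> sortMap(Map<K, V> map, Map<K, ?> sorted) {
        Map<K, V> result = new LinkedHashMap<>();
        for (K key : sorted.keySet()) {
            if (map.containsKey(key)) {
                result.put(key, map.get(key));
            }
        }
        for (Map.Entry<K, V> entry : map.entrySet()) {
            result.putIfAbsent(entry.getKey(), entry.getValue());
        }
        return result;
    }
}
```

Under these assumptions, a collection of hashes would first be turned into a map with collectionToMap, a map of similarities keyed by the same identifiers would be sorted, and sortMap would then reorder the hashes to match. Using LinkedHashMap keeps the resulting order stable when the maps are converted back to lists.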

1.7.1

6 years ago

1.7.1 release.

Hash rebuild from string representation optimized.

Hash string representation changed to base36 (alphanumeric). New look:

101:yvvmzi/1p-w9haa5/1h-6ccqw2/1u-bzhsr1/f-bhias1/4p-naflv2/o-bzr4g9/65-cq8afb/a7-nqtlg6/o
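
As a rough illustration of base36 (digits 0-9 followed by letters a-z), the following sketch converts numbers to and from that radix with the standard Java API. The class name Base36Sketch is hypothetical, and the sketch makes no claim about how the library encodes or lays out the individual fields of a hash string.

```java
// Hypothetical sketch of base-36 conversion using only the JDK.
// It does not reproduce the library's hash format.
final class Base36Sketch {

    // Character.MAX_RADIX is 36: digits 0-9 plus letters a-z.
    private static final int RADIX = Character.MAX_RADIX;

    // Encodes a number as a lowercase base-36 string.
    static String encode(long value) {
        return Long.toString(value, RADIX);
    }

    // Decodes a base-36 string back into a number.
    // Throws NumberFormatException if the text is not valid base 36.
    static long decode(String text) {
        return Long.parseLong(text, RADIX);
    }
}
```

For example, Base36Sketch.decode("1p") returns 61 (1 × 36 + 25), and encoding that value again yields "1p".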

Hash ASCII representation removed.

Similarity caching is now optional.

BlocksSet and SimilaritiesCache are now built lazily, only when they are first used.

1.6.1

6 years ago

1.6.1 release.

Similarity tables can now be saved to CSV.

Hashes and CSV files are now read and written line by line.
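
Line-by-line processing keeps memory use independent of file size. A minimal sketch of the general technique with the standard Java I/O API follows. The class LineByLineSketch and the method copyLines are hypothetical and are not part of the library.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.function.UnaryOperator;

// Hypothetical sketch of streaming a text file one line at a time.
final class LineByLineSketch {

    // Reads 'in' line by line, applies 'transform' to each line and writes
    // the result to 'out'. Returns the number of lines processed.
    static int copyLines(Path in, Path out, UnaryOperator<String> transform) throws IOException {
        int count = 0;
        // try-with-resources closes both streams even if an exception is thrown.
        try (BufferedReader reader = Files.newBufferedReader(in, StandardCharsets.UTF_8);
             BufferedWriter writer = Files.newBufferedWriter(out, StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) {
                writer.write(transform.apply(line));
                writer.newLine();
                count++;
            }
        }
        return count;
    }
}
```

Only one line is held in memory at a time, so the same loop works for a file of hashes or a large CSV table.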

1.5.1

6 years ago

1.5.1 release.

MarkAbove and MarkBelow.

Released to Sonatype and Maven Central.

GroupId changed to com.github.s3curitybug. Main package changed to com.github.s3curitybug.similarityuniformfuzzyhash. POM: added distribution management and the Nexus staging plugin; the source, javadoc and GPG Maven plugins; and the description, url, inceptionYear, developers, license, scm and issueManagement elements. License.txt and Notice.txt added to META-INF.

1.4

6 years ago

1.4 release.

Hash ASCII representation. Factor must be odd.

1.3

7 years ago

1.3 release.

Command-line help updated. The option --compareToAll now accepts multiple arguments.

1.2

7 years ago

1.2 release.

Similarity types.

1.1

7 years ago

1.1 release.

New command line interface.

1.0

7 years ago

1.0 release.