Abstract Algebra for Scala
This is the first Algebird release to publish Scala 2.12 artifacts! Apart from that, here are some of the changes since the last release:
Various BloomFilter improvements:
  Remove seed variable in BloomFilter and rename k to hashIndex: https://github.com/twitter/algebird/pull/557
  Polymorphic Bloom filters: https://github.com/twitter/algebird/pull/607
  Optimize BloomFilter significantly: https://github.com/twitter/algebird/pull/610
  Bloom filter distance function: https://github.com/twitter/algebird/pull/612
  Optimize Hamming distance for Bloom Filters: https://github.com/twitter/algebird/pull/617
Incorporate more Algebra types:
  Use standard algebra types: https://github.com/twitter/algebird/pull/523
  Use more algebra types: https://github.com/twitter/algebird/pull/620
SpaceSaver updates:
  Widen the visibility of SpaceSaver.SSMany, SpaceSaver.SSOne: https://github.com/twitter/algebird/issues/577
  SpaceSaver fromBytes & toBytes: https://github.com/twitter/algebird/pull/603
  Catch OOM in SpaceSaverTest: https://github.com/twitter/algebird/pull/614
Remove typeclass from interval constructor: https://github.com/twitter/algebird/pull/605
Better toString in ExpHistogram: https://github.com/twitter/algebird/pull/604
Remove legacy CountMinSketchMonoid: https://github.com/twitter/algebird/pull/602
Convert all laws to take Equiv instances, deprecate Equiv versions: https://github.com/twitter/algebird/pull/595
Replace FromIntLike with Ring and toK function: https://github.com/twitter/algebird/pull/594
Bail out of SemigroupMacro.sumOption for to.isEmpty: https://github.com/twitter/algebird/pull/599
Handle empty in Generated{Product, Abstract}Algebra: https://github.com/twitter/algebird/pull/597
Add explicit return types to Group instances for Moments, AveragedValue: https://github.com/twitter/algebird/pull/596
Remove view bounds on Moments, DecayedValue, AveragedValue: https://github.com/twitter/algebird/pull/592
Add MonoidAggregator.collectBefore: https://github.com/twitter/algebird/pull/611
Thanks to @johandahlberg, @johnynek, @ElPicador, @sritchie, and @isnotinvain for the contributions!
This is an early release of some Scala 2.12 Algebird packages and contains some binary-incompatible changes. Please use release 0.13.0 instead (https://github.com/twitter/algebird/releases/tag/0.13.0), which contains the appropriate set of Scala 2.12 Algebird artifacts.
The main new feature of this release is a faster (benchmarked!) implementation of sumOption for tuple and product semigroups. If you are aggregating on Scalding or Spark, you should see a significant speedup (roughly 2x).
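To illustrate why a specialized sumOption can beat repeated pairwise plus on tuples, here is a minimal standalone sketch. The `Semigroup` trait and `pairSemigroup` value below are hypothetical stand-ins written for this example, not Algebird's actual classes, and the real implementation may be organized differently. The idea shown is simply that summing each tuple position into its own accumulator avoids building an intermediate tuple per element.

```scala
// Standalone sketch; these definitions are illustrative and are NOT Algebird's classes.
object TupleSumSketch {
  trait Semigroup[T] {
    def plus(a: T, b: T): T
    // Default: combine pairwise, producing one intermediate T per element.
    def sumOption(items: Iterator[T]): Option[T] =
      if (items.hasNext) Some(items.reduce(plus)) else None
  }

  // Pair semigroup whose sumOption keeps one running accumulator per
  // position instead of allocating a new tuple for every input.
  val pairSemigroup: Semigroup[(Long, Long)] = new Semigroup[(Long, Long)] {
    def plus(a: (Long, Long), b: (Long, Long)): (Long, Long) =
      (a._1 + b._1, a._2 + b._2)

    override def sumOption(items: Iterator[(Long, Long)]): Option[(Long, Long)] =
      if (!items.hasNext) None
      else {
        var left = 0L
        var right = 0L
        while (items.hasNext) {
          val (l, r) = items.next()
          left += l
          right += r
        }
        Some((left, right))
      }
  }
}
```

The result must agree with what repeated plus would give; only the amount of intermediate allocation differs.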
There is a new Set membership monoid called SetDiff. It can model adding to and removing from sets (which can be useful for applications in Summingbird).
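As a rough illustration of how additions and removals can form a monoid, the sketch below tracks a set of added elements and a set of removed elements, and lets later operations override earlier ones. The `Diff` class and its helpers are hypothetical names chosen for this example; they are an assumption about the general idea, not a copy of Algebird's SetDiff API.

```scala
// Standalone sketch of a set-difference monoid; NOT Algebird's SetDiff class.
object SetDiffSketch {
  final case class Diff[T](added: Set[T], removed: Set[T]) {
    // Associative merge: operations in `later` win over earlier ones, so a
    // later removal cancels an earlier addition and vice versa.
    def merge(later: Diff[T]): Diff[T] =
      Diff(
        (added -- later.removed) ++ later.added,
        (removed -- later.added) ++ later.removed
      )

    // Apply the accumulated changes to an existing set.
    def apply(previous: Set[T]): Set[T] = (previous ++ added) -- removed
  }

  // Identity element: no additions and no removals.
  def empty[T]: Diff[T] = Diff(Set.empty[T], Set.empty[T])
  def add[T](t: T): Diff[T] = Diff(Set(t), Set.empty[T])
  def remove[T](t: T): Diff[T] = Diff(Set.empty[T], Set(t))
}
```

For example, merging `add(1)`, `add(2)`, and `remove(1)` in that order and applying the result to `Set(3)` yields `Set(2, 3)`.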
We have an exponential histogram Fold, an approximate data structure that can report approximate counts over sliding windows (see #568). Future work will add a monoid for this type; however, when possible, using the Fold is preferable since it has better error properties.
Lastly, there are many new docs.
Huge thanks to @sritchie who was the main contributor to this release.
SetDiff data structure to algebird-core: https://github.com/twitter/algebird/pull/555
Ring[BigDecimal], modeled after Ring[BigInt]: https://github.com/twitter/algebird/pull/553
algebird-core as ExpHist: https://github.com/twitter/algebird/pull/568
sbt-microsites plugin, along with docs for all typeclasses and data structures: https://github.com/twitter/algebird/pull/576
Arbitrary and Gen instances to algebird-test, under com.twitter.algebird.scalacheck.{ gen, arbitrary }: https://github.com/twitter/algebird/pull/579
Monoid[Max[Vector[T]]], Monoid[Max[Stream[T]]]: https://github.com/twitter/algebird/pull/579
FirstAggregator and LastAggregator, and docs and API / perf improvements for First, Last, Min, Max: https://github.com/twitter/algebird/pull/579
LawsEquiv versions of all laws: https://github.com/twitter/algebird/pull/584
Future/Try: https://github.com/twitter/algebird/pull/584
metricsLaws[T] to BaseProperties in algebird-test: https://github.com/twitter/algebird/pull/584
Tuple2Monoid, etc to extend TupleNSemigroup, giving subclasses access to efficient sumOption: https://github.com/twitter/algebird/pull/585
Generated{Abstract,Product}Algebra.sumOption with benchmarking: https://github.com/twitter/algebird/pull/591
sumOption, +, -, methods and docs to AveragedValue: https://github.com/twitter/algebird/pull/589
This is an optimization and bug-fix release that is compatible with 0.12.x. We add two new features: Semigroup.maybePlus[T](t: T, o: Option[T]): T, and Aggregator.numericSum, which converts values of any scala.math.Numeric type to Double and sums them.
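The two additions can be pictured with the standalone sketch below. It assumes that maybePlus combines `t` with the optional value when one is present and returns `t` unchanged otherwise, which is a reading of the signature quoted above rather than a statement of the library's documented behavior. The `numericSum` here is a plain function over an `Iterable`, written only to show the convert-then-sum idea; in Algebird it is exposed on Aggregator, and its exact shape is not reproduced here. The `Semigroup` trait and `intSemigroup` instance are hypothetical.

```scala
// Standalone sketch; these definitions are illustrative and are NOT Algebird's API.
object MaybePlusSketch {
  trait Semigroup[T] { def plus(a: T, b: T): T }

  // Assumed semantics: combine t with o when o is defined,
  // otherwise return t unchanged.
  def maybePlus[T](t: T, o: Option[T])(implicit sg: Semigroup[T]): T =
    o match {
      case Some(other) => sg.plus(t, other)
      case None        => t
    }

  // Convert each value to Double through scala.math.Numeric, then add.
  def numericSum[T](items: Iterable[T])(implicit num: Numeric[T]): Double =
    items.foldLeft(0.0)((acc, x) => acc + num.toDouble(x))

  implicit val intSemigroup: Semigroup[Int] = new Semigroup[Int] {
    def plus(a: Int, b: Int): Int = a + b
  }
}
```

A helper like maybePlus is convenient when folding a new value into a running total that may not exist yet, since it avoids a separate pattern match at every call site.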
The full change log is below. Thanks to all contributors!
CMS.create(Seq[K]) #537
This release adds many convenience methods to Aggregator, adds a new type called Batched[T], and speeds up CMS.
Aggregator now has methods for reservoir sampling, and more top-K (sort*Take) aggregators. Batched lets you defer doing any work on plus until a certain number of items has accumulated, at which point it calls sumOption internally. This is designed for aggregations that are expensive to compute one element at a time but for which sumOption can be made efficient. Lastly, CMS performance was significantly improved, a sumOption method was added, and a mutable builder (CMSSummation) was added (see #533).
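The batching idea described above can be sketched as follows. `Buffered`, `single`, and the `batchSize` parameter are hypothetical names for this example only and are not Algebird's Batched[T] API. The sketch buffers values on plus and collapses them with sumOption only once the buffer reaches the chosen size.

```scala
// Standalone sketch of deferred combination; NOT Algebird's Batched[T].
object BatchedSketch {
  trait Semigroup[T] {
    def plus(a: T, b: T): T
    // An implementation may override this with a faster bulk version.
    def sumOption(items: Seq[T]): Option[T] = items.reduceOption(plus)
  }

  // Pending values are stored newest first and combined lazily.
  final case class Buffered[T](pending: List[T]) {
    def plus(that: Buffered[T], batchSize: Int)(implicit sg: Semigroup[T]): Buffered[T] = {
      val all = that.pending ::: pending
      if (all.size < batchSize) Buffered(all)           // keep deferring
      else Buffered(sg.sumOption(all.reverse).toList)   // collapse in original order
    }

    // Force the final value, combining whatever is still pending.
    def sum(implicit sg: Semigroup[T]): Option[T] = sg.sumOption(pending.reverse)
  }

  def single[T](t: T): Buffered[T] = Buffered(List(t))

  // String concatenation is not commutative, so it exposes ordering mistakes.
  implicit val concat: Semigroup[String] = new Semigroup[String] {
    def plus(a: String, b: String): String = a + b
  }
}
```

Reversing the buffer before summing preserves the original left-to-right order, which matters for semigroups that are not commutative.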
This release should be 100% binary compatible with 0.12.0 (this check is now part of the travis-ci checks we run).
Thank you to: @joshualande @non @dossett @jnievelt @piyushnarang @koertkuipers @Gabriel439 @NathanHowell @johnynek @ianoc
Version 0.11.0
Move CMSHasherByteArray from scalding: https://github.com/twitter/algebird/pull/467
Upgrade sbt launcher script (sbt-extras): https://github.com/twitter/algebird/pull/469
Create case class macros for algebraic structures: https://github.com/twitter/algebird/pull/466
Refactor MapAggregator: https://github.com/twitter/algebird/pull/462
Algebird support for spark: https://github.com/twitter/algebird/pull/397
Add MapAggregator from 1 (key, aggregator) pair: https://github.com/twitter/algebird/pull/452
Remove unnecessary use of scala.math: https://github.com/twitter/algebird/pull/455
Don't call deprecated HyperLogLog methods in tests: https://github.com/twitter/algebird/pull/456
Update product_generators.rb: https://github.com/twitter/algebird/pull/457
Pzheng/gaussian euclidean: https://github.com/twitter/algebird/pull/448