Typelevel Frameless Versions

Expressive types for Spark.

v0.8.0

5 years ago

New additions:

  • Add column functions: round, signum by @dlinov

  • Add column functions: log, hypot, pow, pmod by @OlivierBlanvillain

  • Spark 2.4 and Scala 2.12 support by @ceedubs

Bug fixes:

  • Fix implication from SPARK-20346 by @imarios

v0.6.1

6 years ago

Same as v0.5.2 supporting Spark 2.3.0

v0.5.2

6 years ago

Bug fixes:

  • Fix nested collection encoding (by @lesbroot)
  • Fixed record encoding to be compatible with vanilla Spark (by @kmate)
  • Enabled Java assertions (by @snadorp)

New additions:

  • Added support for .as[A] when A is a nested type (by @sullivan-)

v0.6.0

6 years ago

Same as v0.5.1 but with support for Spark 2.3.0 (by @kmate)

v0.5.1

6 years ago

Bug fixes and enhancements:

  • Fixed null handling for nested classes (by @kmate)
  • Better handling of case classes with fields of type Unit (by @kmate)
  • TypedDatasets can now be created from Dataframes with different column ordering (by @mfelsche)
  • Implicit derivation for Orderable types (by @martin1keogh)
  • Generated bytecode classes now correctly target JVM 1.8 (by @imarios)

New method additions:

  • cube and rollup aggregation operators (by @avasil)
  • size for Map and isin for values (by @ayoub-benali)
  • Trigonometric methods: cos, cosh, sin, sinh, tan, tanh (by @avasil)
  • between method for orderable values (by @crossy147)
  • substr (by @bhop)

v0.5.0

6 years ago

Notable additions/changes:

  • Improved joins (theta join conditions are now supported)
  • Extended functionality for frameless-ml (details follow)
  • Unifying projected and aggregated columns
  • Injection for ordered columns
  • Fixed multiplication for BigDecimal
  • Adding a lot of missing operators, such as sort(), union(), drop(), when/otherwise (details follow)
  • More documentation examples

Frameless-ml

  • TypedTransformer and TypedEstimator
  • TypedRandomForestRegressor, TypedRandomForestClassifier
  • TypedIndexToString, TypedStringIndexer, TypedVectorAssembler

Encoders:

  • java.math.BigDecimal

Upgrades:

  • Spark to 2.2.1
  • Scala to 2.11.12

Operators:

  • union()
  • sort()
  • asCol to project entire dataset into a single column
  • drop(), dropTupled()
  • getOrElse()
  • when()/otherwise()
  • withColumn(), withColumnReplaced()
  • head(n), headOption
  • litAggr literal for aggregated columns

Column methods:

  • abs, acos, add_months
  • bin, bitwiseNot
  • arrayContains
  • inputFileName, monotonicallyIncreasingId
  • ascii, asin, atan, atan2, base64

v0.4.1

6 years ago

Identical to v0.4.0, but updated to Cats 1.0.1 stable.

  • cats 1.0.1
  • cats-effect 0.8
  • cats-mtl 0.2.2

v0.4.0

6 years ago

Notable additions/changes:

  • support for Spark 2.2.0
  • added Encoders for: UDT, Array, Map
  • added explode() on TypedColumns with types Vector/Array
  • added bitwise and/or/xor operators on TypedColumns
  • added withColumn() operator on TypedDataset
  • added pivot() aggregation
  • added statistical methods: corr(), skewness(), kurtosis(), covar_samp()
  • migrated from SparkContext to SparkSessions throughout
  • created the frameless-ml project
  • parameterize Spark actions over the effect type used (a much more powerful Job[_])
  • [Internal] Improved test template
  • [Internal] Moved methods that are not optimizable by Catalyst to a new package (map(), flatMap(), etc.)
  • [bug] Fixed bug in computing equality of nullable types (Options)
  • [bug] Fixed BigDecimal division incorrectly returning Double

v0.3.0

7 years ago

Notable additions/changes:

  • UDFs now support columns with custom encoders (using Injection)
  • map and flatMap on Job[A]
  • more aggregation functions: countDistinct, approxCountDistinct, collectList, collectSet, sumDistinct
  • support for cats v0.9
  • createUnsafe to instantiate a TypedDataset from a Spark DataFrame
  • whole dataset aggregation functions moved from select to an explicit agg on TypedDataset
  • bug fixes on joins, UDFs, and vector encoders
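
As a rough illustration of two of the v0.3.0 items above (createUnsafe, and map/flatMap on Job), the sketch below wraps a plain Spark DataFrame in a TypedDataset and composes two actions in a for-comprehension. The Person case class, the sample rows, and the local SparkSession settings are assumptions made up for this example. It is written against the API shape of later Frameless releases, so exact signatures in v0.3.0 itself may differ.

```scala
import org.apache.spark.sql.SparkSession
import frameless.TypedDataset

// Hypothetical record type, used only for this illustration.
case class Person(name: String, age: Int)

object FramelessSketch {
  def main(args: Array[String]): Unit = {
    // Assumed local session; Frameless picks it up implicitly.
    implicit val spark: SparkSession =
      SparkSession.builder()
        .master("local[*]")
        .appName("frameless-sketch")
        .getOrCreate()
    import spark.implicits._

    // An ordinary, untyped Spark DataFrame.
    val df = Seq(("Ada", 36), ("Grace", 45)).toDF("name", "age")

    // createUnsafe trusts that the DataFrame matches Person;
    // a mismatch is only detected at runtime, not at compile time.
    val people: TypedDataset[Person] = TypedDataset.createUnsafe[Person](df)

    // Actions such as count() and take() return a Job by default.
    // The for-comprehension relies on Job's map and flatMap,
    // and nothing is executed until run() is called.
    val job = for {
      n    <- people.count()
      rows <- people.take(n.toInt)
    } yield rows

    job.run().foreach(println)

    spark.stop()
  }
}
```

Describing the work as a Job first and calling run() last keeps the composition of Spark actions separate from their execution, which is the point of having map and flatMap on Job[A].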