Spark Acid

ACID Data Source for Apache Spark based on Hive ACID

v0.6.0

3 years ago

Features/Improvements

  • Add MERGE command support. (issue-26)
  • Performance improvements in the ACID writer. (issue-88)
  • Parallelize split computation for ACID tables. (issue-78)
  • Add Partition Pruner as a fallback when HMS pruning fails. (issue-21)
  • Don't allocate writeIds for read transactions.
  • Reduce the default wait time for lock acquisition to around 5 minutes and make it configurable. (issue-55)
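
The bounded, configurable lock wait from the last item can be illustrated with a minimal Python sketch. It is not Spark Acid's implementation: `acquire_with_timeout`, `try_acquire`, `LockTimeoutError`, and the parameter names are hypothetical, and the 300-second default only mirrors the "around 5 minutes" stated above.

```python
import time
from typing import Callable


class LockTimeoutError(Exception):
    """Raised when the lock is not obtained within the wait budget (hypothetical)."""


def acquire_with_timeout(
    try_acquire: Callable[[], bool],   # hypothetical: returns True once the lock is held
    timeout_s: float = 300.0,          # assumed default, about 5 minutes
    poll_interval_s: float = 1.0,      # assumed pause between attempts
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Poll try_acquire until it succeeds or timeout_s elapses.

    Returns the number of attempts made; raises LockTimeoutError on timeout.
    """
    deadline = clock() + timeout_s
    attempts = 0
    while True:
        attempts += 1
        if try_acquire():
            return attempts
        remaining = deadline - clock()
        if remaining <= 0:
            raise LockTimeoutError(f"lock not acquired within {timeout_s} s")
        # Never sleep past the deadline.
        sleep(min(poll_interval_s, remaining))
```

Passing `clock` and `sleep` as parameters is a design choice for this sketch only: it lets the wait budget be exercised with a fake clock instead of real delays.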

Notable Bug Fixes:

  • Fix rowId issue when projections or aggregations are pushed down. (issue-46)
  • Fix issues related to Dynamic Partitions. (issue-44, issue-76)
  • Fix the repartitioning logic to handle statement IDs. (issue-83)
  • Fix bucket ID bug in UPDATE/DELETE operations on ACID tables. (issue-92)
  • Fix the Heartbeat Runner. (issue-54)
  • Disallow updates on partition columns. (issue-59)
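
The partition-column restriction in the last item amounts to a check made before an update runs. The sketch below shows the idea in plain Python. It is illustrative only: `check_update_columns` and its arguments are hypothetical names, not part of Spark Acid, and it assumes column names compare case-insensitively, as they do in Hive.

```python
from typing import Iterable, Mapping


def check_update_columns(
    assignments: Mapping[str, object],   # hypothetical: SET column -> new value
    partition_columns: Iterable[str],    # hypothetical: the table's partition columns
) -> None:
    """Raise ValueError if any SET target is a partition column."""
    partition = {c.lower() for c in partition_columns}
    offending = sorted(c for c in assignments if c.lower() in partition)
    if offending:
        raise ValueError(
            "cannot update partition column(s): " + ", ".join(offending)
        )
```

For example, `check_update_columns({"price": 10}, ["ds"])` passes silently, while a SET on `ds` raises `ValueError`.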

v0.5.0

4 years ago

Major Changes:

  • Support for INSERT into Full ACID and Insert-Only tables in ORC.
  • Support for UPDATE and DELETE on Full ACID tables in ORC, including SQL support.
  • Support for Spark Native Reader for ACID tables in ORC.
  • Support for Structured Streaming writes into ACID tables, using ACID tables as a streaming sink.

Notable Improvements:

  • Check transaction validity after the transaction acquires its lock, so that it works on a consistent state of the table.
  • Support for dynamically setting the version for Hive and Spark.
  • Fix reading ACID tables with row ids.
  • Support for writing Date and Timestamp columns.

Credit: Rajkumar Iyer, Amogh Margoor, Abhishek Dixit, Megha Thakkar, Herbert Liao, Mahesh Kumar Behera