An immutable database for application development and time-travel data compliance, with SQL and XTQL. Developed by @juxt
XTDB 1.24.3 is now available!
This is a bugfix release for the jdbc and kafka modules.
;; lein
[com.xtdb/xtdb-core "1.24.3"]
;; deps.edn
com.xtdb/xtdb-core {:mvn/version "1.24.3"}
Thanks to all who raised issues, contributed, or otherwise assisted with this release.
Cheers, XT Team
XTDB 1.24.1 is now available!
This minor release fixes a few outstanding bugs and introduces some debug logging in the Lucene module to help catch other bugs. 🐞
;; lein
[com.xtdb/xtdb-core "1.24.1"]
;; deps.edn
com.xtdb/xtdb-core {:mvn/version "1.24.1"}
Thanks to all who raised issues, contributed, or otherwise assisted with this release.
Cheers, XT Team
XTDB 1.24.0 is now available! This release has a particular focus on the use of the checkpointer module and checkpoint stores within XTDB. In particular:
;; lein
[com.xtdb/xtdb-core "1.24.0"]
;; deps.edn
com.xtdb/xtdb-core {:mvn/version "1.24.0"}
The XTDB checkpointer works on a schedule, creating new checkpoints based on the configured approx-frequency. Prior to 1.24.0, the checkpointer would always make a new checkpoint so long as at least approx-frequency time had passed, regardless of whether anything had changed - i.e. even if the node had not processed any new transactions since the last checkpoint, it would still create one. As such, there could be multiple checkpoints made for the same transaction.
As of 1.24.0, when the checkpointer goes to make a checkpoint, it will first check for changes. This check is based on the transaction-id of the last transaction processed by the index store, and ensures that every transaction only ever has a single checkpoint, so we don't unnecessarily make a number of identical checkpoints.
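As a rough sketch of that decision (hypothetical names, not XTDB's internals):

;; Sketch only: checkpoint when the index store has processed a transaction
;; newer than the one captured by the latest checkpoint.
(defn should-checkpoint? [latest-checkpoint-tx-id latest-completed-tx-id]
  (boolean
   (and latest-completed-tx-id                 ;; nothing indexed yet -> no checkpoint
        (or (nil? latest-checkpoint-tx-id)     ;; first ever checkpoint
            (< latest-checkpoint-tx-id latest-completed-tx-id)))))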
WARNING: For users with external lifecycle policies set against their checkpoints: if no new transactions are handled for a period, no new checkpoints will be made, and you may risk having no checkpoints (and having to replay from the Transaction Log) if your lifecycle policies are entirely time-based!
To help supplement the above functionality, we've implemented a configurable retention-policy within the checkpointer module, such that the checkpointer can deal with the lifecycle of checkpoints for you.
As of 1.24.0, we can now provide a new parameter, retention-policy, when configuring the checkpointer module. Doing so will make the checkpointer handle the deletion & retention behaviour of old checkpoints.
The config for this will look like the following:
{
...
:checkpointer {:xtdb/module xtdb.checkpoint/->checkpointer
:store {...}
:approx-frequency "PT6H"
:retention-policy {:retain-newer-than "PT7D"
:retain-at-least 5}}
...
}
When passing in retention-policy, we need to provide at least one of retain-newer-than and retain-at-least - though you can provide both. What follows is the behaviour of the checkpointer after it has completed making a new checkpoint (see the sketch after this list):
- If only retain-at-least is provided, we will take the list of available checkpoints, keep the latest retain-at-least checkpoints, and delete the rest.
- If only retain-newer-than is provided, we will keep all checkpoints newer than the configured Duration and delete all checkpoints older than it.
- If both retain-at-least and retain-newer-than are provided:
  - The latest retain-at-least checkpoints will be considered "safe", and will always be kept.
  - Of the rest, we keep those newer than retain-newer-than and delete all of the remaining checkpoints that are older.
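To make those rules concrete, here is a minimal sketch of the retention decision - a hypothetical helper, not XTDB's implementation - assuming each checkpoint is a map carrying a :created-at java.time.Instant and the list is sorted newest-first:

(import '(java.time Instant Duration))

(defn checkpoints-to-delete
  "Sketch only: returns the checkpoints the retention-policy would delete."
  [checkpoints {:keys [retain-at-least retain-newer-than]} ^Instant now]
  ;; the latest `retain-at-least` checkpoints are always considered "safe"
  (let [remaining (drop (or retain-at-least 0) checkpoints)]
    (if retain-newer-than
      ;; keep the remainder newer than the cutoff, delete the older ones
      (let [cutoff (.minus now ^Duration retain-newer-than)]
        (filter #(.isBefore ^Instant (:created-at %) cutoff) remaining))
      ;; only retain-at-least supplied: delete everything past the newest N
      remaining)))

;; e.g. (checkpoints-to-delete cps {:retain-newer-than (Duration/parse "PT7D")
;;                                  :retain-at-least 5}
;;                             (Instant/now))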
NOTE: When using this, ensure that the checkpoint store you are using has the permissions needed to delete checkpoints. This is similarly necessary if it ever has to clean up a failed checkpoint!
NOTE: We would recommend disabling any external lifecycle policies on the objects if you decide to use the above. We would also recommend considering the impact of checking for changes, and always aiming to retain-at-least a single checkpoint at a minimum.
Within 1.24.0, we've made a number of changes to the Azure Blobs module. These changes are breaking for those currently using it as their document store, in that they change how the module is configured and how it is authenticated. Main changes to mention: sas-token - this is no longer available, so you will need to authenticate with the above. For examples of configuration for both the document store and checkpoint store, see the documentation for the module.
Similarly to the above, within 1.24.0 we have made a number of changes to the implementation of the Google Cloud Storage module. Previously, both the document store and checkpoint store were implemented in terms of the Java NIO filesystem, and were configured using paths to the bucket. Within this release these are now implemented in terms of the Google Cloud Storage Java API, and their configuration options have changed. They are both authenticated via Google's "Application Default Credentials" - see the https://github.com/googleapis/google-auth-library-java/blob/main/README.md#application-default-credentials[relevant documentation] to get set up.
For more information on how to configure and use the module, see the documentation.
Bug fixes:
- Fixed the cleanup-checkpoint behaviour of the filesystem-checkpoint-store with remote filesystems.
- Fixed the save-checkpoint function to throw FileAlreadyExists.

XTDB 1.23.3 is now available!
Alongside bug fixes, this release contains a schema change to widen the event_offset id column used for JDBC tx logs and document stores (see the migration advice below for more details).
;; lein
[com.xtdb/xtdb-core "1.23.3"]
;; deps.edn
com.xtdb/xtdb-core {:mvn/version "1.23.3"}
Changes:
- event_offset column schema widened to 64 bits (see the migration advice below)
- … or branch analysis
- … or branch analysis
- … await-tx is called and ensures later queries are as-of at least that transaction time. This solves a problem in load-balanced environments where queries are not guaranteed to reach the same XTDB nodes as await-tx calls.

Thanks to all who raised issues, contributed, or otherwise assisted with this release.
Cheers, XT Team
A primary key migration may be necessary for JDBC document stores and transaction logs created before XTDB 1.23.3. In these cases, the event_offset column is limited to 32 bits, using INT for MySQL and serial (mapped to integer) for Postgres.
If databases insert enough transactions or documents to exceed the maximum value of 2147483647 for this column, XTDB nodes will no longer accept writes.
New deployments using XTDB versions since 1.23.3 utilize 64 bits for the event_offset column, increasing the maximum number of transactions and documents to 9223372036854775807.
Below, you'll find migration guidance for users who want to opt existing installations into the increased maximum.
For MySQL or MariaDB installations, the following DDL statement can be used to migrate your transaction log or document store to use 64-bit offsets.
Important This statement should be executed during a maintenance cycle and not on a running system. Please refer to the note below for more information.
ALTER TABLE tx_events
MODIFY COLUMN event_offset BIGINT AUTO_INCREMENT NOT NULL, ALGORITHM=COPY;
Note: Due to the need to use the COPY algorithm, running this statement on a busy system is not recommended as it may cause a service outage. The tx_events table will be copied, blocking both readers and writers. For more information, refer to the MySQL online DDL documentation.
For Postgres installations, the following DDL transaction can be used to migrate your transaction log or document store to use 64-bit offsets.
Important: This transaction should be executed during a maintenance cycle and not on a running system. Please refer to the note below for more information.
BEGIN;
LOCK TABLE tx_events NOWAIT;
ALTER TABLE tx_events ALTER COLUMN event_offset SET DATA TYPE BIGINT;
ALTER SEQUENCE tx_events_event_offset_seq AS BIGINT MAXVALUE 9223372036854775807;
END;
Note: Since a full table lock is required, running this transaction on a busy system is not recommended. It is unlikely that the lock will be obtained (the lock statement will fail early due to NOWAIT). If the lock is obtained, both readers and writers will be blocked while the table is rebuilt.
If you are using the JDBC module with databases other than Postgres or MySQL/MariaDB, please get in touch to discuss your migration options with us first and we can produce additional guidance for you: [email protected]
XTDB 1.23.2 is now available!
This release restores support for Clojure 1.10.x, alongside bug fixes to transaction processing and S3 Transfer Manager checkpointing.
;; lein
[com.xtdb/xtdb-core "1.23.2"]
;; deps.edn
com.xtdb/xtdb-core {:mvn/version "1.23.2"}
Bug fixes:
- OutOfMemoryError crashes during transaction processing not halting the ingester correctly if spec asserts are enabled.
- ClassCastException thrown instead of the correct IllegalArgumentException for non-empty checkpoint directories (S3 Transfer Manager).
- :prefix configuration option not applying to S3 Transfer Manager downloads.
- A usage of :as-alias that caused a dependency on Clojure 1.11.x.

Thanks to all who raised issues, contributed, or otherwise assisted with this release.
Cheers, XT Team
XTDB 1.23.1 is now available!
In addition to bug fixes, this release provides a new way to manage checkpoints in S3 with the AWS S3 Transfer Manager.
Changes have also been made to the ingester's error and recovery policy to improve reliability in the face of document storage failures.
;; lein
[com.xtdb/xtdb-core "1.23.1"]
;; deps.edn
com.xtdb/xtdb-core {:mvn/version "1.23.1"}
Using the Transfer Manager should increase reliability and transfer speed when XTDB copies checkpoints to and from S3.
See the AWS S3 Storage documentation for more information on how to configure and use this option.
Thanks @jsulmont!
1.23.1 will now retry document storage fetches that fail due to an unexpected exception, rather than halting ingestion. You should expect your XTDB nodes to require fewer restarts due to network faults, busy storage, or outages.
On encountering a fetch exception during transaction processing, XTDB will enter an exponential backoff retry loop, bounded at a one-minute backoff. If the node is closed, the loop will be interrupted and the transaction will be retried when the node is started.
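For intuition, the delay between retries grows roughly like this (an illustrative sketch only - the exact schedule is internal to XTDB):

;; Illustrative only: exponential backoff capped at one minute.
(defn backoff-ms [attempt]
  (min 60000 (* 100 (long (Math/pow 2 (min attempt 16))))))

;; (map backoff-ms (range 12))
;; => (100 200 400 800 1600 3200 6400 12800 25600 51200 60000 60000)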
This behaviour extends to transaction functions that use the document store in query, such as via pull or entity calls.
It is possible for the document store to provide a :cognitect.anomalies/category key in the ex-data of thrown exceptions, to categorize exceptions that should instead halt the ingester (for example, if recovery is not possible).
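For example, a document store implementation might signal an unrecoverable fetch like this (a sketch - the category choice is an assumption, and which categories halt the ingester is determined by XTDB, so check the docs):

;; Sketch: attach an anomaly category to a fetch exception so XTDB can
;; decide to halt rather than retry.
(defn fetch-doc [doc-store content-hash]
  (or (get doc-store content-hash)
      (throw (ex-info "document unrecoverably missing from the store"
                      {:cognitect.anomalies/category :cognitect.anomalies/not-found
                       :content-hash content-hash}))))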
Bug fixes and improvements:
- Added :ingester-failed? to the map returned by xt/status, so halting is observable by programs. Details can still be found in the logs.
- … .close the XTDB node while the transaction function is blocking on IO.
- … :port spec. @jsulmont
- … :xt/id is a java.net.URI. @jsulmont
- Agent/pooledExecutor no longer used for stat maintenance; JVM exit is no longer blocked if (shutdown-agents) is not used.
- Operations after .close that use a native (LMDB/RocksDB) kv-store should now throw rather than cause a segfault.

Thanks to all who raised issues, contributed, or otherwise assisted with this release.
Cheers, XT Team
XTDB 1.23.0 is now available!
This release provides substantial write throughput improvements for many workloads, with a particular focus on RocksDB index storage.
There is a minor breaking change to an internal protocol that may affect TxLog module authors, hence 1.23.0 instead of 1.22.2 (see new kafka poll timeout option in the notes below).
;; lein
[com.xtdb/xtdb-core "1.23.0"]
;; deps.edn
com.xtdb/xtdb-core {:mvn/version "1.23.0"}
We've had a good look at indexing performance over OLTP and analytic workloads, with a particular focus on RocksDB. RocksDB users can expect anywhere between 15% and 200% additional write throughput depending on their workload and hardware configuration. On average, we expect most users to see around a 100% increase if they are using RocksDB on multi-core machines with underutilized CPU cores.
[Chart: Auctionmark OLTP benchmark on an m5.2xlarge]
[Chart: TPC-H SF 1.0 indexing trace on an m5.2xlarge]
Please share your results with the team on Slack or any of our other channels!
A new option is available on RocksDB KV stores to enable prefix filtering of bitemporal seeks, which can eliminate redundant IO when writing an entity for the first time. To enable this feature, use the :enable-filters? configuration option on your index-store :kv-store.
{:xtdb/index-store {:kv-store {:xtdb/module 'xtdb.rocksdb/->kv-store, :enable-filters? true, ...}}, ...}
Prefix filters will require new SST files to be created during indexing, so new deployments will see an improvement straight away; existing deployments will not see immediate improvements.
Enabling filtering increases memory pressure on the block cache. If you use this feature, we recommend a block cache size of at least 256MB. See docs for guidance on configuring the cache.
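As a sketch of what that might look like together (config shape assumed from the RocksDB module docs - check them for the exact keys):

;; Assumed shape: a shared 256MB LRU block cache alongside prefix filters
;; on the index-store's RocksDB kv-store.
{:xtdb.rocksdb/block-cache {:xtdb/module 'xtdb.rocksdb/->lru-block-cache
                            :cache-size (* 256 1024 1024)}
 :xtdb/index-store {:kv-store {:xtdb/module 'xtdb.rocksdb/->kv-store
                               :block-cache :xtdb.rocksdb/block-cache
                               :enable-filters? true}}}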
Kafka TxLog users can now supply an additional option :kafka/poll-wait-duration to override the default 1 second timeout when polling for kafka messages.
See docs for more information.
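A sketch of how the option might be supplied (placement assumed - see the module docs for the exact shape; Kafka connection config elided):

;; Assumed placement of the new option on the Kafka tx-log module config:
{:xtdb/tx-log {:xtdb/module 'xtdb.kafka/->tx-log
               :kafka/poll-wait-duration "PT2S"}}   ;; override the 1s default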
Thanks @jsulmont!
This change introduces a required options map parameter to (open-tx-log) on the TxLog protocol. This should not break user programs unless they are using a custom TxLog module, in which case the TxLog implementation will need to be extended to the new arity.
Bug fixes:
- A not rule no longer throws a NullPointerException if it contains unknown variables.

Thanks to all who raised issues, contributed, or otherwise assisted with this release. Particular thanks to @tatut for the fantastic assistance in debugging a difficult-to-pin-down statistics issue!
Cheers, XT Team
1.22.1 is now available. This is a non-breaking release; it contains new valid-time predicates and a few bug fixes.
;; lein
[com.xtdb/xtdb-core "1.22.1"]
;; deps.edn
com.xtdb/xtdb-core {:mvn/version "1.22.1"}
#1811 Adds the get-start-valid-time and get-end-valid-time query predicates.
These are useful to work with the start/end valid time values of an entity during query.
Here is an example illustrating how you might use these predicates to ask new kinds of temporal questions with datalog:
(with-open [node (xt/start-node {})]
;; bob likes pizza on the 27th, claire likes cake.
(->> [[::xt/put {:xt/id "bob", :favorite "pizza"} #inst "2022-09-27"]
[::xt/put {:xt/id "claire", :favorite "cake"} #inst "2022-09-27"]]
(xt/submit-tx node))
;; bob will change his mind about his favorite food on the 28th.
(->> [[::xt/put {:xt/id "bob", :favorite "pasta"} #inst "2022-09-28"]]
(xt/submit-tx node))
(xt/sync node)
;; on the 27th, what was everyone's favorite food
;; and when will they change their mind?
(xt/q
(xt/db node {::xt/valid-time #inst "2022-09-27"})
'{:find [?e, ?food, ?until]
:where [[?e :favorite ?food]
[(get-end-valid-time ?e) ?until]]}))
thanks @FiV0!
Cheers, XT Team
drum roll ... 1.22.0 is out!
1.22.0 brings various major upstream dependency upgrades and sweeping improvements to ingestion performance, and unlocks another level of eagerly-awaited native experience for our users running on M1 processors (where previously LMDB and Kafka were missing support). The release also contains a number of important bugfixes.
;; project.clj, e.g.
[com.xtdb/xtdb-core "1.22.0"]
;; deps.edn, e.g.
{com.xtdb/xtdb-core {:mvn/version "1.22.0"}}
The upgrades to both RocksDB and LMDB require an 'index version bump'. This means that you'll need to clear your XTDB query indices, and re-index from the transaction log as per the docs. The new index version is also 22 (a happy coincidence!).
In green/blue production settings, we recommend doing this by starting a new cluster of XTDB nodes, waiting for them to catch up with the tx-log, switching over, and decommissioning the old nodes. If you've got any questions/concerns about this, please do get in touch via [email protected] - we're happy to help out.
For those who are interested in the main technical changes here, we have:
- … the WriteBatchWithIndex API, in the cases where RocksDB is used for the index-store #1762
- … seek overhead because there is less scanning over temporarily unsorted data #1773
In our own testing and evaluation this work has improved bulk ingestion by as much as 40% (e.g. a 10 hour import job is now 6 hours). The benefit is also experienced during re-indexing (e.g. when upgrading from 1.21.0 to 1.22.0). These changes also open up several new avenues for further ingestion performance work in future.
@wotbrew joined the development team earlier in the year and has successfully conquered the remaining M1 issues. Maintaining end-to-end support will remain a priority for us looking ahead.
As ever, a big thanks to everyone contributing to this release by raising/fixing issues, and helping us with repros!
Cheers,
XT Team
/cc 1.22.0 Contributors: @sw1nn @nivekuil @theronic @underdarknl @Lisser @FiV0 @jonpither @bowbahdoe @ferdinand-beyer @hukka @alexisvincent @zirota @jsulmont @kevinmershon @refset @jarohen (...not exhaustive :slightly_smiling_face:)
Here's 1.21.0!
1.21.0 introduces 'projection/aggregate expressions', and also contains a number of performance improvements and bugfixes.
;; project.clj, e.g.
[com.xtdb/xtdb-core "1.21.0"]
;; deps.edn, e.g.
{com.xtdb/xtdb-core {:mvn/version "1.21.0"}}
A couple of the fixes below require an 'index version bump'. This means that you'll need to clear your XTDB query indices, and re-index from the transaction log as per the docs. The new index version is 20.
In green/blue production settings, we recommend doing this by starting a new cluster of XTDB nodes, waiting for them to catch up with the tx-log, switching over, and decommissioning the old nodes. If you've got any questions/concerns about this, please do get in touch via [email protected] - we're happy to help out.
We have added support for simple Clojure expressions in :find projections and aggregations. You should be able to remove :where clauses that were only included to generate intermediate values, and move these to the :find clause instead.
For example, the change to TPC-H Q1 looks like this:
;; before
'{:find [l_returnflag
l_linestatus
(sum ret_2)
(sum ret_4)
...]
:where [[l :l_shipdate l_shipdate]
[(<= l_shipdate #inst "1998-09-02")]
[l :l_extendedprice l_extendedprice]
[l :l_discount l_discount]
[l :l_tax l_tax]
[l :l_returnflag l_returnflag]
[l :l_linestatus l_linestatus]
[(- 1 l_discount) ret_1]
[(* l_extendedprice ret_1) ret_2]
[(+ 1 l_tax) ret_3]
[(* ret_2 ret_3) ret_4]
...]
...}
;; after
'{:find [l_returnflag
l_linestatus
(sum (* l_extendedprice (- 1 l_discount)))
(sum (* (* l_extendedprice (- 1 l_discount))
(+ 1 l_tax)))
...]
:where [[l :l_shipdate l_shipdate]
[l :l_extendedprice l_extendedprice]
[l :l_discount l_discount]
[l :l_tax l_tax]
[l :l_returnflag l_returnflag]
[l :l_linestatus l_linestatus]
[(<= l_shipdate #inst "1998-09-02")]
...]
...}
This comes with an accompanying performance improvement, due to simplifying the main phase of query engine execution.
We've been spending some more time with our heads in a profiler - you should see some query performance benefits, particularly if you make heavy use of or clauses and sub-queries.
- … get-attr seems to save ~5% on TPC-H Q1, which has a handful of :where triple clauses internally converted to get-attr calls.
- … or clauses.

Long requested, now available!
submit-tx and friends now take an options map, for now only accepting a ::xt/tx-time key to override the tx-time of the transaction.
This means you can import transaction times from external sources (for example, when bulk loading existing bitemporal data into XT). The transaction time mustn't be earlier than any other transaction currently in XTDB (i.e. make sure you import the transactions in order) and it mustn't be later than the current time on the transaction log (e.g. Kafka).
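For instance (a sketch of the options map described above - the document and times are illustrative):

;; Import a transaction with a historical tx-time; transactions must be
;; submitted in order, and no later than the tx-log's current time.
(xt/submit-tx node
              [[::xt/put {:xt/id :user-1, :name "Ada"}]]
              {::xt/tx-time #inst "2020-01-01T00:00:00.000-00:00"})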
RocksJava has taken a while to add support for M1 chips, but the latest builds have finally landed in Maven :tada:
Bug fixes:
- … SERIAL / MySQL auto_increment.
- … or / or-join clauses.
- … :timeout.
- mvn-uberjar initialises correctly.
- … match, then subsequently put.
- … pull-many remote API client implementation.
- pull-many returns results in the same order and cardinality as the input entity ids.

As always, a big thanks to everyone contributing to this release by raising/fixing issues, and helping us with repros!
Cheers,
XT Team