Canton Versions

Global Workflow Composition that is Scalable, Secure, and GDPR-compliant

v2.8.5

3 weeks ago

Release of Canton 2.8.5

Canton 2.8.5 has been released on April 26, 2024. You can download the Daml Open Source edition from the Daml Connect Github Release Section. The Enterprise edition is available on Artifactory. Please also consult the full documentation of this release.

Summary

This is a maintenance release, containing a bugfix. Users are advised to upgrade during their maintenance window.

Bugfixes

(24-008, Major): Deadlock in Topology Dispatcher

Issue Description

When a topology change is sent to the sequencer and there is a problem with the transmission or sequencing, the topology dispatcher waits forever to receive back the corresponding topology transaction. Later topology changes queue behind this outstanding topology change until the node is restarted.

Affected Deployments

Participant, Domain and Domain Topology Manager nodes

Affected Versions

All 2.3-2.7, 2.8.0-2.8.4

Impact

The node cannot issue topology changes anymore.

Symptom

Varies with the affected topology transaction: Party allocations time out, uploaded packages cannot be used in transactions (NO_DOMAIN_FOR_SUBMISSION), cryptographic keys cannot be changed, etc.

The failure of sequencing can be seen in DEBUG logging on the node:

DEBUG c.d.c.t.StoreBasedDomainOutbox:... tid:1da8f7fff488dad2fc4c9c0177633a7e - Attempting to push .. topology transactions to Domain ...
DEBUG c.d.c.s.c.t.GrpcSequencerClientTransport:... tid:1da8f7fff488dad2fc4c9c0177633a7e - Sending request send-async-versioned/f77dd135-9c6a-4bd8-a6ed-a9a9f3ef43ca to sequencer.
DEBUG c.d.c.s.c.SendTracker:... tid:1da8f7fff488dad2fc4c9c0177633a7e - Sequencer send [f77dd135-9c6a-4bd8-a6ed-a9a9f3ef43ca] has timed out at ...

where the last log line by default comes 5 minutes after the first two.

Workaround

Restart the node.

Likeliness

Occurs for unstable network conditions between the node and the sequencer (e.g., frequent termination of a subset of the connections by firewalls) and when the sequencer silently drops submission requests.

Recommendation

Upgrade to 2.8.5

Compatibility

The following Canton protocol versions are supported:

Dependency Version
Canton protocol versions 3, 4, 5

Canton has been tested against the following versions of its dependencies:

Dependency Version
Java Runtime OpenJDK 64-Bit Server VM Zulu11.70+15-CA (build 11.0.22+7-LTS, mixed mode)
Postgres Recommended: PostgreSQL 12.18 (Debian 12.18-1.pgdg120+2) – Also tested: PostgreSQL 11.16 (Debian 11.16-1.pgdg90+1), PostgreSQL 13.14 (Debian 13.14-1.pgdg120+2), PostgreSQL 14.11 (Debian 14.11-1.pgdg120+2), PostgreSQL 15.6 (Debian 15.6-1.pgdg120+2)
Oracle 19.20.0

v2.8.4

1 month ago

Release of Canton 2.8.4

Canton 2.8.4 has been released on April 16, 2024. You can download the Daml Open Source edition from the Daml Connect Github Release Section. The Enterprise edition is available on Artifactory. Please also consult the full documentation of this release.

Summary

This is a maintenance release containing small improvements and bug fixes.

What’s New

Node's Exit on Fatal Failures

When a node encounters a fatal failure that Canton unexpectedly cannot handle gracefully, the new default behavior is that the node exits/stops the process and relies on an external process or service monitor to restart it. Without this exit, the node would remain in a half-broken state, requiring a manual restart.

Failures that are currently logged as an error but not automatically recovered from are:

  1. Unhandled exceptions when processing events from a domain, which currently leads to a disconnect from that domain.
  2. Failed transition from an active replica to a passive replica, which may result in an invalid state of the node.

The new behavior can be reverted by setting canton.parameters.exit-on-fatal-failures = false in the configuration, but we encourage users to adopt the new behavior.
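
Expressed as a configuration-file entry, this mirrors the setting quoted above:

    canton.parameters.exit-on-fatal-failures = false // revert to the pre-2.8.4 behavior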

Minor Improvements

Error code changes

  • When an access token expires and a Ledger API stream is terminated, an ABORTED(ACCESS_TOKEN_EXPIRED) error is returned.

Fixed error cause truncation

Previously, the causes of documented errors were truncated after 512 characters. This truncation is necessary when transporting errors through gRPC, as gRPC imposes size limits, but the limit was also applied to errors in logs, which caused relevant information to be lost. The 512-character limit is no longer applied to errors written to the logs.

Configuration Changes

Token Expiry Grace Period for Streams

When the token used in a Ledger API request to open a stream expires, the stream is terminated. This normally happens several minutes or hours after the stream was initiated. Users can now configure a grace period that keeps the stream open beyond the token expiry:

   canton.participants.participant1.parameters.ledger-api-server-parameters.token-expiry-grace-period-for-streams=600.seconds

The grace period can be any non-negative duration where both the value and the unit must be given, e.g. "600.seconds" or "10.minutes". When the parameter is omitted, the grace period defaults to zero. When the configured value is Inf, the stream is never terminated.
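
For example, a sketch that effectively disables token-expiry-based stream termination, assuming (as described above) that the literal value Inf is accepted:

   canton.participants.participant1.parameters.ledger-api-server-parameters.token-expiry-grace-period-for-streams = Inf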

(24-007, Major): Domain reconnect does not complete timely during participant failover

Issue Description

During participant failover, the newly active participant does not complete reconnecting to the domain and fails to declare itself as active in a timely manner. This can happen when the passive replica, which has become active, has been idle for a long time while the former active replica has processed many transactions.

Affected Deployments

Participant

Affected Versions

2.3-2.6, 2.7.0-2.7.7, 2.8.0-2.8.3

Impact

During participant failover the newly active participant does not become active and does not serve commands and transactions in a timely manner.

Symptom

"One of the last log entries before the participant only outputs storage health checks anymore is like:

INFO c.d.c.p.s.d.DbMultiDomainEventLog:participant=participant1 Fetch unpublished from domain::122345... up to Some(1234)

The participant reports itself as not active and not connected to any domains in its health status.

Workaround

Restart the participant node.

Likeliness

How quickly the failover, and in particular the reconnect to the domain, completes depends on how long the passive replica has been idle and on the number and size of transactions that the active replica has processed in the meantime.

Recommendation

Upgrade to 2.8.4

Compatibility

The following Canton protocol versions are supported:

Dependency Version
Canton protocol versions 3, 4, 5

Canton has been tested against the following versions of its dependencies:

Dependency Version
Java Runtime OpenJDK 64-Bit Server VM Zulu11.70+15-CA (build 11.0.22+7-LTS, mixed mode)
Postgres Recommended: PostgreSQL 12.18 (Debian 12.18-1.pgdg120+2) – Also tested: PostgreSQL 11.16 (Debian 11.16-1.pgdg90+1), PostgreSQL 13.14 (Debian 13.14-1.pgdg120+2), PostgreSQL 14.11 (Debian 14.11-1.pgdg120+2), PostgreSQL 15.6 (Debian 15.6-1.pgdg120+2)
Oracle 19.20.0

v2.3.19

1 month ago

Release of Canton 2.3.19

Canton 2.3.19 has been released on March 28, 2024. You can download the Daml Open Source edition from the Daml Connect Github Release Section. The Enterprise edition is available on Artifactory. Please also consult the full documentation of this release.

Summary

Canton 2.3.19 has been released as part of Daml 2.3.19 with no additional changes to Canton.

Compatibility

The following Canton protocol and Ethereum sequencer contract versions are supported:

Dependency Version
Canton protocol versions 2.0.0, 3.0.0
Ethereum contract versions 1.0.0, 1.0.1

Canton has been tested against the following versions of its dependencies:

Dependency Version
Java Runtime OpenJDK 64-Bit Server VM 18.9 (build 11.0.16+8, mixed mode, sharing)
Postgres postgres (PostgreSQL) 14.11 (Debian 14.11-1.pgdg120+2)
Oracle 19.15.0
Besu besu/v21.10.9/linux-x86_64/openjdk-java-11
Fabric 2.2.2

v2.7.9

1 month ago

Release of Canton 2.7.9

Canton 2.7.9 has been released on March 20, 2024. You can download the Daml Open Source edition from the Daml Connect Github Release Section. The Enterprise edition is available on Artifactory. Please also consult the full documentation of this release.

Summary

This is a maintenance release, providing a workaround for a participant crash recovery issue on protocol version 4.

Bugfixes

This release provides a workaround for a specific participant crash recovery issue under load on protocol version 4, triggered by bug 23-021 (fixed in protocol version 5). The workaround can be enabled for a participant through the following configuration, but should only be done when advised by Digital Asset: canton.participants.XXX.parameters.disable-duplicate-contract-check = yes.

Compatibility

The following Canton protocol versions are supported:

Dependency Version
Canton protocol versions 3, 4, 5

Canton has been tested against the following versions of its dependencies:

Dependency Version
Java Runtime OpenJDK 64-Bit Server VM Zulu11.70+15-CA (build 11.0.22+7-LTS, mixed mode)
Postgres postgres (PostgreSQL) 14.11 (Debian 14.11-1.pgdg120+2)
Oracle 19.18.0

v2.3.18

2 months ago

Release of Canton 2.3.18

Canton 2.3.18 has been released on March 15, 2024. You can download the Daml Open Source edition from the Daml Connect Github Release Section. The Enterprise edition is available on Artifactory. Please also consult the full documentation of this release.

Summary

Canton 2.3.18 has been released as part of Daml 2.3.18 with no additional changes to Canton.

Compatibility

The following Canton protocol and Ethereum sequencer contract versions are supported:

Dependency Version
Canton protocol versions 2.0.0, 3.0.0
Ethereum contract versions 1.0.0, 1.0.1

Canton has been tested against the following versions of its dependencies:

Dependency Version
Java Runtime OpenJDK 64-Bit Server VM 18.9 (build 11.0.16+8, mixed mode, sharing)
Postgres postgres (PostgreSQL) 14.11 (Debian 14.11-1.pgdg120+2)
Oracle 19.15.0
Besu besu/v21.10.9/linux-x86_64/openjdk-java-11
Fabric 2.2.2

v2.7.8

2 months ago

v2.8.3

2 months ago

Release of Canton 2.8.3

Canton 2.8.3 has been released on February 21, 2024. You can download the Daml Open Source edition from the Daml Connect Github Release Section. The Enterprise edition is available on Artifactory. Please also consult the full documentation of this release.

Summary

This is a maintenance release fixing a critical issue that can occur if overly large transactions are submitted to the participant. The bootstrap domain command has been slightly improved, but it may now throw an error if you try to bootstrap an already initialized domain while the domain manager node is still starting up.

Console changes

Console commands that download an ACS snapshot now take a new mandatory argument indicating whether the snapshot will be used in the context of party offboarding (party replication) or not. This allows Canton to perform additional checks and makes party offboarding safer.

Affected console commands:

  • participant.repair.export_acs
  • participant.repair.download (deprecated method)

New argument: partiesOffboarding: Boolean.

Bugfixes

(24-003, Moderate): Cannot exercise keyed contract after party replication

Issue Description

When replicating a party who is a maintainer on a contract key, the contract can no longer be exercised.

Affected Deployments

Participant

Affected Versions

  • 2.7
  • 2.8.0-2.8.1

Impact

Contracts become unusable.

Symptom

The following error is emitted when trying to exercise a choice on the contract:

java.lang.IllegalStateException:
Unknown keys are to be reassigned. Either the persisted ledger state corrupted
or this is a malformed transaction. Unknown keys:

Workaround

None.

Likeliness

Deterministic.

Recommendation

Do not use the party migration macros with contract keys in version 2.8.1. Upgrade to 2.8.2 if you want to use them.

(24-002, Critical): Looping DB errors for transactions with many (32k) key updates

Issue Description

If a transaction with a very large number of key updates is submitted to the participant, the SQL query to update the contract keys table will fail, leading to a database retry loop, stopping transaction processing. The issue is caused by a limit of 32k rows in a prepared statement in the JDBC Postgres driver.

Affected Deployments

Participant on Postgres

Affected Versions

2.3-2.6, 2.7.0-2.7.6 2.8.0-2.8.1

Impact

Transactions can be submitted to the participant, but the transaction stream stops emitting transactions and commands don't appear to succeed.

Symptom

A stuck participant, continuously logging about "Now retrying operation 'com.digitalasset.canton.participant.store.db.DbContractKeyJournal.addKeyStateUpdates'" and "Tried to send an out-of-range integer as a 2-byte value: 40225"

Workaround

Upgrade to a version containing the fix. Alternatively, such a transaction may be ignored, but this must be done with care to avoid a ledger fork in case of multiple involved participants. Contact support.

Likeliness

Deterministic with very large (32k key updates) transactions.

Recommendation

Upgrade at your convenience or if experiencing the error.

(24-004, Major): DB write operations fail due to read-only connections as part of a DB failover

Issue Description

In a Postgres HA setup with a write and a read replica using AWS RDS, Canton may connect, and remain connected, to the former write replica during a DB failover, which results in the DB connections being read-only. All write DB operations then fail and are retried indefinitely.

Affected Deployments

All nodes with Postgres HA. Only observed with AWS RDS so far, not on Azure.

Affected Versions

All 2.3-2.6, 2.7.0-2.7.6, 2.8.0-2.8.1

Impact

A node becomes unusable due to all DB write operations failing and being retried indefinitely.

Symptom

Exceptions of the following kind are logged: org.postgresql.util.PSQLException: ERROR: cannot execute UPDATE in a read-only transaction

Workaround

Restart the node to reconnect to the current write replica. Setting the JVM DNS cache TTL to zero (networkaddress.cache.ttl=0) reduces the likelihood of this problem, but disables DNS caching entirely.

Likeliness

Probable in case of DB failover, only observed in AWS RDS so far.

Recommendation

Upgrade to 2.8.2 if you are using AWS RDS.

(24-005, Major): Race condition during domain manager startup may prevent startup after upgrade

Issue Description

A race condition between the domain bootstrap macro and the domain manager node initialisation may result in the domain manager overriding the previously stored domain parameters. This can cause the domain manager to be unable to reconnect to the domain after upgrading to version 2.7 or later without a hard domain migration to pv=5, because with 2.7 the default protocol version changed to pv=5.

Affected Deployments

Domain Manager Nodes

Impact

The domain manager node is unable to connect to the sequencer. Topology transactions such as party additions have no effect on the domain, as they are not verified and forwarded by the domain manager.

Symptom

Parties added to the participant do not appear on the domain. Transactions referring to such parties are rejected with UNKNOWN_INFORMEE. The domain manager logs complain about "The unversioned subscribe endpoints must be used with protocol version 4"

Workaround

The issue does not occur if you do not run bootstrap_domain after the domain has already been initialised. If a node is affected, you can recover it explicitly by setting the protocol version to v=4 in the configuration and using the reset-stored-static-config parameter to correct the wrongly stored parameters.

Likeliness

This is a race condition that can happen if you run the bootstrap_domain command repeatedly on an already bootstrapped domain during startup of the domain manager node. As there is no need to run the bootstrap_domain command repeatedly, this issue can easily be avoided.

Recommendation

Do not run the bootstrap_domain command after the domain was initialised. A fix has been added to prevent future mistakes.

(24-006, Minor): Large error returned from ACS query cannot be stuffed into HTTP2 headers

Issue Description

A bad Ledger API client request results in an error that contains a large metadata resource list and fails serialization in the Netty layer. As a consequence, what is sent to the client is an INTERNAL error that does not correspond to the actual problem.

Affected Deployments

Participant Nodes

Impact

An incorrect error is returned to the ledger client, which may complicate troubleshooting.

Symptom

The ledger client observes the following error:

INTERNAL: RST_STREAM closed stream. HTTP/2 error code: PROTOCOL_ERROR

At the same time, the participant node reports the following error:

io.grpc.netty.NettyServerHandler - Stream Error
io.netty.handler.codec.http2.Http2Exception$HeaderListSizeException: Header size exceeded max allowed size (8192)

Workaround

Rely on the participant logs when investigating ledger client issues where the client logs contain the message

INTERNAL: RST_STREAM closed stream. HTTP/2 error code: PROTOCOL_ERROR

Likeliness

It is rather difficult for Ledger API clients to create bad requests that end up in errors with a long list of resources. One example is when a client requests a long list of non-existing templates in the filter of an ACS Ledger API request. There was an unrelated bug in the trigger service that could produce a long list of templates when the AllInDar construction was used in the trigger installation invocation.

Recommendation

Upgrade at your convenience.

Compatibility

The following Canton protocol versions are supported:

Dependency Version
Canton protocol versions 3, 4, 5

Canton has been tested against the following versions of its dependencies:

Dependency Version
Java Runtime OpenJDK 64-Bit Server VM Zulu11.66+15-CA (build 11.0.20+8-LTS, mixed mode)
Postgres Recommended: PostgreSQL 12.18 (Debian 12.18-1.pgdg120+2) – Also tested: PostgreSQL 11.16 (Debian 11.16-1.pgdg90+1), PostgreSQL 13.14 (Debian 13.14-1.pgdg120+2), PostgreSQL 14.11 (Debian 14.11-1.pgdg120+2), PostgreSQL 15.6 (Debian 15.6-1.pgdg120+2)
Oracle 19.20.0

v2.7.7

2 months ago

v2.8.1

3 months ago

Release of Canton 2.8.1

Canton 2.8.1 has been released on January 31, 2024. You can download the Daml Open Source edition from the Daml Connect Github Release Section. The Enterprise edition is available on Artifactory. Please also consult the full documentation of this release.

Summary

This is a maintenance release of Canton including reliability fixes and user experience improvements. Users of the Fabric driver are encouraged to upgrade at their earliest convenience.

What’s New

Minor Improvements

Improved reference configuration example

We have reworked the reference configuration example. The examples/03_advanced_configuration example has been replaced by a set of reference configuration files, which can be found in the config directory. While the previous example contained pieces that could be assembled into a full configuration, the new examples contain a full configuration that can be simplified by removing parts that are not needed. The installation guide (https://docs.daml.com/canton/usermanual/installation.html) has been updated accordingly.

Improved Party Replication Macros

The enterprise version now supports replicating a party from one participant node to another, either to migrate the party or to have multiple participants host the same party. Please consult the documentation (https://docs.daml.com/2.9.0/canton/usermanual/identity_management.html#replicate-party-to-another-participant-node) on how to use this feature.

Reduced Background Journal Cleaning Load

We have improved the background journal cleaning to produce less load on the database by using smaller transactions to clean up the journal.

Executor Service Metrics removed

The metrics for the execution services have been removed:

  • daml.executor.runtime.completed*
  • daml.executor.runtime.duration*
  • daml.executor.runtime.idle*
  • daml.executor.runtime.running*
  • daml.executor.runtime.submitted*
  • daml_executor_pool_size
  • daml_executor_pool_core
  • daml_executor_pool_max
  • daml_executor_pool_largest
  • daml_executor_threads_active
  • daml_executor_threads_running
  • daml_executor_tasks_queued
  • daml_executor_tasks_executing_queued
  • daml_executor_tasks_stolen
  • daml_executor_tasks_submitted
  • daml_executor_tasks_completed
  • daml_executor_tasks_queue_remaining

Bugfixes

(24-001, Major): Fabric Block Sequencer may deadlock when catching up

Issue Description

Fabric Ledger block processing runs into an unintended shutdown and fails to process blocks when the number of in-memory blocks exceeds 5000. This happens when catching up after downtime while the Fabric Ledger size has increased substantially in the meantime.

Affected Deployments

Fabric Sequencer Node

Affected Versions

2.3 - 2.7 and 2.8.0

Impact

The sequencer stops feeding new blocks.

Symptom

The participant gets stuck in an old state and does not visibly catch up with the Fabric Ledger. Therefore, a domains.reconnect call on the participant may appear to hang.

On the sequencer node, each processed message is logged using Observed Send with a messageId including a timestamp. Once these log lines stop appearing, the sequencer is stuck.

Workaround

Restart the sequencer node if it is stuck.

Likeliness

Rare; it only occurs when catching up to a Fabric Ledger that the sequencer node has previously been an active part of.

Recommendation

Upgrade to 2.8.1 at your convenience.

Compatibility

The following Canton protocol versions are supported:

Dependency Version
Canton protocol versions 3, 4, 5

Canton has been tested against the following versions of its dependencies:

Dependency Version
Java Runtime OpenJDK 64-Bit Server VM Zulu11.66+15-CA (build 11.0.20+8-LTS, mixed mode)
Postgres Recommended: PostgreSQL 12.17 (Debian 12.17-1.pgdg120+1) – Also tested: PostgreSQL 11.16 (Debian 11.16-1.pgdg90+1), PostgreSQL 13.13 (Debian 13.13-1.pgdg120+1), PostgreSQL 14.10 (Debian 14.10-1.pgdg120+1), PostgreSQL 15.5 (Debian 15.5-1.pgdg120+1)
Oracle 19.20.0

v2.8.0

5 months ago

Release of Canton 2.8.0

Canton 2.8.0 has been released on December 15, 2023. You can download the Daml Open Source edition from the Daml Connect Github Release Section. The Enterprise edition is available on Artifactory. Please also consult the full documentation of this release.

Summary

We are excited to announce Canton 2.8.0, which offers some great additional features:

  • distributed tracing enhancements
  • event query service
  • many operational, quality, security and performance improvements.

See below for details.

Please upgrade to this version as soon as possible. Version 2.7 will not be maintained after March 2024.

What’s New

Explicit contract disclosure is now available in Beta

Background

Explicit contract disclosure allows you to delegate contract read rights to a non-stakeholder using off-ledger data distribution.

The feature is now available in Beta and enabled by default. For more information, see the documentation.

Specific Changes

  • The participant.ledger-api.explicit-disclosure-unsafe feature flag has been replaced by participant.ledger-api.enable-explicit-disclosure, which defaults to true. To disable the feature, set the new flag to false, as sketched below.
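
A minimal configuration sketch of disabling the feature, assuming a participant named participant1 (the flag name itself is taken from the change above):

   canton.participants.participant1.ledger-api.enable-explicit-disclosure = false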

Distributed Tracing Enhancements

Background

Distributed tracing is a technique for troubleshooting performance issues in a microservices environment like Daml Enterprise. Canton supports distributed tracing, and the support on the Ledger API is improving. In this release, the client applications gain the ability to extract trace and span IDs from past transactions and completions, so that distributed traces can continue in follow-up commands.

To learn how to extend your application to support distributed tracing, see the documentation.

Specific Changes

Trace contexts are now included in the gRPC messages returned in Ledger API streams and point-wise queries. This change affects the following transaction and command completion service calls:

  • TransactionService.GetTransactions
  • TransactionService.GetTransactionTrees
  • TransactionService.GetTransactionByEventId
  • TransactionService.GetTransactionById
  • TransactionService.GetFlatTransactionByEventId
  • TransactionService.GetFlatTransactionById
  • CompletionService.CompletionStream

The trace context enables client applications that were not the submitters of the original request to pick up the initial spans and continue them in follow-up requests. This allows multi-step workflows to be adorned with contiguous chains of related spans.

Impact and Migration

This is a purely additive change.

Event Query Service is now available as Beta

Background

Use the event query service to obtain a party-specific view of contract events.

The gRPC API provides ledger streams to off-ledger components that maintain a queryable state. This service allows you to make simple event queries without off-ledger components like the JSON API.

Using the event query service, you can retrieve the create and archive events associated with a contract ID or contract key. The API returns only those events where at least one of the requesting parties is a stakeholder of the contract. If the contract is still active, the archive_event is unset.

Contract keys can be used by multiple contracts over time. The latest contract events are returned first. To access earlier contract key events, use the continuation_token returned in the GetEventsByContractKeyResponse in a subsequent GetEventsByContractKeyRequest.

If no events match the request criteria or the requested events are not visible to the requesting parties, an empty structure is returned. Events associated with consumed contracts are returned until they are pruned.

Specific Changes

  • Retrieve contract events by contract ID via the GetEventsByContractId request.
  • Retrieve contract events by contract key via the GetEventsByContractKey request.

Impact and Migration

This is a purely additive change.

Automatic participant pruning support for pruning internal-only state

Background

Pruning presents a trade-off between limiting ledger storage space and being able to query ledger history far into the past. Pruning only participant-internal state strikes a balance by deleting exclusively internal state. As a result, applications can continue to query historic portions of the ledger, but internal-only pruning frees up less storage space than regular pruning. Previously, internal pruning was available only via the "manual" participant.pruning.prune_internally command. With this release, pruning participant internal-only state becomes available through automatic pruning as well.

Specific Changes

  • Configure automatic, internal-only pruning using the new participant.pruning.set_participant_schedule command's prune_internally_only parameter.
  • Retrieve the currently active participant schedule including the prune_internally_only setting via the newly introduced participant.pruning.get_participant_schedule command.

Impact and Migration

This is a purely additive change.

Minor Changes

Logging Cleanup

We've cleaned up our transaction processing logging to make it easier to understand what is happening. Significant events on all nodes are now logged at INFO level and include the relevant information.

Ledger API Command Submission Changes

CommandService gRPC deadline logic

Commands submitted to the Ledger API now respect gRPC deadlines: if a request reaches the command processing layer with an already-expired gRPC deadline, the command is not sent for submission. Instead, the request is rejected with a new self-service error code, REQUEST_DEADLINE_EXCEEDED, which informs the client that the command is guaranteed not to have been sent for execution to the ledger.

Command Interpretation Timeouts

If you submit a command that runs for a very long time, the Ledger API will now reject the command with the new self-service error code INTERPRETATION_TIME_EXCEEDED when the transaction would reach the ledger time tolerance limit based on the submission time.

Protocol versions 3 and 4 are deprecated

Protocol versions 3 and 4 are now marked as deprecated and will be removed in the next minor release of Canton. Protocol version 5 should be preferred for any new deployment.

Configuration Changes

Breaking: KMS wrapper-key configuration value now accepts a simple string

The expected KMS wrapper-key configuration value has changed from:

    crypto.private-key-store.encryption.wrapper-key-id = { str = "..."}

to a simple string:

    crypto.private-key-store.encryption.wrapper-key-id = "..."

Parallel Node Startup

Nodes will now start up in parallel. The startup parallelism can be configured by setting:

    canton.parameters.startup-parallelism = 1 // defaults to number of threads.

Breaking: Schema migration attempts configuration for the indexer

The configuration fields schema-migration-attempt-backoff and schema-migration-attempts for the indexer were removed. The following config lines will have to be removed, if they exist:

participants.participant.parameters.ledger-api-server-parameters.indexer.schema-migration-attempt-backoff
participants.participant.parameters.ledger-api-server-parameters.indexer.schema-migration-attempts

Breaking: Cache weight configuration for the Ledger API server

The configuration fields max-event-cache-weight and max-contract-cache-weight for the ledger api server were removed. The following config lines will have to be removed, if they exist:

participants.participant.ledger-api.max-event-cache-weight
participants.participant.ledger-api.max-contract-cache-weight

Config Logging On Startup

By default, Canton logs the configuration values on startup. This has turned out to be useful for troubleshooting, as it makes clear which configuration a given log file relates to.

This feature can be turned off by setting

    canton.monitoring.logging.log-config-on-startup = false

Logging the configuration including all default values can be turned on using

    canton.monitoring.logging.log-config-with-defaults = true

Note that this will log all available settings, including parameters that we do not recommend changing. Confidential data is not logged but replaced by xxxx.

Mediator Finalized Response Cache

The mediator finalized response cache can now be configured using:

    canton.mediators.mediator.caching.finalized-mediator-requests = 1000 // default

which allows you to limit memory consumption in the presence of a transaction with a very large number of views.

Default Size of Ledger API Caches

The default sizes of the contract state and contract key state caches have been decreased by one order of magnitude, from 100'000 to 10'000.

   canton.participants.participant.ledger-api.index-service.max-contract-state-cache-size
   canton.participants.participant.ledger-api.index-service.max-contract-key-state-cache-size

The size of these caches determines the likelihood that a transaction using a contract or contract key that was recently created or read will still find it in memory rather than having to query it from the database. Larger caches might be of interest in use cases where there is a big pool of ambient contracts that are consistently being fetched or used for non-consuming exercises. They may also benefit use cases where a big pool of contracts is rotated through a create -> archive -> create-successor cycle. Consider adjusting these parameters explicitly if the performance of your specific workflow depends on large caches and you were relying on the defaults thus far. Beware that increasing these caches will increase the memory consumption of the Ledger API, which in turn might lead to garbage collection pauses if you run the node with insufficient heap memory reserves.
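
For example, a configuration sketch that explicitly restores the previous defaults of 100'000 (assuming a participant named participant1):

   canton.participants.participant1.ledger-api.index-service.max-contract-state-cache-size = 100000
   canton.participants.participant1.ledger-api.index-service.max-contract-key-state-cache-size = 100000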

Default Number of Max Transactions in the In-Memory Fan-Out

The default maximum number of transactions stored in the in-memory fan-out has been decreased by one order of magnitude from 10'000 to 1'000.

   canton.participants.participant.ledger-api.index-service.max-transactions-in-memory-fan-out-buffer-size

The in-memory fan-out allows serving the transaction streams from memory as they are finalized, rather than from the database. You should choose this buffer to be large enough that the likelihood of applications having to stream transactions from the database is low. Generally, a 10-second buffer is sensible. For example, if you expect a throughput of 20 tx/s, setting this number to 200 is sensible. The new default setting of 1000 assumes 100 tx/s. Consider adjusting this parameter explicitly if your specific workflow foresees transaction rates larger than 100 tx/s.
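
Following the 10-second rule of thumb above, a sketch for an assumed sustained throughput of roughly 200 tx/s (the participant name participant1 is illustrative):

   canton.participants.participant1.ledger-api.index-service.max-transactions-in-memory-fan-out-buffer-size = 2000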

Target Scope for JWT Authorization

The default scope (scope field in the scope based token) for authenticating on the Ledger API using JWT is daml_ledger_api. Other scopes can be configured explicitly using the custom target scope configuration option:

   canton.participants.participant.ledger-api.auth-services.0.target-scope="custom/Scope-5:with_special_characters"

The target scope can be any case-sensitive string containing alphanumeric characters, hyphens, slashes, colons, and underscores. Either the target-scope or the target-audience parameter can be configured, but not both.
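
Conversely, a sketch of configuring a target audience instead of a scope; the target-audience key name comes from the note above and is assumed here to sit at the same configuration path as target-scope (the value is purely illustrative):

   canton.participants.participant.ledger-api.auth-services.0.target-audience = "https://example.com/ledger-api"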

SQL Batching Parameter has been moved

The expert mode sql batching parameter

  canton.participants.participant.parameters.stores.max-items-in-sql-clause

has been moved to:

  canton.participants.participant.parameters.batching.max-items-in-sql-clause

Generally, we recommend not changing this parameter unless advised by support.

Explicit Settings for Database Connection Pool Sizes

The sizes of the connection pools used for interactions with database storage inside Canton nodes are determined using a dedicated formula described in the documentation article on max connection settings.

The values obtained from that formula can now be overridden using explicit configuration settings for the read, write and ledger-api connection pool sizes:

canton.participants.participant.storage.parameters.connection-allocation.num-reads
canton.participants.participant.storage.parameters.connection-allocation.num-writes
canton.participants.participant.storage.parameters.connection-allocation.num-ledger-api

Similar parameters are available for other Canton node types:

canton.sequencers.sequencer.storage.parameters.connection-allocation...
canton.mediators.mediator.storage.parameters.connection-allocation...
canton.domain-managers.domain_manager.storage.parameters.connection-allocation...
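
For example, a sketch that overrides the formula-derived pool sizes on a participant named participant1 (the values are purely illustrative; normally the built-in formula should be left in place):

canton.participants.participant1.storage.parameters.connection-allocation.num-reads = 8
canton.participants.participant1.storage.parameters.connection-allocation.num-writes = 4
canton.participants.participant1.storage.parameters.connection-allocation.num-ledger-api = 4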

The effective connection pool sizes are reported by the Canton nodes at startup:

INFO  c.d.c.r.DbStorageMulti$:participant=participant_b - Creating storage, num-reads: 5, num-writes: 4

Console Changes

Ledger API commands

Usage of the applicationId in command submissions and completion subscriptions

Previously, the Canton console used the hard-coded "CantonConsole" as the applicationId in command submissions and completion subscriptions performed against the Ledger API. Now, if an access token is provided to the console, it extracts the userId from that token and uses it instead. A local console uses the adminToken provided in canton.participants.<participant>.ledger-api.admin-token, whereas a remote console uses the token from canton.remote-participants.<remoteParticipant>.token.

This affects the following console commands:

  • ledger_api.commands.submit
  • ledger_api.commands.submit_flat
  • ledger_api.commands.submit_async
  • ledger_api.completions.list
  • ledger_api.completions.list_with_checkpoint
  • ledger_api.completions.subscribe

You can also override the applicationId by supplying it explicitly to these commands.

Introduction of Java Bindings Compatible Console Commands

The following console commands were added to support actions with java codegen compatible data:

  • participant.ledger_api.javaapi.commands.submit
  • participant.ledger_api.javaapi.commands.submit_flat
  • participant.ledger_api.javaapi.commands.submit_async
  • participant.ledger_api.javaapi.transactions.trees
  • participant.ledger_api.javaapi.transactions.flat
  • participant.ledger_api.javaapi.acs.await
  • participant.ledger_api.javaapi.acs.filter
  • participant.ledger_api.javaapi.event_query.by_contract_id
  • participant.ledger_api.javaapi.event_query.by_contract_key

The following commands were replaced by their java bindings compatible equivalent (in parentheses):

  • participant.ledger_api.acs.await (participant.ledger_api.javaapi.acs.await)
  • participant.ledger_api.acs.filter (participant.ledger_api.javaapi.acs.filter)

Please note that the Scala codegen and bindings are not an officially supported feature and will be removed in a future release. These commands are provided for your convenience but are subject to change.

New Functions to Specify a Full-blown Transaction Filter for Flat Transactions

ledger_api.transactions.flat_with_tx_filter and ledger_api.javaapi.transactions.flat_with_tx_filter are more sophisticated alternatives to ledger_api.transactions.flat and ledger_api.javaapi.transactions.flat, respectively, that allow you to specify a full transaction filter instead of a set of parties. Consider using them if you need more fine-grained filters that include template IDs, interface IDs, and/or whether you want to retrieve create event blobs for explicit disclosure.

Repair Commands

Commands around ACS migration

Console commands for ACS migration can now be used with remote nodes. This change applies to the commands in the repair namespace.

New ACS export / import repair commands

The new ACS export/import commands, repair.export_acs and repair.import_acs, provide similar functionality to the existing repair.download and repair.upload commands. However, the new implementation can be evolved better over time.

Consequently, the existing download/upload functionality has been deprecated in 2.8.0 and is going to be removed in a future release.

Transactions generated by importing an ACS have a configurable workflow ID to track ongoing imports

Contracts added via the repair.party_migration.step2_import_acs and repair.import_acs commands now include a workflow ID. The ID is of the form prefix-${n}-${m}, where m is the number of transactions generated as part of the import process and n is a sequential number from 1 to m inclusive. Each transaction contains one or more contracts that share the ledger time of their creation. The two numbers allow you to track whether an import is still being processed. You can specify a prefix with the workflow_id_prefix string parameter defined on both commands. If not specified, the prefix defaults to import-${randomly-generated-unique-identifier}.

Key Management Commands

keys.secret.rotate_node_key() console command

The console command keys.secret.rotate_node_key can now accept a name for the newly generated key.

Breaking: owner_to_key_mappings.rotate_key command expects a node reference

The previous owner_to_key_mappings.rotate_key is deprecated and now expects a node reference (InstanceReferenceCommon) to avoid any dangerous and/or unwanted key rotations.

Miscellaneous Commands

Domain filtering in testing commands

To improve consistency and code safety, some testing console commands now expect an optional domain alias (rather than a plain domain alias). For example, the following call

participant.testing.event_search("da")

needs to be rewritten to

participant.testing.event_search(Some("da"))

The impacted console commands are:

  • participant.testing.event_search
  • participant.testing.transaction_search

DAR vetting and unvetting commands

DAR vetting and unvetting convenience commands have been added to:

  • Canton admin API as PackageService.VetDar and PackageService.UnvetDar
  • Canton console as participant.dars.vetting.enable and participant.dars.vetting.disable

Additionally, two error codes have been introduced to allow better error reporting to the client when working with DAR vetting/unvetting: DAR_NOT_FOUND and PACKAGE_MISSING_DEPENDENCIES. Please note that these commands are alpha only and subject to change.

Deprecations
  • SequencerConnection.addConnection is deprecated. Use SequencerConnection.addEndpoints instead.
  • repair.download and repair.upload Console Commands are deprecated. Use repair.export_acs and repair.import_acs commands instead.
  • Removed obsolete dar sharing service.

Metrics Changes

  • The DB metric lookup_active_contracts is removed in favor of lookup_created_contracts and lookup_archived_contracts. This reflects a change in the active contract lookup from the DB: switching from a single batched active DB query to two batch queries executed in parallel, targeting archived and created events.
  • The sequencer client metric load is removed without replacement, as it was not measuring anything meaningful.

Error Code Changes

Breaking: Submission service error code change

The error code SEQUENCER_DELIVER_ERROR that could be received when submitting a transaction has been superseded by two separate new error codes: SEQUENCER_SUBMISSION_REQUEST_MALFORMED and SEQUENCER_SUBMISSION_REQUEST_REFUSED. Please migrate client applications code if you rely on the older error code.

Packaging

We have reverted the change from 2.7.0 that introduced the distribution of a separate bcprov-jdk15on-1.70.jar alongside the canton jar in the lib folder. This revert was also made in version 2.7.1. If you use the Oracle JRE, beware that you will now need to explicitly add the Bouncy Castle library to the classpath when running Canton.

Faster emission of command rejections

Some commands that the participant rejected locally, e.g., command deduplication errors, produced a rejection completion only after the participant observed some other traffic from the domain. Such commands now produce the rejection completion immediately.

Bugfixes

(23-023, Critical): Crash recovery issue in command deduplication store

Issue Description

On restart of a sync domain connection, the participant replays pending transactions, updating the stores in case some writes were not persisted. Within the command deduplication store, existing records are compared with the records to be written, for internal consistency checking. This comparison includes the trace context, which differs on a restart and hence can cause the check to fail, aborting the startup with an IllegalArgumentException.

Affected Deployments

Participant

Impact

An affected participant cannot reconnect to the given domain, which means that transaction processing is blocked.

Symptom

Upon restart, the participant refuses to reconnect to the domain, writing the following log message: ERROR c.d.c.p.s.d.DbCommandDeduplicationStore ... - An internal error has occurred. java.lang.IllegalArgumentException: Cannot update command deduplication data for ChangeIdHash

Workaround

No workaround exists. You need to upgrade to a version not affected by this issue.

Likelihood

The error happens if a participant crashes with a particular state not yet written to the database. The bug has been present since the end of November 2021 and has never been observed before, not even during testing.

Recommendation

Upgrade during your next maintenance window to a patch version not affected by this issue.

(23-022, Major): rotate_keys command is dangerous and does not work when there are multiple sequencers

Issue Description

We allow the replacement of key A with key B, but we cannot guarantee that the node using key A will actually have access to key B.

Furthermore, when attempting to rotate the keys of a sequencer using the rotate_node_keys(domainManagerRef) method, the rotation fails if there is more than one sequencer in the environment. This occurs because the sequencers share a unique identifier (UID); as a result, this console command not only rotates the keys of the sequencer it is called on but also affects the keys of the other sequencers.

The fix modifies the process of finding keys for rotation in the rotate_node_keys(domainManagerRef) function to prevent conflicts among multiple sequencers that share the same UID. Additionally, the console command owner_to_key_mappings.rotate_key has been updated to expect a node reference (InstanceReferenceCommon), thereby ensuring that both the current and new keys are associated with the correct node.

Affected Deployments

All nodes (but mostly sequencer)

Impact

A node can mistakenly rotate keys that do not pertain to it. Using rotate_node_keys(domainManagerRef) to rotate a sequencer's keys when other sequencers are present will also fail and break Canton.

Symptom

When you try to rotate a sequencer's key, it catastrophically fails with a java.lang.RuntimeException: KeyNotAvailable.

Workaround

For rotate_node_keys(domainManagerRef), we have ensured that we filter the correct keys for rotation by checking both the authorized store and the local private key store.

Additionally, we have deprecated the existing owner_to_key_mappings.rotate_key and introduced a new method that requires the user to provide the node instance for which they intend to apply the key rotation. We have also implemented a validation check within this function to ensure that the current and new keys are associated with this node.

Likelihood

Every use of rotate_node_keys to rotate the keys of sequencer(s) in a multiple sequencer environment.

Recommendation

Upgrade the Canton console that you use to administrate the domain, in particular the sequencer and mediator, to a Canton version with the bug fix.

(23-019, Minor): Fixed rotate_node_keys command when it is used to rotate keys of sequencer(s) and mediator(s)

Issue Description

Canton has a series of console commands to rotate keys, in particular rotate_node_keys, which is used to rotate the keys of a node.

When the console macro rotate_node_keys is used to rotate the keys of a sequencer or mediator, the keys are actually not rotated, because we are (1) looking at the wrong topology management store and (2) not using the associated domainManager to actually rotate the keys.

Affected deployments

Mediator and Sequencer nodes.

Impact

The rotation of the keys for a sequencer or mediator will not succeed.

Workaround

Select the correct topology management Active store (the one from the domainManager) when looking for keys to rotate, and call the rotate_key console command with the domainManager reference.

Likelihood

Every time we use rotate_node_keys to rotate the keys of sequencer(s) or mediator(s).

Recommendation

Upgrade the Canton console that you use to administrate the domain, in particular the sequencer and mediator, to a Canton version with the bug fix.

(23-020, Minor): Core contract input stakeholders bug

Background

The Canton protocol includes in ViewParticipantData the contracts that are required to re-interpret a command evaluation. These inputs are known as coreInputs. Included as part of this input contract data is contract metadata that includes details of signatories and stakeholders.

Issue Description

Where the evaluation of a command resolves a contract key to a contract identifier, but that identifier is not in turn resolved to a contract instance, the distributed metadata associated with that contract will incorrectly have the key maintainers as both the signatories and the stakeholders. One way to trigger this is to exercise a choice on a contract other than the keyed contract that only issues a lookupByKey on the keyed contract.

Affected protocol versions

This problem is observed in Canton protocol version 4 and fixed in version 5.

Impact

The impact of this bug is that invalid contract metadata is distributed with transaction views.

Workaround

A workaround is to ensure that whenever a contract key is resolved to a contract identifier, that identifier is always resolved to a contract (even if not needed). For example, following the lookupByKey, in the case where a contract identifier is returned, issue a fetch command on this identifier and discard the result.

Likelihood

Unlikely. Most of the time a contract key is resolved to a contract so that some action can be performed on that contract, in which case the metadata is correct. The only situation in which this has been observed is in test scenarios.

Recommendation

No action is required. The problem is naturally fixed in protocol version 5, and the workaround above can be used in the unlikely case it is observed.

(23-021, Minor): Transaction view decomposition bug

Background

Transaction view decomposition is the process of taking a transaction and generating a view hierarchy whereby each view has a common set of informees. Each view additionally has a rollback scope. A child view having a rollback scope that is different from that of the parent indicates that any changes to the liveness of contracts that occurred within the child view should be disregarded upon exiting the child view. For example it would be valid for a contract to be consumed in a rolled back child view and then consumed again in a subsequent child view.

Issue Description

As the activeness of contracts is preserved across views with the same rollback scope, every rollback node of the source transaction should be allocated a unique scope. In certain circumstances this is not happening, resulting in contract activeness inconsistency.

Affected protocol versions

This problem is observed in Canton protocol version 4 and fixed in version 5.

Impact

The impact of this bug is that an inconsistent transaction view hierarchy can be generated from a consistent transaction. This in turn can result in a valid transaction being rejected. The symptom of this is the mediator rejecting a valid transaction request on the basis of inconsistency.

Workaround

This bug only affects transactions that contain rollbacks; if the use of rollbacks can be avoided, this bug will not occur.

Likelihood

Encountering this bug requires a transaction that has multiple rolled-back nodes in which overlapping contracts and/or keys are used. For this reason the likelihood of encountering the bug is low.

Recommendation

If this bug were to affect a client development, then the best course of action would be to move to Canton protocol version 5.

(23-024, Moderate) Participant state topology transaction may be silently ignored during cascading update

Issue Description

In some cases, participant state and mediator domain state topology transactions were silently ignored when they were sent as part of a cascading topology update (which means they were sent together with a namespace certificate). As a result, the nodes had a different view of the topology state and not all Daml transactions could be run.

Affected Deployments

All nodes

Impact

A participant node might consider another participant node as inactive and therefore refuse to send transactions or invalidate transactions.

Symptom

A Daml transaction might be rejected with UNKNOWN_INFORMEES.

Workaround

Flush the topology state by running domain.participants.set_state(pid, Submission, Vip) (and then setting it back to Ordinary). This runs the update through the "incremental" update code path, which behaves correctly and fixes the topology state of the broken node.

Likelihood

The bug is deterministic and can occur when using permissioned domains, when the participant state is received together with the namespace delegation of the domain but without the namespace delegation of the participant.

Recommendation

Upgrade to this version if you intend to use permissioned domains. If you need to fix a broken system, upgrade to a version fixing the issue and apply the workaround to "flush" the topology state.

(23-025, Minor) PingService stops working after a LedgerAPI crash

After an Indexer restart in the Ledger API or any error causing the client transaction streams to fail, the PingService stops working.

Affected Deployments

The participant node

Impact

When the Ledger API encounters an error that leads to cancelling the client connections while the participant node does not become passive, the PingService cannot continue processing commands.

Symptom

Ping commands issued in the PingService are timing out. Additionally, the participant might appear unhealthy if configured to report health by using the PingService (i.e. configured with monitoring.health.check.type = ping).

Workaround

Restart the participant node.

Likelihood

This bug occurs consistently when there is an error in the Ledger API, such as a DB overload issue that causes the Ledger API Indexer to restart. For this bug to occur, the participant node must not transition to the passive state. If it transitions to passive and then back to active, the bug should not reproduce.

Recommendation

If the system is subject to frequent transient errors in the Ledger API (e.g. a flaky index database) or consistently high load, update to this version to avoid running into this issue.

(23-026, Minor) Non-graceful shutdown of the participant or the Ledger API

The participant may shut down ungracefully if there are still CommandService submissions completing or, in extreme cases, the Ledger API can restart during normal operations.

Affected Deployments

The participant node

Impact

No significant operational impact.

Symptoms

  • On shutdown, an exception including IllegalStateException("Promise already completed.") is logged.
  • Pending CommandService submissions are not completed gracefully with SERVER_IS_SHUTTING_DOWN.
  • In extreme cases, the issue can trigger a Ledger API restart during normal operation.

Workaround

Not applicable as the effect is mostly aesthetic.

Likelihood

This ungraceful shutdown is only likely under heavy usage of the CommandService at the same time as the participant shutdown. The likelihood of this bug triggering a Ledger API restart is very small, as multiple conditions need to be met:

  • Submissions with the same (submissionId, commandId, applicationId, submitters) change key are sent concurrently to both the CommandSubmissionService and the CommandService.
  • The chosen deduplication duration for the submissions is small enough to allow them to succeed within a small timeframe (roughly concurrently).

Recommendation

Upgrade when convenient.

(23-027, Minor) Expired gRPC request deadlines crash requests in CommandService

The CommandService errors out when confronted with an expired gRPC request deadline.

Affected Deployments

The participant node

Impact

If encountered repeatedly (up to the maximum-in-flight configuration limit for the CommandService), the CommandService can appear saturated and reject new commands.

Symptoms

When a command request is submitted via CommandService.submitAndWait or its variants with a gRPC request deadline provided, the request can fail with an INTERNAL error reported to the client, and an ERROR-level log message is logged on the participant.

Workaround

Restart the participant and use a higher gRPC request deadline.

Likelihood

This bug is likely to happen if the gRPC request deadline is small enough for it to expire upon arriving at the participant's Ledger API.

Recommendation

Upgrade when convenient.

(23-028, Minor) PingService completions re-subscription loop

A Ledger API restart following a pruning event can lead to a non-functioning PingService.

Affected Deployments

The participant node

Impact

The PingService is not functional after this error is encountered.

Symptoms

  • Ping commands issued in the PingService are timing out.
  • The participant node continuously logs re-subscription errors with PARTICIPANT_PRUNED_DATA_ACCESSED every second.
  • Additionally, the participant might appear unhealthy if configured to report health by using the PingService.

Workaround

Restart the participant node.

Likelihood

This bug is likely when, in the same participant active session, a pruning request is followed by a Ledger API restart (caused by transient errors such as DB connectivity issues). For this bug to occur, the participant node must not transition to the passive state. If it transitions to passive and then back to active, the bug should not reproduce.

Recommendation

Upgrade when convenient.

Minor UX fixes

  • Added a log warning on scheduled background pruning problems, so that persistent pruning issues are more easily detected by Canton node operators.
  • Invalid or duplicate node names in configuration files will now be reported on startup before exiting.

Compatibility

The following Canton protocol versions are supported:

Dependency Version
Canton protocol versions 3, 4, 5

Canton has been tested against the following versions of its dependencies:

Dependency Version
Java Runtime OpenJDK 64-Bit Server VM Zulu11.66+15-CA (build 11.0.20+8-LTS, mixed mode)
Postgres Recommended: PostgreSQL 12.15 (Debian 12.15-1.pgdg120+1) – Also tested: PostgreSQL 11.16 (Debian 11.16-1.pgdg90+1), PostgreSQL 13.12 (Debian 13.12-1.pgdg120+1), PostgreSQL 14.8 (Debian 14.8-1.pgdg120+1), PostgreSQL 15.3 (Debian 15.3-1.pgdg120+1)
Oracle 19.20.0

Note: We no longer test with Postgres 10 as its support has been discontinued.

What's Coming

We are currently working on

  • zero downtime distributed smart contract upgrading (for 2.9)
  • multi-domain general availability (per 3.0)
  • performance improvements (for 3.0)