Global Workflow Composition that is Scalable, Secure, and GDPR-compliant
Canton 2.8.5 has been released on April 26, 2024. You can download the Daml Open Source edition from the Daml Connect Github Release Section. The Enterprise edition is available on Artifactory. Please also consult the full documentation of this release.
This is a maintenance release, containing a bugfix. Users are advised to upgrade during their maintenance window.
When a topology change is sent to the sequencer and there is a problem with the transmission or sequencing, the topology dispatcher waits forever to receive back the corresponding topology transaction. Later topology changes queue behind this outstanding topology change until the node is restarted.
Affected Deployments: Participant, Domain, and Domain Topology Manager nodes
Affected Versions: All 2.3-2.7, 2.8.0-2.8.4
Impact: The node cannot issue topology changes anymore.
Symptom: Varies with the affected topology transaction: party allocations time out, uploaded packages cannot be used in transactions (NO_DOMAIN_FOR_SUBMISSION), cryptographic keys cannot be changed, etc.
The failure of sequencing can be seen in DEBUG logging on the node:
DEBUG c.d.c.t.StoreBasedDomainOutbox:... tid:1da8f7fff488dad2fc4c9c0177633a7e - Attempting to push .. topology transactions to Domain ...
DEBUG c.d.c.s.c.t.GrpcSequencerClientTransport:... tid:1da8f7fff488dad2fc4c9c0177633a7e - Sending request send-async-versioned/f77dd135-9c6a-4bd8-a6ed-a9a9f3ef43ca to sequencer.
DEBUG c.d.c.s.c.SendTracker:... tid:1da8f7fff488dad2fc4c9c0177633a7e - Sequencer send [f77dd135-9c6a-4bd8-a6ed-a9a9f3ef43ca] has timed out at ...
where the last log line by default comes 5 minutes after the first two.
Workaround: Restart the node.
Likeliness: Occurs under unstable network conditions between the node and the sequencer (e.g., frequent termination of a subset of the connections by firewalls) and when the sequencer silently drops submission requests.
Recommendation: Upgrade to 2.8.5.
The following Canton protocol versions are supported:
Dependency | Version |
---|---|
Canton protocol versions | 3, 4, 5 |
Canton has been tested against the following versions of its dependencies:
Dependency | Version |
---|---|
Java Runtime | OpenJDK 64-Bit Server VM Zulu11.70+15-CA (build 11.0.22+7-LTS, mixed mode) |
Postgres | Recommended: PostgreSQL 12.18 (Debian 12.18-1.pgdg120+2) – Also tested: PostgreSQL 11.16 (Debian 11.16-1.pgdg90+1), PostgreSQL 13.14 (Debian 13.14-1.pgdg120+2), PostgreSQL 14.11 (Debian 14.11-1.pgdg120+2), PostgreSQL 15.6 (Debian 15.6-1.pgdg120+2) |
Oracle | 19.20.0 |
Canton 2.8.4 has been released on April 16, 2024. You can download the Daml Open Source edition from the Daml Connect Github Release Section. The Enterprise edition is available on Artifactory. Please also consult the full documentation of this release.
This is a maintenance release containing small improvements and bug fixes.
When a node encounters a fatal failure that Canton unexpectedly cannot handle gracefully, the new default behavior is that the node exits and stops the process, relying on an external process or service monitor to restart it. Without this exit, the node would remain in a half-broken state, requiring a manual restart.
Failures that are currently logged as an error but not automatically recovered from are:
The new behavior can be reverted by setting canton.parameters.exit-on-fatal-failures = false in the configuration, but we encourage users to adopt the new behavior.
ABORTED(ACCESS_TOKEN_EXPIRED)
error is returned.
Before, error causes of the documented errors were truncated after 512 characters. This behavior is necessary when transporting errors through gRPC, as gRPC imposes size limits, but the limit was also applied to errors in logs, which caused relevant information to be truncated. Now, the 512-character limit is no longer applied to errors written to the logs.
When a token used in a Ledger API request to open a stream expires, the stream is terminated. This normally happens several minutes or hours after the stream initiation. Users can now configure a grace period that delays stream termination beyond the token expiry:
canton.participants.participant1.parameters.ledger-api-server-parameters.token-expiry-grace-period-for-streams=600.seconds
The grace period can be any non-negative duration where both the value and the unit must be defined, e.g. "600.seconds" or "10.minutes".
When the parameter is omitted, the grace period defaults to zero. When the configured value is Inf, the stream is never terminated.
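For illustration, a configuration sketch (using participant1 as in the example above) combining the options just described; the chosen values are illustrative:

```
canton.participants.participant1.parameters.ledger-api-server-parameters {
  // terminate streams 10 minutes after the token expires
  token-expiry-grace-period-for-streams = 10.minutes
  // alternatively, never terminate a stream on token expiry:
  // token-expiry-grace-period-for-streams = Inf
}
```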
During participant failover, the newly active participant does not complete reconnecting to the domain and fails to declare itself active in a timely manner. This can happen when the passive replica, which has become active, has been idle for a long time while the former active replica processed many transactions.
Affected Deployments: Participant nodes
Affected Versions: 2.3-2.6, 2.7.0-2.7.7, 2.8.0-2.8.3
Impact: During participant failover, the newly active participant does not become active and does not serve commands and transactions in a timely manner.
Symptom: One of the last log entries before the participant only outputs storage health checks is similar to:
INFO c.d.c.p.s.d.DbMultiDomainEventLog:participant=participant1 Fetch unpublished from domain::122345... up to Some(1234)
The participant reports itself as not active and not connected to any domains in its health status.
Workaround: Restart the participant node.
Likeliness: Depends on how long the passive replica has been idle and on the number and size of transactions that the active replica has processed in the meantime. These factors influence how quickly the failover, and in particular the reconnect to the domain, completes.
Recommendation: Upgrade to 2.8.4.
The following Canton protocol versions are supported:
Dependency | Version |
---|---|
Canton protocol versions | 3, 4, 5 |
Canton has been tested against the following versions of its dependencies:
Dependency | Version |
---|---|
Java Runtime | OpenJDK 64-Bit Server VM Zulu11.70+15-CA (build 11.0.22+7-LTS, mixed mode) |
Postgres | Recommended: PostgreSQL 12.18 (Debian 12.18-1.pgdg120+2) – Also tested: PostgreSQL 11.16 (Debian 11.16-1.pgdg90+1), PostgreSQL 13.14 (Debian 13.14-1.pgdg120+2), PostgreSQL 14.11 (Debian 14.11-1.pgdg120+2), PostgreSQL 15.6 (Debian 15.6-1.pgdg120+2) |
Oracle | 19.20.0 |
Canton 2.3.19 has been released on March 28, 2024. You can download the Daml Open Source edition from the Daml Connect Github Release Section. The Enterprise edition is available on Artifactory. Please also consult the full documentation of this release.
Canton 2.3.19 has been released as part of Daml 2.3.19 with no additional change on Canton.
The following Canton protocol and Ethereum sequencer contract versions are supported:
Dependency | Version |
---|---|
Canton protocol versions | 2.0.0, 3.0.0 |
Ethereum contract versions | 1.0.0, 1.0.1 |
Canton has been tested against the following versions of its dependencies:
Dependency | Version |
---|---|
Java Runtime | OpenJDK 64-Bit Server VM 18.9 (build 11.0.16+8, mixed mode, sharing) |
Postgres | postgres (PostgreSQL) 14.11 (Debian 14.11-1.pgdg120+2) |
Oracle | 19.15.0 |
Besu | besu/v21.10.9/linux-x86_64/openjdk-java-11 |
Fabric | 2.2.2 |
Canton 2.7.9 has been released on March 20, 2024. You can download the Daml Open Source edition from the Daml Connect Github Release Section. The Enterprise edition is available on Artifactory. Please also consult the full documentation of this release.
This is a maintenance release, providing a workaround for a participant crash recovery issue on protocol version 4.
This release provides a workaround for a specific participant crash recovery issue under load on protocol version 4 triggered by the bug 23-021 (fixed in protocol version 5).
The workaround can be enabled for a participant through the following configuration, but should only be applied when advised to do so by Digital Asset:
canton.participants.XXX.parameters.disable-duplicate-contract-check = yes
The following Canton protocol and Ethereum sequencer contract versions are supported:
Dependency | Version |
---|---|
Canton protocol versions | 3, 4, 5 |
Canton has been tested against the following versions of its dependencies:
Dependency | Version |
---|---|
Java Runtime | OpenJDK 64-Bit Server VM Zulu11.70+15-CA (build 11.0.22+7-LTS, mixed mode) |
Postgres | postgres (PostgreSQL) 14.11 (Debian 14.11-1.pgdg120+2) |
Oracle | 19.18.0 |
Canton 2.3.18 has been released on March 15, 2024. You can download the Daml Open Source edition from the Daml Connect Github Release Section. The Enterprise edition is available on Artifactory. Please also consult the full documentation of this release.
Canton 2.3.18 has been released as part of Daml 2.3.18 with no additional change on Canton.
The following Canton protocol and Ethereum sequencer contract versions are supported:
Dependency | Version |
---|---|
Canton protocol versions | 2.0.0, 3.0.0 |
Ethereum contract versions | 1.0.0, 1.0.1 |
Canton has been tested against the following versions of its dependencies:
Dependency | Version |
---|---|
Java Runtime | OpenJDK 64-Bit Server VM 18.9 (build 11.0.16+8, mixed mode, sharing) |
Postgres | postgres (PostgreSQL) 14.11 (Debian 14.11-1.pgdg120+2) |
Oracle | 19.15.0 |
Besu | besu/v21.10.9/linux-x86_64/openjdk-java-11 |
Fabric | 2.2.2 |
Canton 2.8.3 has been released on February 21, 2024. You can download the Daml Open Source edition from the Daml Connect Github Release Section. The Enterprise edition is available on Artifactory. Please also consult the full documentation of this release.
This is a maintenance release, fixing a critical issue which can occur if overly large transactions are submitted to the participant. The bootstrap domain command has been slightly improved, but it may now throw an error if you try to bootstrap an already initialized domain while the domain manager node is still starting up.
Console commands that allow downloading an ACS snapshot now take a new mandatory argument to indicate whether the snapshot will be used in the context of a party offboarding (party replication) or not. This allows Canton to perform additional checks and makes party offboarding safer.
Affected console commands:
participant.repair.export_acs
participant.repair_download_acs
(deprecated method)
New argument: partiesOffboarding: Boolean.
When replicating a party that is a maintainer on a contract key, the contract cannot be exercised anymore.
Affected Deployments: Participant nodes
Impact: Contracts become unusable.
Symptom: The following error is emitted when trying to exercise a choice on the contract:
java.lang.IllegalStateException:
Unknown keys are to be reassigned. Either the persisted ledger state corrupted
or this is a malformed transaction. Unknown keys:
None.
Likeliness: Deterministic.
Workaround: Do not use the party migration macros with contract keys in version 2.8.1. Upgrade to 2.8.2 if you want to use them.
If a transaction with a very large number of key updates is submitted to the participant, the SQL query to update the contract keys table will fail, leading to a database retry loop, stopping transaction processing. The issue is caused by a limit of 32k rows in a prepared statement in the JDBC Postgres driver.
Affected Deployments: Participant nodes on Postgres
Affected Versions: 2.3-2.6, 2.7.0-2.7.6, 2.8.0-2.8.1
Impact: Transactions can be submitted to the participant, but the transaction stream stops emitting transactions and commands don't appear to succeed.
Symptom: A stuck participant, continuously logging messages like "Now retrying operation 'com.digitalasset.canton.participant.store.db.DbContractKeyJournal.addKeyStateUpdates'" and "Tried to send an out-of-range integer as a 2-byte value: 40225".
Workaround: Upgrade to a version containing the fix. Alternatively, such a transaction may be ignored, but this must be done with care to avoid a ledger fork in case of multiple involved participants. Contact support.
Likeliness: Deterministic with very large (32k key updates) transactions.
Recommendation: Upgrade at your convenience or if experiencing the error.
In a Postgres HA setup with a write and a read replica using AWS RDS, Canton can, during DB failover, connect and remain connected to the former write replica, which results in the DB connections being read-only. This fails all write DB operations, which are then retried indefinitely.
Affected Deployments: All nodes with Postgres HA. Only observed with AWS RDS so far, not on Azure.
Affected Versions: All 2.3-2.6, 2.7.0-2.7.6, 2.8.0-2.8.1
Impact: A node becomes unusable due to all DB write operations failing and being retried indefinitely.
Symptom: The following kind of exceptions are logged:
org.postgresql.util.PSQLException: ERROR: cannot execute UPDATE in a read-only transaction
Workaround: Restart the node to reconnect to the current write replica. Setting the JVM DNS TTL to zero via networkaddress.cache.ttl=0 reduces the likelihood of this problem, but disables DNS caching entirely.
Likeliness: Probable in case of DB failover; only observed on AWS RDS so far.
Recommendation: Upgrade to 2.8.2 if you are using AWS RDS.
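The DNS part of the workaround above refers to the JVM security property; a sketch of where to set it (the java.security path varies by JDK distribution):

```
# JVM security property, e.g. in the JDK's conf/security/java.security file:
# re-resolve hostnames on every lookup so a failed-over RDS endpoint is picked up
networkaddress.cache.ttl=0
```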
A race condition in the domain bootstrap macro and domain manager node initialisation may result in a domain manager overriding the previously stored domain parameters. This can cause the domain manager to be unable to reconnect to the domain after upgrading to version 2.7 or beyond without a hard domain migration to pv=5, as with 2.7, the default protocol version changed to pv=5.
Affected Deployments: Domain Manager nodes
Impact: The domain manager node is unable to connect to the sequencer. Topology transactions such as party additions have no effect on the domain, as they are not verified and forwarded by the domain manager.
Symptom: Parties added to the participant do not appear on the domain. Transactions referring to such parties are rejected with UNKNOWN_INFORMEE. The domain manager logs complain about "The unversioned subscribe endpoints must be used with protocol version 4".
Workaround: If you don't run bootstrap_domain after the domain was already initialised, the issue won't happen. If a node is affected, you can recover it explicitly by setting the protocol version to v=4 in the configuration and using the reset-stored-static-config parameter to correct the wrongly stored parameters.
Likeliness: Race condition which can happen if you run the bootstrap_domain command repeatedly on an already bootstrapped domain during startup of the domain manager node. As there is no need to run the bootstrap_domain command repeatedly, this issue can easily be avoided.
Recommendation: Do not run the bootstrap_domain command after the domain was initialised. A fix has been added to prevent future mistakes.
A bad Ledger API client request results in an error that contains a large metadata resource list and fails serialization in the Netty layer. As a consequence, what is sent to the client is an INTERNAL error that doesn't correspond to the actual problem.
Affected Deployments: Participant nodes
Impact: An incorrect error is returned to the ledger client, which may complicate troubleshooting.
Symptom: The ledger client observes the following error:
INTERNAL: RST_STREAM closed stream. HTTP/2 error code: PROTOCOL_ERROR
At the same time, the participant node reports the following error:
io.grpc.netty.NettyServerHandler - Stream Error
io.netty.handler.codec.http2.Http2Exception$HeaderListSizeException: Header size exceeded max allowed size (8192)
Workaround: When investigating ledger client issues where the client logs contain the following message, rely on the participant logs instead:
INTERNAL: RST_STREAM closed stream. HTTP/2 error code: PROTOCOL_ERROR
Likeliness: It is rather difficult for Ledger API clients to create bad requests that end up in errors with a long list of resources. One example is when a client requests a long list of non-existing templates in the filter of an ACS Ledger API request. There was an unrelated bug in the trigger service that could produce a long list of templates when the AllInDar construction was used in the trigger installation invocation.
Recommendation: Upgrade at your convenience.
The following Canton protocol versions are supported:
Dependency | Version |
---|---|
Canton protocol versions | 3, 4, 5 |
Canton has been tested against the following versions of its dependencies:
Dependency | Version |
---|---|
Java Runtime | OpenJDK 64-Bit Server VM Zulu11.66+15-CA (build 11.0.20+8-LTS, mixed mode) |
Postgres | Recommended: PostgreSQL 12.18 (Debian 12.18-1.pgdg120+2) – Also tested: PostgreSQL 11.16 (Debian 11.16-1.pgdg90+1), PostgreSQL 13.14 (Debian 13.14-1.pgdg120+2), PostgreSQL 14.11 (Debian 14.11-1.pgdg120+2), PostgreSQL 15.6 (Debian 15.6-1.pgdg120+2) |
Oracle | 19.20.0 |
Canton 2.8.1 has been released on January 31, 2024. You can download the Daml Open Source edition from the Daml Connect Github Release Section. The Enterprise edition is available on Artifactory. Please also consult the full documentation of this release.
This is a maintenance release of Canton including reliability fixes and user experience improvements. Users of the Fabric driver are encouraged to upgrade at their convenience.
We have reworked the reference configuration example. The examples/03_advanced_configuration example has been replaced by a set of reference configuration files which can be found in the config directory. While the previous example contained pieces which could be assembled into a full configuration, the new examples contain a full configuration which can be simplified by removing parts that are not needed. The installation guide (https://docs.daml.com/canton/usermanual/installation.html) has been updated accordingly.
The enterprise version now supports replicating a party from one participant node to another, either for migration or to have multiple participants hosting the same party. Please consult the documentation (https://docs.daml.com/2.9.0/canton/usermanual/identity_management.html#replicate-party-to-another-participant-node) on how to use this feature.
We have improved the background journal cleaning to produce less load on the database by using smaller transactions to clean up the journal.
The metrics for the execution services have been removed:
Fabric ledger block processing runs into an unintended shutdown and fails to process blocks when the number of in-memory blocks exceeds 5000. This occurs when catching up after downtime while the Fabric ledger size has increased substantially in the meantime.
Affected Deployments: Fabric Sequencer nodes
Affected Versions: 2.3-2.7 and 2.8.0
Impact: The sequencer stops feeding new blocks.
Symptom: The participant gets stuck in an old state and does not visibly catch up against the Fabric ledger.
Therefore, a domains.reconnect call on the participant may appear as if it is hanging. On the sequencer node, each processed message is logged using Observed Send with a messageId including a timestamp. Once the emission of these log lines stops, the sequencer is stuck.
Workaround: Restart the sequencer node if it is stuck.
Likeliness: Rare; only occurs when catching up to a Fabric ledger that the sequencer node has been an active part of before.
Recommendation: Upgrade to 2.8.1 at your convenience.
The following Canton protocol versions are supported:
Dependency | Version |
---|---|
Canton protocol versions | 3, 4, 5 |
Canton has been tested against the following versions of its dependencies:
Dependency | Version |
---|---|
Java Runtime | OpenJDK 64-Bit Server VM Zulu11.66+15-CA (build 11.0.20+8-LTS, mixed mode) |
Postgres | Recommended: PostgreSQL 12.17 (Debian 12.17-1.pgdg120+1) – Also tested: PostgreSQL 11.16 (Debian 11.16-1.pgdg90+1), PostgreSQL 13.13 (Debian 13.13-1.pgdg120+1), PostgreSQL 14.10 (Debian 14.10-1.pgdg120+1), PostgreSQL 15.5 (Debian 15.5-1.pgdg120+1) |
Oracle | 19.20.0 |
Canton 2.8.0 has been released on December 15, 2023. You can download the Daml Open Source edition from the Daml Connect Github Release Section. The Enterprise edition is available on Artifactory. Please also consult the full documentation of this release.
We are excited to announce Canton 2.8.0, which offers some great additional features:
See below for details.
Please upgrade to this version as soon as possible. Version 2.7 will not be maintained after March 2024.
Explicit contract disclosure allows you to delegate contract read rights to a non-stakeholder using off-ledger data distribution.
The feature is now available in Beta and enabled by default. For more information, see the documentation.
The participant.ledger-api.explicit-disclosure-unsafe feature flag has been replaced by participant.ledger-api.enable-explicit-disclosure, which is set to true by default. To disable the feature, set the new flag to false.
Distributed tracing is a technique for troubleshooting performance issues in a microservices environment like Daml Enterprise. Canton supports distributed tracing, and the support on the Ledger API is improving. In this release, client applications gain the ability to extract trace and span IDs from past transactions and completions, so that distributed traces can continue in follow-up commands.
To learn how to extend your application to support distributed tracing, see the documentation.
Trace contexts are now included in the gRPC messages returned in Ledger API streams and point-wise queries. This change affects the following transaction and command completion service calls:
TransactionService.GetTransactions
TransactionService.GetTransactionTrees
TransactionService.GetTransactionByEventId
TransactionService.GetTransactionById
TransactionService.GetFlatTransactionByEventId
TransactionService.GetFlatTransactionById
CompletionService.CompletionStream
Trace context enables client applications that were not the submitters of the original request to pick up the initial spans and continue them in follow-up requests. This allows multi-step workflows to be adorned with contiguous chains of related spans.
This is a purely additive change.
Use the event query service to obtain a party-specific view of contract events.
The gRPC API provides ledger streams to off-ledger components that maintain a queryable state. This service allows you to make simple event queries without off-ledger components like the JSON API.
Using the event query service, you can create, retrieve, and archive events associated with a contract ID or contract key. The API returns only those events where at least one of the requesting parties is a stakeholder of the contract. If the contract is still active, the archive_event
is unset.
Contract keys can be used by multiple contracts over time. The latest contract events are returned first. To access earlier contract key events, use the continuation_token
returned in the GetEventsByContractKeyResponse
in a subsequent GetEventsByContractKeyRequest
.
If no events match the request criteria or the requested events are not visible to the requesting parties, an empty structure is returned. Events associated with consumed contracts are returned until they are pruned.
Retrieve events associated with a contract ID using a GetEventsByContractId request.
Retrieve events associated with a contract key using a GetEventsByContractKey request.
This is a purely additive change.
Pruning presents a trade-off between limiting ledger storage space and being able to query ledger history far into the past.
Only pruning participant-internal state strikes a balance by deleting exclusively internal state. As a result, applications
can continue to query historic portions of the ledger, but internal-only pruning frees up less storage space than regular pruning.
Previously, internal pruning was available only via the "manual" participant.pruning.prune_internally command. With this release, pruning participant internal-only state also becomes available through automatic pruning.
Enable automatic internal-only pruning via the participant.pruning.set_participant_schedule command's prune_internally_only parameter.
Retrieve the current prune_internally_only setting via the newly introduced participant.pruning.get_participant_schedule command.
This is a purely additive change.
We've cleaned up our transaction processing logging to make it easier to understand what is happening.
Now, significant events on all nodes are logged at INFO level and include relevant information.
Commands submitted to the Ledger API now respect gRPC deadlines:
If a request reaches the command processing layer with an already-expired gRPC deadline, the command will not be sent for submission.
Instead, the request is rejected with a new self-service error code REQUEST_DEADLINE_EXCEEDED
, which informs the client
that the command is guaranteed not to have been sent for execution to the ledger.
If you submit a command that runs for a very long time, the Ledger API will now reject the command with the new self-service error
code INTERPRETATION_TIME_EXCEEDED
when the transaction would reach the ledger time tolerance limit based on the
submission time.
Protocol versions 3 and 4 are now marked as deprecated and will be removed in the next minor release of Canton. Protocol version 5 should be preferred for any new deployment.
The expected KMS wrapper-key configuration value has changed from:
crypto.private-key-store.encryption.wrapper-key-id = { str = "..."}
to a simple string:
crypto.private-key-store.encryption.wrapper-key-id = "..."
Nodes will now start up in parallel. The startup parallelism can be configured by setting:
canton.parameters.startup-parallelism = 1 // defaults to number of threads.
The configuration fields schema-migration-attempt-backoff
and schema-migration-attempts
for the indexer were removed.
The following config lines will have to be removed, if they exist:
participants.participant.parameters.ledger-api-server-parameters.indexer.schema-migration-attempt-backoff
participants.participant.parameters.ledger-api-server-parameters.indexer.schema-migration-attempts
The configuration fields max-event-cache-weight
and max-contract-cache-weight
for the ledger api server were removed.
The following config lines will have to be removed, if they exist:
participants.participant.ledger-api.max-event-cache-weight
participants.participant.ledger-api.max-contract-cache-weight
By default, Canton logs the config values on startup, as this has turned out to be useful for troubleshooting: it helps understand which configuration a given log file relates to.
This feature can be turned off by setting
canton.monitoring.logging.log-config-on-startup = false
Logging the configuration including all default values can be turned on using
canton.monitoring.logging.log-config-with-defaults = true
Note that this will log all available settings, which includes parameters which we do not recommend changing.
Confidential data will not be logged but replaced by xxxx
.
The mediator finalized response cache can now be configured using:
canton.mediators.mediator.caching.finalized-mediator-requests = 1000 // default
which allows you to limit memory consumption in the presence of a transaction with a very large number of views.
The default sizes of the contract state and contract key state caches have been decreased by one order of magnitude, from 100'000 to 10'000.
canton.participants.participant.ledger-api.index-service.max-contract-state-cache-size
canton.participants.participant.ledger-api.index-service.max-contract-key-state-cache-size
The size of these caches determines the likelihood that a transaction using a contract or contract key that was recently created or read will still find it in memory rather than needing to query the database. Larger caches might be of interest in use cases where a big pool of ambient contracts is consistently being fetched or used for non-consuming exercises. It may also benefit use cases where a big pool of contracts is being rotated through a create -> archive -> create-successor cycle.
Consider adjusting these parameters explicitly if the performance of your specific workflow depends on large caches, and you were relying on the defaults thus far.
Beware that increasing these caches will increase the memory consumption of the Ledger API, which in turn might lead to garbage collection pauses if you run the node with insufficient heap memory reserves.
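For example, a configuration sketch restoring the pre-2.8.0 defaults explicitly (weigh the heap-memory caveat above before doing so):

```
canton.participants.participant.ledger-api.index-service {
  max-contract-state-cache-size = 100000      // pre-2.8.0 default
  max-contract-key-state-cache-size = 100000  // pre-2.8.0 default
}
```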
The default maximum number of transactions stored in the in-memory fan-out has been decreased by one order of magnitude from 10'000 to 1'000.
canton.participants.participant.ledger-api.index-service.max-transactions-in-memory-fan-out-buffer-size
The in-memory fan-out allows serving the transaction streams from memory as they are finalized, rather than from the database. You should choose this buffer to be large enough that the likelihood of applications having to stream transactions from the database is low. Generally, a 10-second buffer is sensible: if you expect e.g. a throughput of 20 tx/s, then setting this number to 200 is sensible. The new default setting of 1000 assumes 100 tx/s. Consider adjusting this parameter explicitly if your specific workflow foresees transaction rates larger than 100 tx/s.
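The sizing rule of thumb above can be expressed as a tiny helper; the function name is illustrative, not part of Canton:

```python
def fanout_buffer_size(expected_tx_per_second: float, buffer_seconds: float = 10.0) -> int:
    # Rule of thumb from the release notes: buffer roughly 10 seconds
    # worth of transactions in the in-memory fan-out.
    return int(expected_tx_per_second * buffer_seconds)

print(fanout_buffer_size(20))   # 200 (the example from the text)
print(fanout_buffer_size(100))  # 1000 (the new default assumes 100 tx/s)
```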
The default scope (the scope field in a scope-based token) for authenticating on the Ledger API using JWT is daml_ledger_api.
Other scopes can be configured explicitly using the custom target scope configuration option:
canton.participants.participant.ledger-api.auth-services.0.target-scope="custom/Scope-5:with_special_characters"
Target scope can be any case-sensitive string containing alphanumeric characters, hyphens, slashes, colons and underscores.
Either the target-scope
or target-audience
parameter can be configured, but not both.
The expert mode sql batching parameter
canton.participants.participant.parameters.stores.max-items-in-sql-clause
has been moved to:
canton.participants.participant.parameters.batching.max-items-in-sql-clause
Generally, we recommend not changing this parameter unless advised by support.
The sizes of the connection pools used for interactions with database storage inside Canton nodes are determined using a dedicated formula described in the documentation article on max connection settings.
The values obtained from that formula can now be overridden using explicit configuration settings for the read, write and ledger-api connection pool sizes:
canton.participants.participant.storage.parameters.connection-allocation.num-reads
canton.participants.participant.storage.parameters.connection-allocation.num-writes
canton.participants.participant.storage.parameters.connection-allocation.num-ledger-api
Similar parameters are available for other Canton node types:
canton.sequencers.sequencer.storage.parameters.connection-allocation...
canton.mediators.mediator.storage.parameters.connection-allocation...
canton.domain-managers.domain_manager.storage.parameters.connection-allocation...
The effective connection pool sizes are reported by the Canton nodes at start-up:
INFO c.d.c.r.DbStorageMulti$:participant=participant_b - Creating storage, num-reads: 5, num-writes: 4
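A configuration sketch overriding the formula-derived pool sizes; the read and write values mirror the log line above, while the Ledger API value is purely illustrative:

```
canton.participants.participant.storage.parameters.connection-allocation {
  num-reads = 5       // matches the num-reads reported at start-up above
  num-writes = 4      // matches the num-writes reported at start-up above
  num-ledger-api = 2  // illustrative value
}
```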
Previously, the Canton console used a hard-coded "CantonConsole" as an applicationId in the command submissions and the completion subscriptions performed against the Ledger API.
Now, if an access token is provided to the console, it will extract the userId from that token and use it instead. A local console will use the adminToken provided in canton.participants.<participant>.ledger-api.admin-token, whereas a remote console will use the token from canton.remote-participants.<remoteParticipant>.token.
This affects the following console commands:
ledger_api.commands.submit
ledger_api.commands.submit_flat
ledger_api.commands.submit_async
ledger_api.completions.list
ledger_api.completions.list_with_checkpoint
ledger_api.completions.subscribe
You can also override the applicationId by supplying it explicitly to these commands.
The following console commands were added to support actions with java codegen compatible data:
participant.ledger_api.javaapi.commands.submit
participant.ledger_api.javaapi.commands.submit_flat
participant.ledger_api.javaapi.commands.submit_async
participant.ledger_api.javaapi.transactions.trees
participant.ledger_api.javaapi.transactions.flat
participant.ledger_api.javaapi.acs.await
participant.ledger_api.javaapi.acs.filter
participant.ledger_api.javaapi.event_query.by_contract_id
participant.ledger_api.javaapi.event_query.by_contract_key
The following commands were replaced by their java bindings compatible equivalent (in parentheses):
participant.ledger_api.acs.await (participant.ledger_api.javaapi.acs.await)
participant.ledger_api.acs.filter (participant.ledger_api.javaapi.acs.filter)
Please note that the Scala codegen and bindings are not an officially supported feature and will be removed in a future release. These commands are provided for your convenience but are subject to change.
ledger_api.transactions.flat_with_tx_filter and ledger_api.javaapi.transactions.flat_with_tx_filter are more sophisticated alternatives to ledger_api.transactions.flat and ledger_api.javaapi.transactions.flat, respectively, that allow you to specify a full transaction filter instead of a set of parties. Consider using these if you need more fine-grained filters that include template IDs, interface IDs, and/or whether you want to retrieve create event blobs for explicit disclosure.
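As a hedged sketch of what such a filter might look like, a filter restricted to a single template for one party could be built from the Ledger API protobuf bindings. Class names and the exact command signature are assumptions and may differ by version:

```scala
import com.daml.ledger.api.v1.transaction_filter.{Filters, InclusiveFilters, TransactionFilter}
import com.daml.ledger.api.v1.value.Identifier

// Hypothetical: only deliver events for MyModule:MyTemplate visible to `party`.
// `party` and `packageId` are assumed to be defined already.
val filter = TransactionFilter(
  filtersByParty = Map(
    party -> Filters(
      Some(InclusiveFilters(templateIds = Seq(Identifier(packageId, "MyModule", "MyTemplate"))))
    )
  )
)
participant.ledger_api.transactions.flat_with_tx_filter(filter, completeAfter = 10)
```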
Console commands for ACS migration can now be used with remote nodes. This change applies to the commands in the repair
namespace.
The new ACS export / import commands, repair.export_acs and repair.import_acs, provide similar functionality to the existing repair.download and repair.upload commands. However, their implementation allows them to evolve better over time.
Consequently, the existing download / upload functionality has been deprecated in 2.8.0 and is going to be removed in a future release.
Contracts added via the repair.party_migration.step2_import_acs and repair.import_acs commands now include a workflow ID. The ID has the form prefix-${n}-${m}, where m is the number of transactions generated as part of the import process and n is a sequential number from 1 to m inclusive. Each transaction contains one or more contracts that share the ledger time of their creation. The two numbers allow you to track whether an import is still being processed. You can specify a prefix with the workflow_id_prefix string parameter defined on both commands. If not specified, the prefix defaults to import-${randomly-generated-unique-identifier}.
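For example, a hedged console sketch (the snapshot file name and prefix are placeholders, and the positional argument is an assumption about the command signature):

```scala
// Hypothetical: import an ACS snapshot, tagging the generated transactions
// with workflow IDs of the form "migration-2024-${n}-${m}".
participant.repair.import_acs(
  "acs_snapshot.gz",                    // placeholder snapshot file
  workflow_id_prefix = "migration-2024" // parameter name as given in the text
)
```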
The console command keys.secret.rotate_node_key can now accept a name for the newly generated key.
The owner_to_key_mappings.rotate_key command now expects a node reference: the previous variant is deprecated, and the new one takes a node reference (InstanceReferenceCommon) to avoid dangerous and/or unwanted key rotations.
To improve consistency and code safety, some testing console commands now expect an optional domain alias (rather than a plain domain alias). For example, the following call
participant.testing.event_search("da")
needs to be rewritten to
participant.testing.event_search(Some("da"))
The impacted console commands are:
participant.testing.event_search
participant.testing.transaction_search
DAR vetting and unvetting convenience commands have been added:
PackageService.VetDar and PackageService.UnvetDar
participant.dars.vetting.enable and participant.dars.vetting.disable
Additionally, two error codes have been introduced to allow better error reporting to the client when working with DAR vetting/unvetting: DAR_NOT_FOUND and PACKAGE_MISSING_DEPENDENCIES.
Please note that these commands are alpha only and subject to change.
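As an illustration of the new console commands (alpha, subject to change): the DAR path below is a placeholder, and it is an assumption that dars.upload returns the DAR hash accepted by the vetting commands:

```scala
// Upload a DAR, then unvet and re-vet it. These calls may fail with
// DAR_NOT_FOUND or PACKAGE_MISSING_DEPENDENCIES.
val darHash = participant.dars.upload("my-package.dar") // placeholder DAR
participant.dars.vetting.disable(darHash)
participant.dars.vetting.enable(darHash)
```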
SequencerConnection.addConnection is deprecated. Use SequencerConnection.addEndpoints instead.
The repair.download and repair.upload console commands are deprecated. Use the repair.export_acs and repair.import_acs commands instead.
lookup_active_contracts has been removed in favor of lookup_created_contracts and lookup_archived_contracts. This reflects a change in how active contracts are looked up in the DB: switching from a single batched active-contract query to two batched queries, executed in parallel, targeting archived and created events.
load has been removed without replacement, as it was not measuring anything meaningful.
The error code SEQUENCER_DELIVER_ERROR that could be received when submitting a transaction has been superseded by two separate new error codes: SEQUENCER_SUBMISSION_REQUEST_MALFORMED and SEQUENCER_SUBMISSION_REQUEST_REFUSED. Please migrate client application code if you rely on the older error code.
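A hedged sketch of what such a client-side migration might look like, assuming the application dispatches on the error code string extracted from the gRPC status (the helper name and string matching are illustrative only):

```scala
// Handle both successor codes wherever SEQUENCER_DELIVER_ERROR was handled before.
def isSequencerRejection(errorCode: String): Boolean =
  errorCode match {
    case "SEQUENCER_SUBMISSION_REQUEST_MALFORMED" => true // replaces SEQUENCER_DELIVER_ERROR
    case "SEQUENCER_SUBMISSION_REQUEST_REFUSED"   => true // replaces SEQUENCER_DELIVER_ERROR
    case _                                        => false
  }
```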
We have reverted the change from 2.7.0 that introduced the distribution of a separate bcprov-jdk15on-1.70.jar alongside the Canton jar in the lib folder. This revert was also enacted in version 2.7.1.
If you use the Oracle JRE, beware that you will now need to explicitly add the Bouncy Castle library to the classpath when running Canton.
Some commands that the participant rejected locally (e.g., due to command deduplication errors) produced a rejection completion only after the participant observed some other traffic from the domain. Such commands now produce the rejection completion immediately.
On restart of a sync domain, the participant replays pending transactions, updating the stores in case some writes were not persisted. Within the command deduplication store, existing records are compared with the records to be written, as an internal consistency check. This comparison includes the trace context, which differs on a restart and can therefore cause the check to fail, aborting the startup with an IllegalArgumentException.
Participant
An affected participant cannot reconnect to the given domain, which means that transaction processing is blocked.
Upon restart, the participant refuses to reconnect to the domain, writing the following log message: ERROR c.d.c.p.s.d.DbCommandDeduplicationStore ... - An internal error has occurred. java.lang.IllegalArgumentException: Cannot update command deduplication data for ChangeIdHash
No workaround exists. You need to upgrade to a version not affected by this issue.
The error happens if a participant crashes with a particular state not yet written to the database. The bug has been present since end of Nov 21 and has never been observed before, not even during testing.
Upgrade during your next maintenance window to a patch version not affected by this issue.
The rotate_keys command is dangerous and does not work when there are multiple sequencers. We allow the replacement of key A with key B, but we cannot guarantee that the node using key A will actually have access to key B.
Furthermore, when attempting to rotate the keys of a sequencer using the rotate_node_keys(domainManagerRef)
method,
it will fail if we have more than one sequencer in our environment.
This occurs because they share a unique identifier (UID), and as a result, this console command not only rotates the
keys of the sequencer it is called on but also affects the keys of the other sequencers.
Modified the process of finding keys for rotation in the rotate_node_keys(domainManagerRef)
function to prevent conflicts among multiple sequencers that share the same UID.
Additionally, we have updated the console command owner_to_key_mappings.rotate_key
to
expect a node reference (InstanceReferenceCommon
), thereby ensuring that both the current
and new keys are associated with the correct node.
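A hedged console sketch of the new variant follows; node names, key names, and the exact argument order are assumptions rather than the verified signature:

```scala
// Hypothetical: rotate a signing key for participant1, passing the node
// reference so the command can verify both keys belong to this node.
// `currentKey` is assumed to be the public key being replaced.
val newKey = participant1.keys.secret.generate_signing_key("rotated-signing-key")
participant1.topology.owner_to_key_mappings.rotate_key(
  participant1, // node reference (InstanceReferenceCommon)
  currentKey,
  newKey
)
```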
All nodes (but mostly sequencer)
A node can mistakenly rotate keys that do not pertain to it. Using rotate_node_keys(domainManagerRef)
to rotate a sequencer's keys when other sequencers are present will also fail and break Canton.
When you try to rotate a sequencer's key, it catastrophically fails with a java.lang.RuntimeException: KeyNotAvailable.
For rotate_node_keys(domainManagerRef)
, we have ensured that we filter the correct keys for rotation by checking both the authorized store and the local private key store.
Additionally, we have deprecated the existing owner_to_key_mappings.rotate_key
and introduced a new method that requires the user to provide the node instance for which they intend to apply the key rotation. We have also implemented a validation check within this function to ensure that the current and new keys are associated with this node.
Every use of rotate_node_keys
to rotate the keys of sequencer(s) in a multiple sequencer environment.
Upgrade the Canton console that you use to administrate the domain, in particular the sequencer and mediator, to a Canton version with the bug fix.
The rotate_node_keys command does not work when used to rotate the keys of sequencer(s) and mediator(s). Canton has a series of console commands to rotate keys, in particular rotate_node_keys, which is used to rotate the keys of a node.
When the console macro rotate_node_keys is used to rotate the keys of a sequencer or mediator, the keys are actually not rotated, because we are (1) looking at the wrong topology management store and (2) not using the associated domainManager to actually rotate the keys.
Mediator and Sequencer nodes.
The rotation of the keys for a sequencer or mediator will not succeed.
Select the correct topology management Active store (the one from the domainManager) to look for keys to rotate, and call the rotate_key console command with the domainManager reference.
Every time rotate_node_keys is used to rotate the keys of sequencer(s) or mediator(s).
Upgrade the Canton console that you use to administrate the domain, in particular the sequencer and mediator, to a Canton version with the bug fix.
The Canton protocol includes in ViewParticipantData the contracts that are required to re-interpret command evaluation. These inputs are known as coreInputs. Included as part of this input contract data is contract metadata that includes details of signatories and stakeholders.
Where the evaluation of a command resolves a contract key to a contract identifier, but that identifier is not in turn resolved to a contract instance, the metadata distributed with the associated contract will incorrectly have the key maintainers as both the signatories and the stakeholders. One way to trigger this is to execute a choice on a contract other than the keyed contract that only issues a lookupByKey on the keyed contract.
This problem is observed in canton protocol version 4 and fixed in version 5.
The impact of this bug is that invalid contract metadata is distributed with transaction views.
A workaround is to ensure that whenever a contract key is resolved to a contract identifier, that identifier is always resolved to a contract (even if not needed). For example, following the lookupByKey, in the case where a contract identifier is returned, issue a fetch command on this identifier and discard the result.
Unlikely. Most of the time a contract key is resolved to a contract so that some action can be performed on that contract; in this situation the metadata is correct. The only situation in which this has been observed is in test scenarios.
The recommendation is that no action is required. The problem will be naturally fixed in protocol version 5 and the workaround above can be used in the unlikely case it is observed.
Transaction view decomposition is the process of taking a transaction and generating a view hierarchy whereby each view has a common set of informees. Each view additionally has a rollback scope. A child view having a rollback scope that differs from that of its parent indicates that any changes to the liveness of contracts within the child view should be disregarded upon exiting the child view. For example, it would be valid for a contract to be consumed in a rolled-back child view and then consumed again in a subsequent child view.
As the activeness of contracts is preserved across views with the same rollback scope, every source transaction rollback node should be allocated a unique scope. In certain circumstances this is not happening, resulting in contract activeness inconsistency.
This problem is observed in canton protocol version 4 and fixed in version 5.
The impact of this bug is that an inconsistent transaction view hierarchy can be generated from a consistent transaction. This in turn can result in a valid transaction being rejected. The symptom of this would be the mediator rejecting a valid transaction request on the basis of inconsistency.
This bug only affects transactions that contain rollbacks; if the use of rollbacks can be avoided, this bug will not occur.
To encounter this bug requires a transaction that has multiple rolled-back nodes in which overlapping contracts and/or keys are used. For this reason the likelihood of encountering the bug is low.
If this bug were to affect a client development, then the best course of action would be to finalize and release Canton protocol 5.
In some cases, participant state and mediator domain state topology transactions were silently ignored when they were sent as part of a cascading topology update (which means they were sent together with a namespace certificate). As a result, the nodes had a different view on the topology state and not all Daml transactions could be run.
All nodes
A participant node might consider another participant node as inactive and therefore refuse to send transactions or invalidate transactions.
A Daml transaction might be rejected with UNKNOWN_INFORMEES
.
Flush the topology state by running domain.participants.set_state(pid, Submission, Vip) (and back to Ordinary). This runs the update through the "incremental" update code path, which behaves correctly and fixes the topology state of the broken node.
The bug is deterministic and can occur when using permissioned domains, when the participant state is received together with the namespace delegation of the domain but without the namespace delegation of the participant.
Upgrade to this version if you intend to use permissioned domains. If you need to fix a broken system, then upgrade to a version fixing the issue and apply the workaround to "flush" the topology state.
After an Indexer restart in the Ledger API or any error causing the client transaction streams to fail, the PingService stops working.
The participant node
When the Ledger API encounters an error that leads to cancelling the client connections while the participant node does not become passive, the PingService cannot continue processing commands.
Ping commands issued in the PingService are timing out.
Additionally, the participant might appear unhealthy if configured to report health
by using the PingService (i.e. configured with monitoring.health.check.type = ping
).
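The ping-based health check mentioned above is enabled via configuration. A minimal sketch based on the setting referenced in the text; the port and participant fields are assumptions about the surrounding configuration shape:

```
canton.monitoring.health {
  server.port = 7000        // illustrative port for the health endpoint
  check.type = ping         // report health by pinging through the PingService
  check.participant = participant1 // hypothetical: which participant to ping
}
```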
Restart the participant node.
This bug occurs consistently when there is an error in the Ledger API, such as a DB overloaded issue that causes the Ledger API Indexer to restart. For this bug to occur, the participant node must not transition to passive state. If it transitions to passive and then back to active, the bug should not reproduce.
If the system is subject to frequent transient errors in the Ledger API (e.g. a flaky Index database) or consistently high load, update to this version to avoid encountering this issue.
The participant may shut down ungracefully if there are CommandService submissions still completing or, in extreme cases, the Ledger API can restart during normal operations.
The participant node
No significant operational impact.
IllegalStateException("Promise already completed.") is logged.
Not applicable, as the effect is mostly aesthetic.
This ungraceful shutdown is only likely under heavy usage of the CommandService concurrently with participant shutdown. The likelihood of this bug triggering a Ledger API restart is very small, as multiple conditions need to be met.
Upgrade when convenient.
The CommandService errors out when confronted with an expired gRPC request deadline.
The participant node
If encountered repeatedly (up to the maximum-in-flight configuration limit for the CommandService), the CommandService can appear saturated and reject new commands.
When a command request is submitted via CommandService.submitAndWait or its variants while providing a gRPC request deadline, the request can fail with an INTERNAL error reported to the client, and a log message at ERROR level is logged on the participant.
Restart the participant and use a higher gRPC request deadline.
This bug is likely to happen if the gRPC request deadline is small enough for it to expire upon arriving at the participant's Ledger API.
Upgrade when convenient.
A Ledger API restart following a pruning event can lead to a non-functioning PingService
The participant node
The PingService is not functional after this error is encountered.
PARTICIPANT_PRUNED_DATA_ACCESSED errors are logged every second.
Restart the participant node.
This bug is likely when in the same participant active session, a pruning request is followed by Ledger API restart (caused by transient errors such as a DB connectivity). For this bug to occur, the participant node must not transition to passive state. If it transitions to passive and then back to active, the bug should not reproduce.
Upgrade when convenient.
The following Canton protocol versions are supported:
Dependency | Version |
---|---|
Canton protocol versions | 3, 4, 5 |
Canton has been tested against the following versions of its dependencies:
Dependency | Version |
---|---|
Java Runtime | OpenJDK 64-Bit Server VM Zulu11.66+15-CA (build 11.0.20+8-LTS, mixed mode) |
Postgres | Recommended: PostgreSQL 12.15 (Debian 12.15-1.pgdg120+1) – Also tested: PostgreSQL 11.16 (Debian 11.16-1.pgdg90+1), PostgreSQL 13.12 (Debian 13.12-1.pgdg120+1), PostgreSQL 14.8 (Debian 14.8-1.pgdg120+1), PostgreSQL 15.3 (Debian 15.3-1.pgdg120+1) |
Oracle | 19.20.0 |
Note: We no longer test with Postgres 10 as its support has been discontinued.
We are currently working on