Emqx Versions Save

The most scalable open-source MQTT broker for IoT, IIoT, and connected vehicles

v5.6.1

1 month ago

Bug Fixes

  • #12759 EMQX now automatically removes invalid backup files that fail during upload due to schema validation errors. This fix ensures that only valid configuration files are displayed and stored, enhancing system reliability.

  • #12766 Renamed message_queue_too_long error reason to mailbox_overflow

    mailbox_overflow. The latter is consistent with the corresponding config parameter: force_shutdown.max_mailbox_size.

  • #12773 Upgraded HTTP client libraries.

    The HTTP client library (gun-1.3) incorrectly appended a :portnumber suffix to the Host header for standard ports (http on port 80, https on port 443). This could cause compatibility issues with servers or gateways performing strict Host header checks (e.g., AWS Lambda, Alibaba Cloud HTTP gateways), leading to errors such as InvalidCustomDomain.NotFound or "The specified CustomDomain does not exist."

  • #12802 Improved how EMQX handles node removal from clusters via the emqx ctl cluster leave command. Previously, nodes could unintentionally rejoin the same cluster (unless it was stopped) if the configured cluster discovery_strategy was not manual. With the latest update, executing the cluster leave command now automatically disables cluster discovery for the node, preventing it from rejoining. To re-enable cluster discovery, use the emqx ctl discovery enable command or simply restart the node.

  • #12814 Improved error handling for the /clients/{clientid}/mqueue_messages and /clients/{clientid}/inflight_messages APIs in EMQX. These updates address:

    • Internal Timeout: If EMQX fails to retrieve the list of Inflight or Mqueue messages within the default 5-second timeout, likely under heavy system load, the API will return 500 error with the response {"code":"INTERNAL_ERROR","message":"timeout"}, and log additional details for troubleshooting.
    • Client Shutdown: Should the client connection be terminated during an API call, the API now returns a 404 error, with the response {"code": "CLIENT_SHUTDOWN", "message": "Client connection has been shutdown"}. This ensures clearer feedback when client connections are interrupted.
  • #12824 Updated the statistics metrics subscribers.count and subscribers.max to include shared subscribers. Previously, these metrics accounted only for non-shared subscribers.

  • #12826 Fixed issues related to the import functionality of source data integrations and retained messages in EMQX. Before this update:

    • The data integration sources specified in backup files were not being imported. This included configurations under the sources.mqtt category with specific connectors and parameters such as QoS and topics.
    • Importing the mnesia table for retained messages was not supported.
  • #12843 Fixed cluster_rpc_commit transaction ID cleanup procedure on replicator nodes after executing the emqx ctl cluster leave command. Previously, failing to properly clear these transaction IDs impeded configuration updates on the core node.

  • #12885 Fixed an issue in EMQX where users were unable to view "Retained Messages" under the "Monitoring" menu in the Dashboard.

    The "Retained messages" backend API uses the qlc library. This problem was due to a permission issue where the qlc library's file_sorter function tried to use a non-writable directory, /opt/emqx, to store temporary files, resulting from recent changes in directory ownership permissions in Docker deployments.

    This update modifies the ownership settings of the /opt/emqx directory to emqx:emqx, ensuring that all necessary operations, including retained messages retrieval, can proceed without access errors.

e5.6.1

1 month ago

Bug Fixes

  • #12759 EMQX now automatically removes invalid backup files that fail during upload due to schema validation errors. This fix ensures that only valid configuration files are displayed and stored, enhancing system reliability.

  • #12766 Renamed message_queue_too_long error reason to mailbox_overflow

    mailbox_overflow. The latter is consistent with the corresponding config parameter: force_shutdown.max_mailbox_size.

  • #12773 Upgraded HTTP client libraries.

    The HTTP client library (gun-1.3) incorrectly appended a :portnumber suffix to the Host header for standard ports (http on port 80, https on port 443). This could cause compatibility issues with servers or gateways performing strict Host header checks (e.g., AWS Lambda, Alibaba Cloud HTTP gateways), leading to errors such as InvalidCustomDomain.NotFound or "The specified CustomDomain does not exist."

  • #12802 Improved how EMQX handles node removal from clusters via the emqx ctl cluster leave command. Previously, nodes could unintentionally rejoin the same cluster (unless it was stopped) if the configured cluster discovery_strategy was not manual. With the latest update, executing the cluster leave command now automatically disables cluster discovery for the node, preventing it from rejoining. To re-enable cluster discovery, use the emqx ctl discovery enable command or simply restart the node.

  • #12814 Improved error handling for the /clients/{clientid}/mqueue_messages and /clients/{clientid}/inflight_messages APIs in EMQX. These updates address:

    • Internal Timeout: If EMQX fails to retrieve the list of Inflight or Mqueue messages within the default 5-second timeout, likely under heavy system load, the API will return 500 error with the response {"code":"INTERNAL_ERROR","message":"timeout"}, and log additional details for troubleshooting.
    • Client Shutdown: Should the client connection be terminated during an API call, the API now returns a 404 error, with the response {"code": "CLIENT_SHUTDOWN", "message": "Client connection has been shutdown"}. This ensures clearer feedback when client connections are interrupted.
  • #12824 Updated the statistics metrics subscribers.count and subscribers.max to include shared subscribers. Previously, these metrics accounted only for non-shared subscribers.

  • #12826 Fixed issues related to the import functionality of source data integrations and retained messages in EMQX. Before this update:

    • The data integration sources specified in backup files were not being imported. This included configurations under the sources.mqtt category with specific connectors and parameters such as QoS and topics.
    • Importing the mnesia table for retained messages was not supported.
  • #12843 Fixed cluster_rpc_commit transaction ID cleanup procedure on replicator nodes after executing the emqx ctl cluster leave command. Previously, failing to properly clear these transaction IDs impeded configuration updates on the core node.

  • #12882 Fixed an issue with the RocketMQ action in EMQX data integration, ensuring that messages are correctly routed to their configured topics. Previously, when multiple actions shared a single RocketMQ connector, all messages were mistakenly sent to the topic configured for the first action. This fix involves starting a distinct set of RocketMQ workers for each topic, preventing cross-topic message delivery errors.

  • #12885 Fixed an issue in EMQX where users were unable to view "Retained Messages" under the "Monitoring" menu in the Dashboard.

    The "Retained messages" backend API uses the qlc library. This problem was due to a permission issue where the qlc library's file_sorter function tried to use a non-writable directory, /opt/emqx, to store temporary files, resulting from recent changes in directory ownership permissions in Docker deployments.

    This update modifies the ownership settings of the /opt/emqx directory to emqx:emqx, ensuring that all necessary operations, including retained messages retrieval, can proceed without access errors.

v5.6.0

1 month ago

Enhancements

  • #12251 Optimized the performance of the RocksDB-based persistent sessions, achieving a reduction in RAM usage and database request frequency. Key improvements include:

    • Introduced dirty session state to avoid frequent mria transactions.
    • Introduced an intermediate buffer for the persistent messages.
    • Used separate tracks of PacketIds for QoS1 and QoS2 messages.
    • Limited the number of continuous ranges of inflight messages to 1 per stream.
  • #12326 Enhanced session tracking with registration history. EMQX now has the capability to monitor the history of session registrations, including those that have expired. By configuring broker.session_history_retain, EMQX retains records of expired sessions for a specified duration.

    • Session count API: Use the API GET /api/v5/sessions_count?since=1705682238 to obtain a count of sessions across the cluster that remained active since the given UNIX epoch timestamp (with seconds precision). This enhancement aids in analyzing session activity over time.

    • Metrics expansion with cluster sessions gauge: A new gauge metric, cluster_sessions, is added to better track the number of sessions within the cluster. This metric is also integrated into Prometheus for easy monitoring:

      # TYPE emqx_cluster_sessions_count gauge
      emqx_cluster_sessions_count 1234
      

      NOTE: Please consider this metric as an approximate estimation. Due to the asynchronous nature of data collection and calculation, exact precision may vary.

  • #12338 Introduced a time-based garbage collection mechanism to the RocksDB-based persistent session backend. This feature ensures more efficient management of stored messages, optimizing storage utilization and system performance by automatically purging outdated messages.

  • #12398 Exposed the swagger_support option in the Dashboard configuration, allowing for the enabling or disabling of the Swagger API documentation.

  • #12467 Started supporting cluster discovery using AAAA DNS record type.

  • #12483 Renamed emqx ctl conf cluster_sync tnxid ID to emqx ctl conf cluster_sync inspect ID.

    For backward compatibility, tnxid is kept, but considered deprecated and will be removed in 5.7.

  • #12499 Enhanced client banning capabilities with extended rules, including:

    • Matching clientid against a specified regular expression.
    • Matching client's username against a specified regular expression.
    • Matching client's peer address against a CIDR range.

    Important Notice: Implementing a large number of broad matching rules (not specific to an individual clientid, username, or host) may affect system performance. It's advised to use these extended ban rules judiciously to maintain optimal system efficiency.

  • #12509 Implemented API to re-order all authenticators / authorization sources.

  • #12517 Configuration files have been upgraded to accommodate multi-line string values, preserving indentation for enhanced readability and maintainability. This improvement utilizes """~ and ~""" markers to quote indented lines, offering a structured and clear way to define complex configurations. For example:

    rule_xlu4 {
      sql = """~
        SELECT
          *
        FROM
          "t/#"
      ~"""
    }
    

    See HOCON 0.42.0 release notes for details.

  • #12520 Implemented log throttling. The feature reduces the volume of logged events that could potentially flood the system by dropping all but the first occurance of an event within a configured time window. Log throttling is applied to the following log events that are critical yet prone to repetition:

    • authentication_failure
    • authorization_permission_denied
    • cannot_publish_to_topic_due_to_not_authorized
    • cannot_publish_to_topic_due_to_quota_exceeded
    • connection_rejected_due_to_license_limit_reached
    • dropped_msg_due_to_mqueue_is_full
  • #12561 Implemented HTTP APIs to get the list of client's in-flight and message queue (mqueue) messages. These APIs facilitate detailed insights and effective control over message queues and in-flight messaging, ensuring efficient message handling and monitoring.

    To get the first chunk of data:

    • GET /clients/{clientid}/mqueue_messages?limit=100
    • GET /clients/{clientid}/inflight_messages?limit=100

    Alternatively, for the first chunks without specifying a start position:

    • GET /clients/{clientid}/mqueue_messages?limit=100&position=none
    • GET /clients/{clientid}/inflight_messages?limit=100&position=none

    To get the next chunk of data:

    • GET /clients/{clientid}/mqueue_messages?limit=100&position={position}
    • GET /clients/{clientid}/inflight_messages?limit=100&position={position}

    Where {position} is a value (opaque string token) of meta.position field from the previous response.

    Ordering and Prioritization:

    • Mqueue Messages: These are prioritized and sequenced based on their queue order (FIFO), from higher to lower priority. By default, mqueue messages carry a uniform priority level of 0.
    • In-Flight Messages: Sequenced by the timestamp of their insertion into the in-flight storage, from oldest to newest.
  • #12590 Removed mfa meta data from log messages to improve clarity.

  • #12641 Improved text log formatter fields order. The new fields order is as follows:

    tag > clientid > msg > peername > username > topic > [other fields]

  • #12670 Added field shared_subscriptions to endpoint /monitor_current and /monitor_current/nodes/:node.

  • #12679 Upgraded docker image base from Debian 11 to Debian 12.

  • #12700 Started supporting "b" and "B" unit in bytesize hocon fields. For example, all three fields below will have the value of 1024 bytes:

    bytesize_field = "1024b"
    bytesize_field2 = "1024B"
    bytesize_field2 = 1024
    
  • #12719 The /clients API has been upgraded to accommodate queries for multiple clientids and usernames simultaneously, offering a more flexible and powerful tool for monitoring client connections. Additionally, this update introduces the capability to customize which client information fields are included in the API response, optimizing for specific monitoring needs.

    Examples of Multi-Client/Username Queries:

    • To query multiple clients by ID: /clients?clientid=client1&clientid=client2
    • To query multiple users: /clients?username=user11&username=user2
    • To combine multiple client IDs and usernames in one query: /clients?clientid=client1&clientid=client2&username=user1&username=user2

    Examples of Selecting Fields for the Response:

    • To include all fields in the response: /clients?fields=all (Note: Omitting the fields parameter defaults to returning all fields.)
    • To specify only certain fields: /clients?fields=clientid,username
  • #12381 Added new SQL functions: map_keys(), map_values(), map_to_entries(), join_to_string(), join_to_string(), join_to_sql_values_string(), is_null_var(), is_not_null_var().

    For more information on the functions and their usage, refer to Built-in SQL Functions the documentation.

  • #12336 Performance enhancement. Created a dedicated async task handler pool to handle client session cleanup tasks.

  • #12725 Implemented REST API to list the available source types.

  • #12746 Added username log field. If MQTT client is connected with a non-empty username the logs and traces will include username field.

  • #12785 Added timestamp_format configuration option to log handlers. This new option allows for the following settings:

    • auto: Automatically determines the timestamp format based on the log formatter being used. Utilizes rfc3339 format for text formatters, and epoch format for JSON formatters.

    • epoch: Represents timestamps in microseconds precision Unix epoch format.

    • rfc3339: Uses RFC3339 compliant format for date-time strings. For example, 2024-03-26T11:52:19.777087+00:00.

Bug Fixes

  • #11868 Fixed a bug where will messages were not published after session takeover.

  • #12347 Implemented an update to ensure that messages processed by the Rule SQL for the MQTT egress data bridge are always rendered as valid, even in scenarios where the data is incomplete or lacks certain placeholders defined in the bridge configuration. This adjustment prevents messages from being incorrectly deemed invalid and subsequently discarded by the MQTT egress data bridge, as was the case previously.

    When variables in payload and topic templates are undefined, they are now rendered as empty strings instead of the literal undefined string.

  • #12472 Fixed an issue where certain read operations on /api/v5/actions/ and /api/v5/sources/ endpoints might result in a 500 error code during the process of rolling upgrades.

  • #12492 EMQX now returns the Receive-Maximum property in the CONNACK message for MQTT v5 clients, aligning with protocol expectations. This implementation considers the minimum value of the client's Receive-Maximum setting and the server's max_inflight configuration as the limit for the number of inflight (unacknowledged) messages permitted. Previously, the determined value was not sent back to the client in the CONNACK message.

  • #12500 The GET /clients and GET /client/:clientid HTTP APIs have been updated to include disconnected persistent sessions in their responses.

    NOTE: A current known issue with these enhanced API responses is that the total client count provided may exceed the actual number of clients due to the inclusion of disconnected sessions.

  • #12513 Changed the level of several flooding log events from warning to info.

  • #12530 Improved the error reporting for frame_too_large events and malformed CONNECT packet parsing failures. These updates now provide additional information, aiding in the troubleshooting process.

  • #12541 Introduced a new configuration validation step for autocluster by DNS records to ensure compatibility between node.name and cluster.discover_strategy. Specifically, when utilizing the dns strategy with either a or aaaa record types, it is mandatory for all nodes to use a (static) IP address as the host name.

  • #12562 Added a new configuration root: durable_storage. This configuration tree contains the settings related to the new persistent session feature.

  • #12566 Enhanced the bootstrap file for REST API keys:

    • Empty lines within the file are now skipped, eliminating the previous behavior of generating an error.

    • API keys specified in the bootstrap file are assigned the highest precedence. In cases where a new key from the bootstrap file conflicts with an existing key, the older key will be automatically removed to ensure that the bootstrap keys take effect without issue.

  • #12646 Fixed an issue with the rule engine's date-time string parser. Previously, time zone adjustments were only effective for date-time strings specified with second-level precision.

  • #12652 Fixed a discrepancy where the subbits functions with 4 and 5 parameters, despite being documented, were missing from the actual implementation. These functions have now been added.

  • #12663 Fixed an issue where the emqx_vm_cpu_use and emqx_vm_cpu_idle metrics, accessible via the Prometheus endpoint /prometheus/stats, were inaccurately reflecting the average CPU usage since the operating system boot. This fix ensures that these metrics now accurately represent the current CPU usage and idle, providing more relevant and timely data for monitoring purposes.

  • #12668 Refactored the SQL function date_to_unix_ts() by using calendar:datetime_to_gregorian_seconds/1. This change also added validation for the input date format.

  • #12672 Changed the process for generating the node boot configuration by incorporating the loading of {data_dir}/configs/cluster.hocon. Previously, changes to logging configurations made via the Dashboard and saved in {data_dir}/configs/cluster.hocon were only applied after the initial boot configuration was generated using etc/emqx.conf, leading to potential loss of some log segment files due to late reconfiguration.

    Now, both {data_dir}/configs/cluster.hocon and etc/emqx.conf are loaded concurrently, with settings from emqx.conf taking precedence, to create the boot configuration.

  • #12696 Fixed an issue where attempting to reconnect an action or source could lead to wrong error messages being returned in the HTTP API.

  • #12714 Fixed inaccuracies in several metrics reported by the /prometheus/stats endpoint of the Prometheus API. The correction applies to the following metrics:

    • emqx_cluster_sessions_count
    • emqx_cluster_sessions_max
    • emqx_cluster_nodes_running
    • emqx_cluster_nodes_stopped
    • emqx_subscriptions_shared_count
    • emqx_subscriptions_shared_max

    Additionally, this fix rectified an issue within the /stats endpoint concerning the subscriptions.shared.count and subscriptions.shared.max fields. Previously, these values failed to update promptly following a client's disconnection or unsubscription from a Shared-Subscription.

  • #12715 Fixed a crash that could occur during configuration updates if the connector for the ingress data integration source had active channels.

  • #12740 Fixed an issue when durable sessions could not be kicked out.

  • #12768 Addressed a startup failure issue in EMQX version 5.4.0 and later, particularly noted during rolling upgrades from versions before 5.4.0. The issue was related to the initialization of the routing schema when both v1 and v2 routing tables were empty.

    The node now attempts to retrieve the routing schema version in use across the cluster instead of using the v2 routing table by default when local routing tables are found empty at startup. This approach mitigates potential conflicts and reduces the chances of diverging routing storage schemas among cluster nodes, especially in a mixed-version cluster scenario.

    If conflict is detected in a running cluster, EMQX writes instructions on how to manually resolve it in the log as part of the error message with critical severity. The same error message and instructions will also be written on standard error to make sure this message will not get lost even if no log handler is configured.

  • #12786 Added a strict check that prevents replicant nodes from connecting to core nodes running with a different version of EMQX application. This check ensures that during the rolling upgrades, the replicant nodes can only work when at least one core node is running the same EMQX release version.

Breaking changes

  • #12576 Starting from 5.6, the "Configuration Manual" document will no longer include the bridges config root.

    A bridge is now either action + connector for egress data integration, or source + connector for ingress data integration. Please note that the bridges config (in cluster.hocon) and the REST API path api/v5/bridges still works, but considered deprecated.

  • #12634 Triple-quote string values in HOCON config files no longer support escape sequence.

    The detailed information can be found in this pull request. Here is a summary of the impact on EMQX users:

    • EMQX 5.6 is the first version to generate triple-quote strings in cluster.hocon, meaning for generated configs, there is no compatibility issue.
    • For user hand-crafted configs (such as emqx.conf) a thorough review is needed to inspect if escape sequences are used (such as \n, \r, \t and \\), if yes, such strings should be changed to regular quotes (one pair of ") instead of triple-quotes.

e5.6.0

1 month ago

Enhancements

  • #12326 Enhanced session tracking with registration history. EMQX now has the capability to monitor the history of session registrations, including those that have expired. By configuring broker.session_history_retain, EMQX retains records of expired sessions for a specified duration.

    • Session count API: Use the API GET /api/v5/sessions_count?since=1705682238 to obtain a count of sessions across the cluster that remained active since the given UNIX epoch timestamp (with seconds precision). This enhancement aids in analyzing session activity over time.

    • Metrics expansion with cluster sessions gauge: A new gauge metric, cluster_sessions, is added to better track the number of sessions within the cluster. This metric is also integrated into Prometheus for easy monitoring:

      # TYPE emqx_cluster_sessions_count gauge
      emqx_cluster_sessions_count 1234
      

      NOTE: Please consider this metric as an approximate estimation. Due to the asynchronous nature of data collection and calculation, exact precision may vary.

  • #12398 Exposed the swagger_support option in the Dashboard configuration, allowing for the enabling or disabling of the Swagger API documentation.

  • #12467 Started supporting cluster discovery using AAAA DNS record type.

  • #12483 Renamed emqx ctl conf cluster_sync tnxid ID to emqx ctl conf cluster_sync inspect ID.

    For backward compatibility, tnxid is kept, but considered deprecated and will be removed in 5.7.

  • #12495 Introduced new AWS S3 connector and action.

  • #12499 Enhanced client banning capabilities with extended rules, including:

    • Matching clientid against a specified regular expression.
    • Matching client's username against a specified regular expression.
    • Matching client's peer address against a CIDR range.

    Important Notice: Implementing a large number of broad matching rules (not specific to an individual clientid, username, or host) may affect system performance. It's advised to use these extended ban rules judiciously to maintain optimal system efficiency.

  • #12509 Implemented API to re-order all authenticators / authorization sources.

  • #12517 Configuration files have been upgraded to accommodate multi-line string values, preserving indentation for enhanced readability and maintainability. This improvement utilizes """~ and ~""" markers to quote indented lines, offering a structured and clear way to define complex configurations. For example:

    rule_xlu4 {
      sql = """~
        SELECT
          *
        FROM
          "t/#"
      ~"""
    }
    

    See HOCON 0.42.0 release notes for details.

  • #12520 Implemented log throttling. The feature reduces the volume of logged events that could potentially flood the system by dropping all but the first occurance of an event within a configured time window. Log throttling is applied to the following log events that are critical yet prone to repetition:

    • authentication_failure
    • authorization_permission_denied
    • cannot_publish_to_topic_due_to_not_authorized
    • cannot_publish_to_topic_due_to_quota_exceeded
    • connection_rejected_due_to_license_limit_reached
    • dropped_msg_due_to_mqueue_is_full
  • #12561 Implemented HTTP APIs to get the list of client's in-flight and message queue (mqueue) messages. These APIs facilitate detailed insights and effective control over message queues and in-flight messaging, ensuring efficient message handling and monitoring.

    To get the first chunk of data:

    • GET /clients/{clientid}/mqueue_messages?limit=100
    • GET /clients/{clientid}/inflight_messages?limit=100

    Alternatively, for the first chunks without specifying a start position:

    • GET /clients/{clientid}/mqueue_messages?limit=100&position=none
    • GET /clients/{clientid}/inflight_messages?limit=100&position=none

    To get the next chunk of data:

    • GET /clients/{clientid}/mqueue_messages?limit=100&position={position}
    • GET /clients/{clientid}/inflight_messages?limit=100&position={position}

    Where {position} is a value (opaque string token) of meta.position field from the previous response.

    Ordering and Prioritization:

    • Mqueue Messages: These are prioritized and sequenced based on their queue order (FIFO), from higher to lower priority. By default, mqueue messages carry a uniform priority level of 0.
    • In-Flight Messages: Sequenced by the timestamp of their insertion into the in-flight storage, from oldest to newest.
  • #12590 Removed mfa meta data from log messages to improve clarity.

  • #12641 Improved text log formatter fields order. The new fields order is as follows:

    tag > clientid > msg > peername > username > topic > [other fields]

  • #12670 Added field shared_subscriptions to endpoint /monitor_current and /monitor_current/nodes/:node.

  • #12679 Upgraded docker image base from Debian 11 to Debian 12.

  • #12700 Started supporting "b" and "B" unit in bytesize hocon fields.

    For example, all three fields below will have the value of 1024 bytes:

    bytesize_field = "1024b"
    bytesize_field2 = "1024B"
    bytesize_field2 = 1024
    
  • #12719 The /clients API has been upgraded to accommodate queries for multiple clientids and usernames simultaneously, offering a more flexible and powerful tool for monitoring client connections. Additionally, this update introduces the capability to customize which client information fields are included in the API response, optimizing for specific monitoring needs.

    Examples of Multi-Client/Username Queries:

    • To query multiple clients by ID: /clients?clientid=client1&clientid=client2
    • To query multiple users: /clients?username=user11&username=user2
    • To combine multiple client IDs and usernames in one query: /clients?clientid=client1&clientid=client2&username=user1&username=user2

    Examples of Selecting Fields for the Response:

    • To include all fields in the response: /clients?fields=all (Note: Omitting the fields parameter defaults to returning all fields.)
    • To specify only certain fields: /clients?fields=clientid,username
  • #12330 The Cassandra bridge has been split into connector and action components. They are backwards compatible with the bridge HTTP API. Configuration will be upgraded automatically.

  • #12353 The OpenTSDB bridge has been split into connector and action components. They are backwards compatible with the bridge HTTP API. Configuration will be upgraded automatically.

  • #12376 The Kinesis bridge has been split into connector and action components. They are backwards compatible with the bridge HTTP API. Configuration will be upgraded automatically.

  • #12386 The GreptimeDB bridge has been split into connector and action components. They are backwards compatible with the bridge HTTP API. Configuration will be upgraded automatically.

  • #12423 The RabbitMQ bridge has been split into connector, action and source components. They are backwards compatible with the bridge HTTP API. Configuration will be upgraded automatically.

  • #12425 The ClickHouse bridge has been split into connector and action components. They are backwards compatible with the bridge HTTP API. Configuration will be upgraded automatically.

  • #12439 The Oracle bridge has been split into connector and action components. They are backwards compatible with the bridge HTTP API. Configuration will be upgraded automatically.

  • #12449 The TDEngine bridge has been split into connector and action components. They are backwards compatible with the bridge HTTP API. Configuration will be upgraded automatically.

  • #12488 The RocketMQ bridge has been split into connector and action components. They are backwards compatible with the bridge HTTP API. Configuration will be upgraded automatically.

  • #12512 The HStreamDB bridge has been split into connector and action components. They are backwards compatible with the bridge HTTP API. Configuration will be upgraded automatically, however, it is recommended to do the upgrade manually as new fields have been added to the configuration.

  • #12543 The DynamoDB bridge has been split into connector and action components. They are backwards compatible with the bridge HTTP API. Configuration will be upgraded automatically.

  • #12595 The Kafka Consumer bridge has been split into connector and source components. They are backwards compatible with the bridge HTTP API. Configuration will be upgraded automatically.

  • #12619 The Microsoft SQL Server bridge has been split into connector and action components. They are backwards compatible with the bridge HTTP API. Configuration will be upgraded automatically.

  • #12381 Added new SQL functions: map_keys(), map_values(), map_to_entries(), join_to_string(), join_to_string(), join_to_sql_values_string(), is_null_var(), is_not_null_var().

    For more information on the functions and their usage, refer to Built-in SQL Functions the documentation.

  • #12427 Introduced the capability to specify a limit on the number of Kafka partitions that can be used for Kafka data integration.

  • #12577 Updated the service_account_json field for both the GCP PubSub Producer and Consumer connectors to accept JSON-encoded strings. Now, it's possible to set this field to a JSON-encoded string. Using the previous format (a HOCON map) is still supported but not encouraged.

  • #12581 Added JSON schema to schema registry.

    JSON Schema supports Draft 03, Draft 04 and Draft 05.

  • #12602 Enhanced health checking for IoTDB connector, using its ping API instead of just checking for an existing socket connection.

  • #12336 Refined the approach to managing asynchronous tasks by segregating the cleanup of channels into its own dedicated pool. This separation addresses performance issues encountered during channels cleanup under conditions of high network latency, ensuring that such tasks do not impede the efficiency of other asynchronous operations, such as route cleanup.

  • #12725 Implemented REST API to list the available source types.

  • #12746 Added username log field. If MQTT client is connected with a non-empty username the logs and traces will include username field.

  • #12785 Added timestamp_format configuration option to log handlers. This new option allows for the following settings:

    • auto: Automatically determines the timestamp format based on the log formatter being used. Utilizes rfc3339 format for text formatters, and epoch format for JSON formatters.

    • epoch: Represents timestamps in microseconds precision Unix epoch format.

    • rfc3339: Uses RFC3339 compliant format for date-time strings. For example, 2024-03-26T11:52:19.777087+00:00.

Bug Fixes

  • #11868 Fixed a bug where will messages were not published after session takeover.

  • #12347 Implemented an update to ensure that messages processed by the Rule SQL for the MQTT egress data bridge are always rendered as valid, even in scenarios where the data is incomplete or lacks certain placeholders defined in the bridge configuration. This adjustment prevents messages from being incorrectly deemed invalid and subsequently discarded by the MQTT egress data bridge, as was the case previously.

    When variables in payload and topic templates are undefined, they are now rendered as empty strings instead of the literal undefined string.

  • #12472 Fixed an issue where certain read operations on /api/v5/actions/ and /api/v5/sources/ endpoints might result in a 500 error code during the process of rolling upgrades.

  • #12492 EMQX now returns the Receive-Maximum property in the CONNACK message for MQTT v5 clients, aligning with protocol expectations. This implementation considers the minimum value of the client's Receive-Maximum setting and the server's max_inflight configuration as the limit for the number of inflight (unacknowledged) messages permitted. Previously, the determined value was not sent back to the client in the CONNACK message.

    NOTE: A current known issue with these enhanced API responses is that the total client count provided may exceed the actual number of clients due to the inclusion of disconnected sessions.

  • #12505 Upgraded the Kafka producer client wolff from version 1.10.1 to 1.10.2. This latest version maintains a long-lived metadata connection for each connector, optimizing EMQX's performance by reducing the frequency of establishing new connections for action and connector health checks.

  • #12513 Changed the level of several flooding log events from warning to info.

  • #12530 Improved the error reporting for frame_too_large events and malformed CONNECT packet parsing failures. These updates now provide additional information, aiding in the troubleshooting process.

  • #12541 Introduced a new configuration validation step for autocluster by DNS records to ensure compatibility between node.name and cluster.discover_strategy. Specifically, when utilizing the dns strategy with either a or aaaa record types, it is mandatory for all nodes to use a (static) IP address as the host name.

  • #12566 Enhanced the bootstrap file for REST API keys:

    • Empty lines within the file are now skipped, eliminating the previous behavior of generating an error.

    • API keys specified in the bootstrap file are assigned the highest precedence. In cases where a new key from the bootstrap file conflicts with an existing key, the older key will be automatically removed to ensure that the bootstrap keys take effect without issue.

  • #12646 Fixed an issue with the rule engine's date-time string parser. Previously, time zone adjustments were only effective for date-time strings specified with second-level precision.

  • #12652 Fixed a discrepancy where the subbits functions with 4 and 5 parameters, despite being documented, were missing from the actual implementation. These functions have now been added.

  • #12663 Fixed an issue where the emqx_vm_cpu_use and emqx_vm_cpu_idle metrics, accessible via the Prometheus endpoint /prometheus/stats, were inaccurately reflecting the average CPU usage since the operating system boot. This fix ensures that these metrics now accurately represent the current CPU usage and idle, providing more relevant and timely data for monitoring purposes.

  • #12668 Refactored the SQL function date_to_unix_ts() by using calendar:datetime_to_gregorian_seconds/1. This change also added validation for the input date format.

  • #12672 Changed the process for generating the node boot configuration by incorporating the loading of {data_dir}/configs/cluster.hocon. Previously, changes to logging configurations made via the Dashboard and saved in {data_dir}/configs/cluster.hocon were only applied after the initial boot configuration was generated using etc/emqx.conf, leading to potential loss of some log segment files due to late reconfiguration.

    Now, both {data_dir}/configs/cluster.hocon and etc/emqx.conf are loaded concurrently, with settings from emqx.conf taking precedence, to create the boot configuration.

  • #12696 Fixed an issue where attempting to reconnect an action or source could lead to wrong error messages being returned in the HTTP API.

  • #12714 Fixed inaccuracies in several metrics reported by the /prometheus/stats endpoint of the Prometheus API. The correction applies to the following metrics:

    • emqx_cluster_sessions_count
    • emqx_cluster_sessions_max
    • emqx_cluster_nodes_running
    • emqx_cluster_nodes_stopped
    • emqx_subscriptions_shared_count
    • emqx_subscriptions_shared_max

    Additionally, this fix rectified an issue within the /stats endpoint concerning the subscriptions.shared.count and subscriptions.shared.max fields. Previously, these values failed to update promptly following a client's disconnection or unsubscription from a Shared-Subscription.

  • #12390 Fixed an issue where the /license API request may crash during cluster joining processes.

  • #12411 Fixed a bug where null values would be inserted as 1853189228 in int columns in Cassandra data integration.

  • #12522 Refined the parsing process for Kafka bootstrap hosts to exclude spaces following commas, addressing connection timeouts and DNS resolution failures due to malformed host entries.

  • #12656 Implemented a topic verification step for creating GCP PubSub Producer actions, ensuring failure notifications when the topic doesn't exist or provided credentials lack sufficient permissions.

  • #12678 Enhanced the DynamoDB connector to clearly report the reason for connection failures, improving upon the previous lack of error insights.

  • #12681 Fixed a security issue where secrets could be logged at debug level when sending messages to a RocketMQ bridge/action.

  • #12715 Fixed a crash that could occur during configuration updates if the connector for the ingress data integration source had active channels.

  • #12767 Fixed issues encountered during upgrades from 5.0.1 to 5.5.1, specifically related to Kafka Producer configurations that led to upgrade failures. The correction ensures that Kafka Producer configurations are accurately transformed into the new action and connector configuration format required by EMQX version 5.5.1 and beyond.

  • #12768 Addressed a startup failure issue in EMQX version 5.4.0 and later, particularly noted during rolling upgrades from versions before 5.4.0. The issue was related to the initialization of the routing schema when both v1 and v2 routing tables were empty.

    The node now attempts to retrieve the routing schema version in use across the cluster instead of using the v2 routing table by default when local routing tables are found empty at startup. This approach mitigates potential conflicts and reduces the chances of diverging routing storage schemas among cluster nodes, especially in a mixed-version cluster scenario.

    If conflict is detected in a running cluster, EMQX writes instructions on how to manually resolve it in the log as part of the error message with critical severity. The same error message and instructions will also be written on standard error to make sure this message will not get lost even if no log handler is configured.

  • #12786 Added a strict check that prevents replicant nodes from connecting to core nodes running with a different version of EMQX application. This check ensures that during the rolling upgrades, the replicant nodes can only work when at least one core node is running the same EMQX release version.

Breaking changes

  • #12576 Starting from 5.6, the "Configuration Manual" document will no longer include the bridges config root.

    A bridge is now either action + connector for egress data integration, or source + connector for ingress data integration. Please note that the bridges config (in cluster.hocon) and the REST API path api/v5/bridges still works, but considered deprecated.

  • #12634 Triple-quote string values in HOCON config files no longer support escape sequence.

    The detailed information can be found in this pull request. Here is a summary of the impact on EMQX users:

    • EMQX 5.6 is the first version to generate triple-quote strings in cluster.hocon, meaning for generated configs, there is no compatibility issue.
    • For user hand-crafted configs (such as emqx.conf) a thorough review is needed to inspect if escape sequences are used (such as \n, \r, \t and \\), if yes, such strings should be changed to regular quotes (one pair of ") instead of triple-quotes.

e5.5.1

2 months ago

Enhancements

  • #12497 Improved MongoDB connector performance, resulting in more efficient database interactions. This enhancement is supported by improvements in the MongoDB Erlang driver as well (mongodb-erlang PR).

Bug Fixes

  • #12471 Fixed an issue that data integration configurations failed to load correctly during upgrades from EMQX version 5.0.2 to newer releases.

  • #12598 Fixed an issue that users were unable to subscribe to or unsubscribe from shared topic filters via HTTP API.

    The affected APIs include:

    • /clients/:clientid/subscribe

    • /clients/:clientid/subscribe/bulk

    • /clients/:clientid/unsubscribe

    • /clients/:clientid/unsubscribe/bulk

  • #12601 Fixed an issue where logs of the LDAP driver were not being captured. Now, all logs are recorded at the info level.

  • #12606 The Prometheus API experienced crashes when the specified SSL certificate file did not exist in the given path. Now, when an SSL certificate file is missing, the emqx_cert_expiry_at metric will report a value of 0, indicating the non-existence of the certificate.

  • #12608 Fixed a function_clause error in the IoTDB action caused by the absence of a payload field in query data.

  • #12610 Fixed an issue where connections to the LDAP connector could unexpectedly disconnect after a certain period of time.

  • #12620 Redacted sensitive information in HTTP headers to exclude authentication and authorization credentials from debug level logs in the HTTP Server connector, mitigating potential security risks.

  • #12632 Fixed an issue where the rule engine's SQL built-in function date_to_unix_ts produced incorrect results for dates starting from March 1st on leap years.

v5.5.1

2 months ago

5.5.1

Bug Fixes

  • #12471 Fixed an issue that data integration configurations failed to load correctly during upgrades from EMQX version 5.0.2 to newer releases.

  • #12598 Fixed an issue that users were unable to subscribe to or unsubscribe from shared topic filters via HTTP API.

    The affected APIs include:

    • /clients/:clientid/subscribe

    • /clients/:clientid/subscribe/bulk

    • /clients/:clientid/unsubscribe

    • /clients/:clientid/unsubscribe/bulk

  • #12601 Fixed an issue where logs of the LDAP driver were not being captured. Now, all logs are recorded at the info level.

  • #12606 The Prometheus API experienced crashes when the specified SSL certificate file did not exist in the given path. Now, when an SSL certificate file is missing, the emqx_cert_expiry_at metric will report a value of 0, indicating the non-existence of the certificate.

  • #12620 Redacted sensitive information in HTTP headers to exclude authentication and authorization credentials from debug level logs in the HTTP Server connector, mitigating potential security risks.

  • #12632 Fixed an issue where the rule engine's SQL built-in function date_to_unix_ts produced incorrect timestamp results for dates starting from March 1st on leap years.

e5.6.0-alpha.2

2 months ago

v5.5.1-rc.1

3 months ago

e5.5.1-rc.1

3 months ago

v5.5.0

3 months ago

v5.5.0

Enhancements

  • #12085 EMQX has been upgraded to leverage the capabilities of OTP version 26.1.2-2. NOTE: Docker images are still built with OTP 25.3.2.

  • #12189 Enhanced the ACL claim format in EMQX JWT authentication for greater versatility. The updated format now supports an array structure, aligning more closely with the file-based ACL rules.

    For example:

    [
    {
      "permission": "allow",
      "action": "pub",
      "topic": "${username}/#",
      "qos": [0, 1],
      "retain": true
    },
    {
      "permission": "allow",
      "action": "sub",
      "topic": "eq ${username}/#",
      "qos": [0, 1]
    },
    {
      "permission": "deny",
      "action": "all",
      "topics": ["#"]
    }
    ]
    

    In this new format, the absence of a matching rule does not result in an automatic denial of the action. The authorization chain can advance to other configured authorizers if a match is not found in the JWT ACL. If no match is found throughout the chain, the final decision defers to the default permission set in authorization.no_match.

  • #12267 Added a new timeout parameter to the cluster/:node/invite interface, addressing the issue of default timeouts. The previously set 5-second default timeout often led to HTTP API call timeouts because joining an EMQX cluster usually requires more time.

    In addition, EMQX added a new API /cluster/:node/invite_async to support an asynchronous way to invite nodes to join the cluster and introduced a new cluster/invitation API to inspect the join status.

  • #12272 Introduced updates to the retain API in EMQX:

    • Added a new API DELETE /retainer/messages to clean all retained messages.
    • Added an optional topic filter parameter topic in the query string for the API GET /retainer/messages. For example, using a query string topic=t/1 filters the retained messages for a specific topic, improving the efficiency of message retrieval.
  • #12277 Added mqtt/delayed/messages/:topic API to remove delayed messages by topic name.

  • #12278 Adjusted the maximum pagination size for paginated APIs in the REST API from 3000 to 10000.

  • #12289 Authorization caching now supports the exclusion of specific topics. For the specified list of topics and topic filters, EMQX will not generate an authorization cache. The list can be set through the authorization.cache.excludes configuration item or via the Dashboard. For these specific topics, permission checks will always be conducted in real-time rather than relying on previous cache results, thus ensuring the timeliness of authorization outcomes.

  • #12329 Added broker.routing.batch_sync configuration item to enable a dedicated process pool that synchronizes subscriptions with the global routing table in batches, thus reducing the frequency of cross-node communication that can be slowed down by network latency. Processing multiple subscription updates collectively, not only accelerates synchronization between replica nodes and core nodes in a cluster but also reduces the load on the broker pool, minimizing the risk of overloading.

  • #12333 Added a tags field for actions and connectors. Similar to the description field (which is a free text annotation), tags can be used to annotate actions and connectors for filtering and grouping.

  • #12299 Exposed more metrics to improve observability:

    Montior API:

    • Added retained_msg_count field to /api/v5/monitor_current.
    • Added license_quota field to /api/v5/monitor_current
    • Added retained_msg_count and node_uptime fields to /api/v5/monitor_current/nodes/{node}.
    • Added retained_msg_count, license_quota and node_uptime fields to /api/v5/monitor_current/nodes/{node}.

    Prometheus API:

    • Added emqx_cert_expiry_at and emqx_license_expiry_at to /api/v5/prometheus/stats to display TLS listener certificate expiration time and license expiration time.
    • Added /api/v5/prometheus/auth endpoint to provide metrics such as execution count and running status for all authenticatiors and authorizators.
    • Added /api/v5/prometheus/data_integration endpoint to provide metrics such as execution count and status for all rules, actions, and connectors.

    Limitations: Prometheus push gateway only supports the content in /api/v5/prometheus/stats?mode=node.

    For more API details and metric type information, please see swagger api docs.

  • #12196 Improved network efficiency during routes cleanup.

    Previously, when a node was down, a delete operation for each route to that node must be exchanged between all the other live nodes. After this change, only one match and delete operation is exchanged between all live nodes, significantly reducing the number of necessary network packets and decreasing the load on the inter-cluster network. This optimization must be especially helpful for geo-distributed EMQX deployments where network latency can be significantly high.

  • #12354 The concurrent creation and updates of data integrations are now supported, significantly increasing operation speeds, such as when importing backup files.

Bug Fixes

  • #12232 Fixed an issue when cluster commit log table was not deleted after a node was forced to leave a cluster.

  • #12243 Fixed a family of subtle race conditions that could lead to inconsistencies in the global routing state.

  • #12269 Improved error handling in the /clients interface; now returns a 400 status with more detailed error messages, instead of a generic 500, for query string validation failures.

  • #12285 Updated the CoAP gateway to support short parameter names for slight savings in datagram size. For example, clientid=bar can be written as c=bar.

  • #12303 Fixed the message indexing in retainer. Previously, clients with wildcard subscriptions might receive irrelevant retained messages not matching their subscription topics.

  • #12305 Corrected an issue with incomplete client/connection information being passed into emqx_cm, which could lead to internal inconsistencies and affect memory usage and operations like node evacuation.

  • #12306 Fixed an issue preventing the connectivity test for the Connector from functioning correctly after updating the password parameter via the HTTP API.

  • #12359 Fixed an issue causing error messages when restarting a node configured with some types of data bridges. Additionally, these bridges were at risk of entering a failed state upon node restart, requiring a manual restart to restore functionality.

  • #12404 Fixed an issue where restarting a data integration with heavy message flow could lead to a stop in the collection of data integration metrics.