
Confluent's Kafka Python Client

v1.6.0

v1.6.0 is a feature release with the following features, fixes and enhancements:

  • Bundles librdkafka v1.6.0, which adds support for incremental rebalancing, sticky producer partitioning, transactional producer scalability improvements, and much more. See the librdkafka release notes linked below.
  • Rename asyncio.py example to avoid circular import (#945)
  • The Linux wheels are now built with manylinux2010 (rather than manylinux1) since OpenSSL v1.1.1 no longer builds on CentOS 5. Older Linux distros, such as CentOS 5, may therefore no longer be supported.
  • The in-wheel OpenSSL version has been updated to 1.1.1i.
  • Added Message.latency() to retrieve the per-message produce latency.
  • Added trove classifiers.
  • The Consumer destructor no longer triggers consumer.close(); consumer.close() must now be called explicitly if the application wants to leave the consumer group properly and commit final offsets (see the sketch after this list).
  • Fix PY_SSIZE_T_CLEAN warning
  • Move confluent_kafka/ to src/ to avoid pytest/tox picking up the local dir
  • Added producer.purge() to purge messages in-queue/flight (@peteryin21, #548)
  • Added AdminClient.list_groups() API (@messense, #948)
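
The explicit-close change above means applications should now close the consumer themselves. A minimal sketch (the broker address, group id and topic name are assumptions, not part of the release notes):

```python
from confluent_kafka import Consumer

consumer = Consumer({
    'bootstrap.servers': 'localhost:9092',  # assumed broker address
    'group.id': 'example-group',            # assumed group id
})
consumer.subscribe(['mytopic'])             # assumed topic name

try:
    while True:
        msg = consumer.poll(timeout=1.0)
        if msg is None:
            continue
        if msg.error():
            print(msg.error())
            continue
        print(msg.value())
finally:
    # As of v1.6.0 the destructor no longer closes the consumer;
    # an explicit close() leaves the group cleanly and commits final offsets.
    consumer.close()
```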

confluent-kafka-python is based on librdkafka v1.6.0, see the librdkafka release notes for a complete list of changes, enhancements, fixes and upgrade considerations.

v1.5.0

Confluent's Python client for Apache Kafka

v1.5.0 is a maintenance release.

confluent-kafka-python is based on librdkafka v1.5.0, see the librdkafka release notes for a complete list of changes, enhancements, fixes and upgrade considerations.

v1.4.2

Confluent's Python client for Apache Kafka

v1.4.2 is a maintenance release with the following fixes and enhancements:

  • Fix produce/consume hang after a partition goes away and comes back, such as when a topic is deleted and re-created (regression in v1.3.0).
  • Consumer: Reset the stored offset when partitions are un-assign()ed. This fixes the case where a manual offset-less commit() or the auto-committer would commit a stored offset from a previous assignment before a new message was consumed by the application.
  • Seed the random number generator used for random broker selection.
  • Multiple calls to Consumer.close will not raise RuntimeError (@mkmoisen, #678)
  • Module lookup failures raise ImportError (@overstre, #786)
  • Fix SchemaRegistryClient basic auth configuration (@blown302, #853)
  • Don't send empty credentials to Schema Registry in the Authorization header (#863)
  • Miscellaneous test cleanup and enhancements (@nicht, #843, #863)

confluent-kafka-python is based on librdkafka v1.4.2, see the librdkafka v1.4.2 release notes for a complete list of changes, enhancements, fixes and upgrade considerations.

v1.4.0

Confluent's Python client for Apache Kafka

v1.4.0 is a feature release:

  • KIP-98: Transactional Producer API
  • KIP-345: Static consumer group membership (by @rnpridgeon)
  • KIP-511: Report client software name and version to broker
  • Generic Serde API (experimental)
  • New AvroSerializer and AvroDeserializer implementations, including configurable subject name strategies.
  • JSON Schema support (for Schema Registry)
  • Protobuf support (for Schema Registry)

confluent-kafka-python is based on librdkafka v1.4.0, see the librdkafka v1.4.0 release notes for a complete list of changes, enhancements, fixes and upgrade considerations.

Transactional Producer API

Release v1.4.0 of confluent-kafka-python completes the Exactly-Once Semantics (EOS) story: the idempotent producer (since v1.0.0), a transaction-aware consumer (since v1.2.0), and now full producer transaction support (v1.4.0).

This enables developers to create Exactly-Once applications with Apache Kafka.

See the Transactions in Apache Kafka page for an introduction and check the transactions example.
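
As a rough illustration of the new API, here is a minimal transactional produce sketch (the broker address, topic name and transactional.id are assumptions):

```python
from confluent_kafka import Producer

producer = Producer({
    'bootstrap.servers': 'localhost:9092',  # assumed broker address
    'transactional.id': 'example-txn-id',   # assumed transactional id
})

producer.init_transactions()  # fences out older instances with the same id
producer.begin_transaction()
try:
    producer.produce('mytopic', key='k', value='v')  # assumed topic name
    producer.commit_transaction()  # flushes outstanding messages and commits atomically
except Exception:
    producer.abort_transaction()   # aborted messages are skipped by read_committed consumers
    raise
```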

Generic Serializer API

Release v1.4.0 introduces a new, experimental API which adds serialization capabilities to the Kafka Producer and Consumer. This feature provides the ability to configure Producer/Consumer key and value serializers/deserializers independently. Previously, all serialization had to be handled prior to calling Producer.produce and after calling Consumer.poll.

This release ships with three built-in, Java-compatible, standard serializer and deserializer classes:

Name     Type     Format
Double   float    IEEE 754 binary64
Integer  int      int32
String   Unicode  bytes*

* The StringSerializer codec is configurable and supports any one of Python's standard encodings. If left unspecified, 'UTF-8' will be used.

Additional serialization implementations are possible through the extension of the Serializer and Deserializer base classes.

See avro_producer.py and avro_consumer.py for example usage.
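
To give a feel for the new API, here is a minimal sketch pairing the built-in StringSerializer with the experimental SerializingProducer (the broker address and topic name are assumptions):

```python
from confluent_kafka import SerializingProducer
from confluent_kafka.serialization import StringSerializer

producer = SerializingProducer({
    'bootstrap.servers': 'localhost:9092',        # assumed broker address
    'key.serializer': StringSerializer('utf_8'),  # 'utf_8' is the default codec
    'value.serializer': StringSerializer('utf_8'),
})
producer.produce('mytopic', key='user-1', value='hello')  # assumed topic name
producer.flush()
```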

Avro, Protobuf and JSON Schema Serializers

Release v1.4.0 of confluent-kafka-python adds support for two new Schema Registry serialization formats with its Generic Serialization API: JSON and Protobuf. A new set of Avro serialization classes has also been added to conform to the new API.

Format     Serializer Example      Deserializer Example
Avro       avro_producer.py        avro_consumer.py
JSON       json_producer.py        json_consumer.py
Protobuf   protobuf_producer.py    protobuf_consumer.py

Security fixes

Two security issues have been identified in the SASL SCRAM protocol handler:

  • The client nonce, which is expected to be a random string, was a static string.
  • If sasl.username and sasl.password contained characters that needed escaping, a buffer overflow and heap corruption would occur. An assertion guarded against this, but it fired only after the corruption had already occurred.

Both of these issues are fixed in this release.

Enhancements

  • Bump OpenSSL to v1.0.2u
  • Bump monitoring-interceptors to v0.11.3

Fixes

General:

  • Remove unused variable from README example (@qzyse2017, #691)
  • Add delivery report to Avro example (#742)
  • Update asyncio example companion blog URL (@filippovitale, #760)

Schema Registry/Avro:

  • Trim trailing / from Schema Registry base URL (@IvanProdaiko94, #749)
  • Make automatic Schema registration optional (@dpfeif, #718)
  • Bump Apache Avro to 1.9.2[.1] (@RyanSkraba, #779)
  • Correct the SchemaRegistry authentication for SASL_INHERIT (@abij, #733)

Also see the librdkafka v1.4.0 release notes for fixes to the underlying client implementation.

v1.3.0

Confluent's Python client for Apache Kafka

confluent-kafka-python is based on librdkafka v1.3.0, see the librdkafka v1.3.0 release notes for a complete list of changes, enhancements, fixes and upgrade considerations.

This is a feature release adding support for KIP-392 Fetch from follower, allowing a consumer to fetch messages from the closest replica to increase throughput and reduce cost.

Features

  • KIP-392 - Fetch messages from closest replica / follower (by @mhowlett)
  • Python 3.8 binary wheel support for OSX and Linux. Windows Python 3.8 binary wheels are not currently available.

Enhancements

  • New example using python3 and asyncio (by @mhowlett)
  • Add warnings for inconsistent security configuration.
  • Optimizations to hdr histogram (stats) rollover.
  • Print compression type per message-set when debug=msg
  • Various doc fixes, updates and enhancements (@edenhill, @mhowlett)

Fixes

  • Fix crash when a new topic is not created (@Mostafa Razavi, #725).
  • Fix stringer/repr for SerializerError class (@ferozed, #675)
  • Fix consumer_lag in stats when consuming from broker versions <0.11.0.0 (regression in librdkafka v1.2.0).
  • Properly handle new Kafka-framed SASL GSSAPI frame semantics on Windows (#2542). This bug was introduced in v1.2.0 and broke GSSAPI authentication on Windows.
  • Fix msgq (re)insertion code to avoid O(N^2) insert sort operations on retry (#2508). The msgq insert code now properly handles interleaved and overlapping message range inserts, which may occur during producer retries in high-throughput applications.
  • Fix producer insert msgq regression in v1.2.1 (#2450).
  • Upgrade builtin lz4 to 1.9.2 (CVE-2019-17543, #2598).
  • Don't trigger error when broker hostname changes (#2591).
  • Less strict message.max.bytes check for individual messages (#993).
  • Don't call timespec_get() on OSX, since it was removed in recent XCode (@maparent).
  • LZ4 is available from ProduceRequest 0, not 3 (fixes assert in #2480).
  • Address 12 code issues identified by Coverity static code analysis.

v1.2.0

Confluent's Python client for Apache Kafka

confluent-kafka-python is based on librdkafka v1.2.0, see the librdkafka v1.2.0 release notes for a complete list of changes, enhancements, fixes and upgrade considerations.

  • Transaction aware consumer (isolation.level=read_committed) implemented by @mhowlett.
  • Sub-millisecond buffering (linger.ms) on the producer.
  • Improved authentication errors (KIP-152)

Consumer-side transaction support

This release adds consumer-side support for transactions. In previous releases, the consumer always delivered all messages to the application, even those in aborted or not-yet-committed transactions. In this release, the consumer will by default skip messages in aborted transactions. This is controlled through the new isolation.level configuration property, which defaults to read_committed (only read committed messages, filtering out aborted and not-yet-committed transactions). To consume all messages, including those in aborted transactions, set this property to read_uncommitted to get the behaviour of previous releases. For consumers in read_committed mode, the end of a partition is now defined to be the offset of the last message of a successfully committed transaction (referred to as the 'Last Stable Offset'). Non-transactional messages are read as in previous releases, but a consumer will not advance into a not-yet-committed transaction on the partition.
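
A minimal configuration sketch (the broker address and group id are assumptions):

```python
from confluent_kafka import Consumer

consumer = Consumer({
    'bootstrap.servers': 'localhost:9092',  # assumed broker address
    'group.id': 'example-group',            # assumed group id
    # read_committed is the new default; set 'read_uncommitted'
    # to restore the pre-1.2.0 behaviour.
    'isolation.level': 'read_committed',
})
```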

Upgrade considerations

  • linger.ms default was changed from 0 to 0.5 ms to promote some level of batching even with default settings.

New configuration properties

  • The consumer property isolation.level=read_committed ensures the consumer will only read messages from successfully committed producer transactions. The default is read_committed. To get the previous behaviour, set the property to read_uncommitted, which will read all messages produced to a topic, regardless of whether the message was part of an aborted or not-yet-committed transaction.

Enhancements

  • Cache FastAvro schema for improved Avro Serialization/Deserialization (@BBM89, #627)
  • Protocol decoding optimization, increasing consume performance.
  • Add CachedSchemaRegistry docs (@lowercase24, #495)

Fixes

General:

  • Rate limit IO-based queue wakeups to linger.ms, this reduces CPU load and lock contention for high throughput producer applications. (#2509)
  • SSL: Use only the hostname (not the port) when validating the broker hostname (by Hunter Jacksson)
  • SSL: Ignore OpenSSL cert verification results if enable.ssl.certificate.verification=false (@salisbury-espinosa)
  • SASL Kerberos/GSSAPI: don't treat kinit ECHILD errors as errors (@hannip)
  • Refresh broker list metadata even if no topics to refresh (#2476)
  • Correct AdminClient doc (@lowercase24, #653)
  • Update Avro example to be compliant with csh (@andreyferriyan, #668)
  • Correct Avro example typo (@AkhilGNair, #598)

Consumer:

  • Make pause|resume() synchronous, ensuring that a subsequent poll() will not return messages for the paused partitions.
  • Consumer doc fixes (@hrchu, #646, #648)

Producer:

  • Fix message timeout handling for leader-less partitions.
  • message.timeout.ms=0 is now accepted even if linger.ms > 0 (by Jeff Snyder)

v1.1.0

Confluent's Python client for Apache Kafka

confluent-kafka-python is based on librdkafka v1.1.0, see the librdkafka v1.1.0 release notes for a complete list of changes, enhancements, fixes and upgrade considerations.

  • In-memory SSL certificates (PEM, DER, PKCS#12) support (by @noahdav at Microsoft)
  • Use Windows Root/CA SSL Certificate Store (by @noahdav at Microsoft)
  • ssl.endpoint.identification.algorithm=https (off by default) to validate that the broker hostname matches the certificate. Requires OpenSSL >= 1.0.2 (included with wheel installations).
  • Improved GSSAPI/Kerberos ticket refresh
  • Confluent monitoring interceptor package bumped to v0.11.1 (#634)

Upgrade considerations

  • Windows SSL users will no longer need to specify a CA certificate file/directory (ssl.ca.location), librdkafka will load the CA certs by default from the Windows Root Certificate Store.
  • SSL peer (broker) certificate verification is now enabled by default (disable with enable.ssl.certificate.verification=false)
  • %{broker.name} is no longer supported in sasl.kerberos.kinit.cmd since kinit refresh is no longer executed per broker, but per client instance.

SSL

New configuration properties:

  • ssl.key.pem - the client's private key as a string in PEM format
  • ssl.certificate.pem - the client's public key as a string in PEM format (see the sketch after this list)
  • enable.ssl.certificate.verification - enable (default) / disable OpenSSL's builtin broker certificate verification.
  • ssl.endpoint.identification.algorithm - to verify the broker's hostname against its certificate (disabled by default).
  • Add new rd_kafka_conf_set_ssl_cert() to pass PKCS#12, DER or PEM certs in (binary) memory form to the configuration object.
  • The private key data is now securely cleared from memory after last use.
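
A minimal sketch of the new PEM string properties (the broker address and the key/certificate file paths are assumptions):

```python
from confluent_kafka import Producer

private_key_pem = open('client.key').read()  # assumed path to a PEM private key
public_cert_pem = open('client.pem').read()  # assumed path to a PEM certificate

producer = Producer({
    'bootstrap.servers': 'broker:9093',      # assumed broker address
    'security.protocol': 'ssl',
    'ssl.key.pem': private_key_pem,          # private key as an in-memory PEM string
    'ssl.certificate.pem': public_cert_pem,  # public key as an in-memory PEM string
})
```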

Enhancements

  • Bump message.timeout.ms max value from 15 minutes to 24 days (@sarkanyi, workaround for #2015)

Fixes

  • SASL GSSAPI/Kerberos: Don't run kinit refresh for each broker, just per client instance.
  • SASL GSSAPI/Kerberos: Changed sasl.kerberos.kinit.cmd to first attempt ticket refresh, then acquire.
  • SASL: Proper locking on broker name acquisition.
  • Consumer: max.poll.interval.ms now correctly handles blocking poll calls, allowing a longer poll timeout than the max poll interval.
  • configure: Fix libzstd static lib detection
  • PyTest pinned to latest version supporting Python 2 (#634)

v1.0.1

Confluent's Python client for Apache Kafka

confluent-kafka-python is based on librdkafka v1.0.1, see the librdkafka v1.0.1 release notes for a complete list of changes, enhancements, fixes and upgrade considerations.

v1.0.1 is a maintenance release with the following fixes:

  • Fix consumer stall when broker connection goes down (issue #2266 introduced in v1.0.0)
  • Fix AdminAPI memory leak when broker does not support request (@souradeep100, #2314)
  • SR client: Don't disable cert verification if no ssl.ca.location set (#578)
  • Treat ECONNRESET as standard disconnects (#2291)
  • OpenSSL version bump to 1.0.2s
  • Update/fix protocol error response codes (@benesch)
  • Update Consumer get_watermark_offsets docstring (@hrchu, #572)
  • Update Consumer subscribe docstring to include on_assign and on_revoke args (@hrchu, #571)
  • Update delivery report string formatting (@hrchu, #575)
  • Update logging configuration code example document (@soxofaan, #579)
  • Implement environment markers to fix poetry (@fishman, #583)

v1.0.0

Confluent's Python client for Apache Kafka v1.0.0

confluent-kafka-python is based on librdkafka v1.0.0, see the librdkafka v1.0.0 release notes for a complete list of changes, enhancements, fixes and upgrade considerations.

v1.0.0 is a major feature release:

  • Idempotent producer - guaranteed ordering and exactly-once producing support.
  • Sparse/on-demand connections - connections are no longer maintained to all brokers in the cluster.
  • KIP-62 - max.poll.interval.ms support in the Consumer.

This release also changes configuration defaults and deprecates a set of configuration properties, make sure to read the Upgrade considerations section below.

Upgrade considerations (IMPORTANT)

Configuration default changes

The following configuration properties have changed default values, which may require application changes:

  • acks (alias request.required.acks) now defaults to all: wait for all in-sync replica brokers to ack. The previous default, 1, only waited for an ack from the partition leader. This change places a greater emphasis on durability at a slight cost to latency. It is not recommended that you lower this value unless latency takes higher precedence than data durability in your application.

  • broker.version.fallback now defaults to 0.10, previously 0.9. broker.version.fallback.ms now defaults to 0. Users on Apache Kafka <0.10 must set api.version.request=false and broker.version.fallback=.. to their broker version. For users on >=0.10 there is no longer any need to specify any of these properties.

  • enable.partition.eof now defaults to false. KafkaError._PARTITION_EOF was previously emitted by default to signify that the consumer has reached the end of a partition. Applications which rely on this behavior must now explicitly set enable.partition.eof=true (see the sketch below). This change simplifies the more common case where consumer applications consume in an endless loop.
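
A minimal sketch of opting back into partition-EOF events (the broker address, group id and topic name are assumptions):

```python
from confluent_kafka import Consumer, KafkaError

consumer = Consumer({
    'bootstrap.servers': 'localhost:9092',  # assumed broker address
    'group.id': 'example-group',            # assumed group id
    'enable.partition.eof': True,           # must now be enabled explicitly
})
consumer.subscribe(['mytopic'])             # assumed topic name

msg = consumer.poll(timeout=1.0)
if msg is not None and msg.error():
    if msg.error().code() == KafkaError._PARTITION_EOF:
        print('Reached end of partition')
```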

group.id is now required for Python consumers.

Deprecated configuration properties

The following configuration properties have been deprecated. Use of any deprecated configuration property will result in a warning when the client instance is created. The deprecated configuration properties will be removed in a future release.

librdkafka:

  • offset.store.method=file is deprecated.
  • offset.store.path is deprecated.
  • offset.store.sync.interval.ms is deprecated.
  • produce.offset.report is no longer used. Offsets are always reported.
  • queuing.strategy was an experimental property that is now deprecated.
  • reconnect.backoff.jitter.ms is no longer used, see reconnect.backoff.ms and reconnect.backoff.max.ms.
  • socket.blocking.max.ms is no longer used.
  • topic.metadata.refresh.fast.cnt is no longer used.

confluent_kafka:

  • default.topic.config is deprecated.
  • CachedSchemaRegistryClient: the url argument was previously a str and is now a conf dict holding all application configuration properties (see the sketch below).
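
A minimal migration sketch for the CachedSchemaRegistryClient change (the registry URL is an assumption):

```python
from confluent_kafka.avro import CachedSchemaRegistryClient

# Before (deprecated): CachedSchemaRegistryClient(url='http://localhost:8081')
# Now: pass a conf dict with all application configuration properties.
sr_client = CachedSchemaRegistryClient({'url': 'http://localhost:8081'})  # assumed registry URL
```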

Idempotent Producer

This release adds support for Idempotent Producer, providing exactly-once producing and guaranteed ordering of messages.

Enabling idempotence is as simple as setting the enable.idempotence configuration property to true.
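
For example, a minimal sketch (the broker address is an assumption):

```python
from confluent_kafka import Producer

producer = Producer({
    'bootstrap.servers': 'localhost:9092',  # assumed broker address
    'enable.idempotence': True,             # guaranteed ordering, exactly-once producing
})
```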

There are no required application changes, but it is recommended to add support for the newly introduced fatal errors that will be triggered when the idempotent producer encounters an unrecoverable error that would break the ordering or duplication guarantees.

See Idempotent Producer in the manual and the Exactly once semantics blog post for more information.

Sparse connections

In previous releases librdkafka would maintain open connections to all brokers in the cluster and the bootstrap servers.

With this release librdkafka now connects to a single bootstrap server to retrieve the full broker list, and then connects to the brokers it needs to communicate with: partition leaders, group coordinators, etc.

For large scale deployments this greatly reduces the number of connections between clients and brokers, and avoids the repeated idle connection closes for unused connections.

Sparse connections are on by default (the recommended setting); the old behavior of connecting to all brokers in the cluster can be re-enabled by setting enable.sparse.connections=false.

See Sparse connections in the manual for more information.

Original issue librdkafka #825.

KIP-62 - max.poll.interval.ms is enforced

This release adds support for max.poll.interval.ms (KIP-62), which requires the application to call consumer.poll() at least every max.poll.interval.ms. Failure to do so will make the consumer automatically leave the group, causing a group rebalance; it will not rejoin the group until the application calls poll() again, triggering yet another group rebalance. max.poll.interval.ms is set to 5 minutes by default.
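
A minimal loop sketch (the broker address, group id, topic name and the handle() helper are assumptions):

```python
from confluent_kafka import Consumer

consumer = Consumer({
    'bootstrap.servers': 'localhost:9092',  # assumed broker address
    'group.id': 'example-group',            # assumed group id
    'max.poll.interval.ms': 300000,         # the 5 minute default, shown explicitly
})
consumer.subscribe(['mytopic'])             # assumed topic name

def handle(msg):
    # Hypothetical application logic; it must return well within
    # max.poll.interval.ms or the consumer is evicted from the group.
    print(msg.value())

while True:
    msg = consumer.poll(timeout=1.0)
    if msg is None:
        continue
    if msg.error():
        print(msg.error())
        continue
    handle(msg)
```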

Enhancements

  • OpenSSL version bumped to 1.0.2r
  • AvroProducer now supports encoding with fastavro (#492)
  • Simplify CachedSchemaRegistryClient configuration with configuration dict for application configs
  • Add Delete Schema support to CachedSchemaRegistryClient
  • CachedSchemaRegistryClient now supports HTTP Basic Auth (#440)
  • MessageSerializer now supports specifying reader schema (#470)

Fixes

  • Fix crash when calling Consumer.consume without setting group.id (now required)
  • CachedSchemaRegistryClient handles get_compatibility properly

Build/installation/tooling

  • Integration tests moved to docker-compose to aid in cluster set-up/tear-down
  • Runner script ./tests/run.sh added to simplify unit and integration test execution

v0.11.6

See librdkafka v0.11.6 release notes for enhancements and fixes in librdkafka.

New Features

  • Confluent Monitoring Interceptors are now included with Linux and OSX binary wheel distributions. (#464)
  • Experimental binary wheel distributions for Windows environments. (#451)

Enhancements

  • OpenSSL version bump to 1.0.2p. (#437)
  • Topic configurations have been moved into the global configuration dictionary to simplify configuration. The property default.topic.config has been deprecated and will be removed in 1.0, but it still takes precedence over topic configuration specified in the global configuration dictionary (see the sketch below). (#446)
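
A minimal migration sketch (the broker address and the acks property are illustrative assumptions):

```python
from confluent_kafka import Producer

# Before (deprecated): topic properties nested under default.topic.config
# conf = {'bootstrap.servers': 'localhost:9092',
#         'default.topic.config': {'acks': 'all'}}

# Now: topic properties go directly into the global configuration dictionary.
conf = {'bootstrap.servers': 'localhost:9092',  # assumed broker address
        'acks': 'all'}
producer = Producer(conf)
```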

Fixes

  • Handle debug configuration property prior to plugin.library.paths for enhanced debugging. (#464)
  • Fix memory leak in message headers. (#458)
  • Safely release handler resources. (#434, @coldeasy)