A horizontally scalable, highly available, multi-tenant, long term Prometheus.
This release contains 168 contributions from 29 contributors. We also have 16 new contributors. Thank you all for the contributions!
Some notable changes in this release are:
mem-ballast-size-bytes flag has been marked as deprecated and is not functional anymore
-querier.ingester-streaming flag has been marked as deprecated and ingester streaming is always enabled now
querier.iterators and querier.batch-iterators flags have been marked as deprecated and batch iterator is always enabled in Querier now

connection_string to support authenticating via SAS token. Marked msi_resource config as deprecating. #5645
-blocks-storage.bucket-store.index-cache.multilevel.max-async-concurrency and -blocks-storage.bucket-store.index-cache.multilevel.max-async-buffer-size configs and metric cortex_store_multilevel_index_cache_backfill_dropped_items_total for number of dropped items. #5661
distributor_samples_in_total. #5714
cortex_ruler_write_requests_total, cortex_ruler_write_requests_failed_total, cortex_ruler_queries_total, cortex_ruler_queries_failed_total, and cortex_ruler_query_seconds_total metrics for the tenant when the ruler deletes the manager for the tenant. #5772
mem-ballast-size-bytes flag as deprecated. #5816
-querier.ingester-streaming flag as deprecated. Now query ingester streaming is always enabled. #5817
-blocks-storage.bucket-store.block-discovery-strategy to configure different block listing strategy. Reverted the current recursive block listing mechanism and use the strategy Concurrent as in 1.15. #5828
-blocks-storage.s3.send-content-md5 flag and set default checksum algorithm to MD5. #5870
querier.iterators and querier.batch-iterators flags as deprecated. Now querier always uses batch iterators. #5868
cortex_ingester_tsdb_data_replay_duration_seconds. #5477
kuberesolver to resolve endpoints address with kubernetes:// prefix as Kubernetes service. #5731
tracing.otel.round-robin flag to use round_robin gRPC client side LB policy for sending OTLP traces. #5731
ruler.concurrent-evals-enabled flag to enable concurrent evaluation within a single rule group for independent rules. Maximum concurrency can be configured via ruler.max-concurrent-evals. #5766
zone_results_quorum_metadata. When querying ingesters using metadata APIs such as label names and values, only results from quorum number of zones will be included and merged. #5779
set_async_circuit_breaker_config to utilize the circuit breaker pattern for dynamically thresholding asynchronous set operations. Implemented in both memcached and redis cache clients. #5789
experimental.ruler.api-deduplicate-rules flag to remove duplicate rule groups from the Prometheus compatible rules API endpoint. Add experimental ruler.ring.replication-factor and ruler.ring.zone-awareness-enabled flags to configure rule group replication, but only the first ruler in the replicaset evaluates the rule group, the rest will just hold a copy as backup. Add experimental experimental.ruler.api-enable-rules-backup flag to configure rulers to send the rule group backups stored in the replicaset to handle events when a ruler is down during an API request to list rules. #5782
-store-gateway.enabled-tenants and -store-gateway.disabled-tenants to explicitly enable or disable store-gateway for specific tenants. #5638
cortex_compactor_start_duration_seconds. #5683
max_backfill_items to cap max items to backfill per async operation. #5686
query stats log. #5703
-server.log-request-headers enables logging HTTP request headers, -server.log-request-headers-exclude-list allows users to specify headers which should not be logged. #5744
querier.store-gateway-query-stats-enabled to enable or disable store gateway query stats log. #5749
cortex_ingester_max_inflight_query_requests. #5798
query_storage_wall_time to Query Frontend and Ruler query stats log for wall time spent on fetching data from storage. Query evaluation is not included. #5799
-querier.ignore-max-query-length flag to disable max query length check at Querier. #5808
ruler/manager.go. #5805
-ingester.tokens-generator-strategy=minimize-spread flag to enable the new minimize spread token generator strategy. #5855
Ownership Diff From Expected column in the ring table to indicate the extent to which the ownership of a specific ingester differs from the expected ownership. #5889
cache_size config correctly. #5734

Full Changelog: https://github.com/cortexproject/cortex/compare/v1.16.1...v1.17.0
This release builds on v1.17.0-rc.0 to include one bug fix and one change.
experimental.ruler.api-enable-rules-backup flag and use ruler.ring.replication-factor to check if rules backup is enabled. #5901

This release contains 166 contributions from 29 contributors. We also have 16 new contributors. Thank you all for the contributions!
Some notable changes in this release are:
mem-ballast-size-bytes flag has been marked as deprecated and is not functional anymore
-querier.ingester-streaming flag has been marked as deprecated and ingester streaming is always enabled now
querier.iterators and querier.batch-iterators flags have been marked as deprecated and batch iterator is always enabled in Querier now

connection_string to support authenticating via SAS token. Marked msi_resource config as deprecating. #5645
-blocks-storage.bucket-store.index-cache.multilevel.max-async-concurrency and -blocks-storage.bucket-store.index-cache.multilevel.max-async-buffer-size configs and metric cortex_store_multilevel_index_cache_backfill_dropped_items_total for number of dropped items. #5661
distributor_samples_in_total. #5714
cortex_ruler_write_requests_total, cortex_ruler_write_requests_failed_total, cortex_ruler_queries_total, cortex_ruler_queries_failed_total, and cortex_ruler_query_seconds_total metrics for the tenant when the ruler deletes the manager for the tenant. #5772
mem-ballast-size-bytes flag as deprecated. #5816
-querier.ingester-streaming flag as deprecated. Now query ingester streaming is always enabled. #5817
-blocks-storage.bucket-store.block-discovery-strategy to configure different block listing strategy. Reverted the current recursive block listing mechanism and use the strategy Concurrent as in 1.15. #5828
-blocks-storage.s3.send-content-md5 flag and set default checksum algorithm to MD5. #5870
querier.iterators and querier.batch-iterators flags as deprecated. Now querier always uses batch iterators. #5868
kuberesolver to resolve OTLP endpoints address with kubernetes:// prefix as Kubernetes service. #5731
tracing.otel.round-robin flag to use round_robin gRPC client side LB policy for sending OTLP traces. #5731
ruler.concurrent-evals-enabled flag to enable concurrent evaluation within a single rule group for independent rules. Maximum concurrency can be configured via ruler.max-concurrent-evals. #5766
zone_results_quorum_metadata. When querying ingesters using metadata APIs such as label names and values, only results from quorum number of zones will be included and merged. #5779
set_async_circuit_breaker_config to utilize the circuit breaker pattern for dynamically thresholding asynchronous set operations. Implemented in both memcached and redis cache clients. #5789
experimental.ruler.api-deduplicate-rules flag to remove duplicate rule groups from the Prometheus compatible rules API endpoint. Add experimental ruler.ring.replication-factor and ruler.ring.zone-awareness-enabled flags to configure rule group replication, but only the first ruler in the replicaset evaluates the rule group, the rest will just hold a copy as backup. Add experimental experimental.ruler.api-enable-rules-backup flag to configure rulers to send the rule group backups stored in the replicaset to handle events when a ruler is down during an API request to list rules. #5782
-ingester.tokens-generator-strategy=minimize-spread flag to enable the new minimize spread token generator strategy. #5855
Ownership Diff From Expected column in the ring table to indicate the extent to which the ownership of a specific ingester differs from the expected ownership. #5889
cortex_ingester_tsdb_data_replay_duration_seconds. #5477
-store-gateway.enabled-tenants and -store-gateway.disabled-tenants to explicitly enable or disable store-gateway for specific tenants. #5638
cortex_compactor_start_duration_seconds. #5683
max_backfill_items to cap max items to backfill per async operation. #5686
query stats log. #5703
-server.log-request-headers enables logging HTTP request headers, -server.log-request-headers-exclude-list allows users to specify headers which should not be logged. #5744
querier.store-gateway-query-stats-enabled to enable or disable store gateway query stats log. #5749
cortex_ingester_max_inflight_query_requests. #5798
query_storage_wall_time to Query Frontend and Ruler query stats log for wall time spent on fetching data from storage. Query evaluation is not included. #5799
-querier.ignore-max-query-length flag to disable max query length check at Querier. #5808
ruler/manager.go. #5805
cache_size config correctly. #5734
ingestion_tenant_shard_size set to 0, default sharding strategy should be used. #5189
keep_firing_for field in alert rules. #5823

Full Changelog: https://github.com/cortexproject/cortex/compare/v1.16.1...v1.17.0-rc.0
This release includes two security fixes:
This release contains 227 contributions from 27 contributors. We also have 10 new contributors. Thank you all for the contribution!
Some notable changes in this release are:
cortex_alertmanager_notifications_failed_total. #5409
cortex_ruler_write_requests_total, cortex_ruler_write_requests_failed_total, cortex_ruler_queries_total, and cortex_ruler_queries_failed_total metrics. #5312
native-histogram-sample on the cortex_discarded_samples_total to keep track of discarded native histogram samples. #5289
cortex_bucket_store_cached_postings_compression_time_seconds to cortex_bucket_store_cached_postings_compression_time_seconds_total. #5431
cortex_bucket_store_cached_series_fetch_duration_seconds to cortex_bucket_store_series_fetch_duration_seconds and cortex_bucket_store_cached_postings_fetch_duration_seconds to cortex_bucket_store_postings_fetch_duration_seconds. Add new metric cortex_bucket_store_chunks_fetch_duration_seconds. #5448
idle_timeout, max_conn_age, pool_size, min_idle_conns fields for Redis index cache and caching bucket. #5448
-store-gateway.sharding-ring.zone-stable-shuffle-sharding to enable store gateway to use zone stable shuffle sharding. #5489
series_max_size and chunk_max_size to bucket index. #5489
cortex_bucket_store_chunk_pool_returned_bytes_total and cortex_bucket_store_chunk_pool_requested_bytes_total to cortex_bucket_store_chunk_pool_operation_bytes_total. #5552
api.build-info-enabled to enable it. #5533
dynamodb_kv_read_capacity_total to dynamodb_kv_consumed_capacity_total and include Delete, Put, Batch dimension. #5487
-querier.max-subquery-steps to configure subquery max steps check. By default, the limit is set to 0, which is disabled. #5656
Limit field on RuleGroup. #5528
-admin-limit-message to customize the message contained in limit errors. #5460
blocks-storage.bucket-store.lazy-expanded-postings-enabled and new metrics cortex_bucket_store_lazy_expanded_postings_total, cortex_bucket_store_lazy_expanded_posting_size_bytes_total and cortex_bucket_store_lazy_expanded_posting_series_overfetched_size_bytes_total. #5556
max_downloaded_bytes_per_request to limit max bytes to download per store gateway request. #5179
-alertmanager.alertmanager-client.grpc-max-send-msg-size and -alertmanager.alertmanager-client.grpc-max-recv-msg-size to configure alert manager grpc client message size limits. #5338
-alertmanager.api-concurrency to configure alert manager api concurrency limit. #5412
-store-gateway.sharding-ring.keep-instance-in-the-ring-on-shutdown to skip unregistering instance from the ring in shutdown. #5421
-compactor.ring.tokens-file-path to store generated tokens locally. #5432
-frontend.retry-on-too-many-outstanding-requests to re-enqueue 429 requests if there are multiple query-schedulers available. #5496
-blocks-storage.bucket-store.max-inflight-requests for store gateways to reject further series requests upon reaching the limit. #5553
cortex_bucket_store_block_load_duration_seconds histogram to track time to load blocks. #5580
-alertmanager.alerts-gc-interval to configure alerts garbage collection interval. #5550
cortex_rejected_queries_total metric for throttled queries. #5356
SampleStream. #5349
cortex_alertmanager_dispatcher_aggregation_groups and cortex_alertmanager_dispatcher_alert_processing_duration_seconds metrics for dispatcher. #5592
blocks-storage.bucket-store.series-batch-size to control how many series to fetch per batch in Store Gateway. #5582
cortex_ruler_rule_group_load_duration_seconds and cortex_ruler_rule_group_sync_duration_seconds metrics. #5609
accept-malformed-index to Cortex compactor. #5334
log.Valuer evaluation for disallowed levels. #5297
puller-sync-time to allow different pull time for ring. #5357
max_concurrent as a metric. #5362
cortex_bucket_store_sent_chunk_size_bytes, cortex_bucket_store_postings_size_bytes and cortex_bucket_store_empty_postings_total. #5397
estimated_max_series_size_bytes and estimated_max_chunk_size_bytes to address data overfetch. #5401
-distributor.sign_write_requests flag to sign the write requests. #5430
unregister_on_shutdown to be configurable. #5503
cortex_bucket_store_chunk_refetches_total for number of chunk refetches. #5532
cortex_store_multilevel_index_cache_fetch_duration_seconds and cortex_store_multilevel_index_cache_backfill_duration_seconds to measure fetch and backfill latency. #5596
cortex_ingester_tsdb_head_samples_appended_total, cortex_ingester_tsdb_head_out_of_order_samples_appended_total, cortex_ingester_tsdb_snapshot_replay_error_total, cortex_ingester_tsdb_sample_ooo_delta and cortex_ingester_tsdb_mmap_chunks_total. #5624
wait_instance_time_out to context to avoid waiting forever. #5581
ResourceExhausted status code from store gateway to 422 limit error. #5286
JOINING state to read operation. #5346
DialTimeout period. #5392

This release builds on v1.16.0-rc.0 to include one bug fix and one change.
This release contains 227 contributions from 27 contributors. We also have 10 new contributors. Thank you all for the contribution!
Some notable changes in this release are:
cortex_alertmanager_notifications_failed_total. #5409
cortex_ruler_write_requests_total, cortex_ruler_write_requests_failed_total, cortex_ruler_queries_total, and cortex_ruler_queries_failed_total metrics. #5312
native-histogram-sample on the cortex_discarded_samples_total to keep track of discarded native histogram samples. #5289
cortex_bucket_store_cached_postings_compression_time_seconds to cortex_bucket_store_cached_postings_compression_time_seconds_total. #5431
cortex_bucket_store_cached_series_fetch_duration_seconds to cortex_bucket_store_series_fetch_duration_seconds and cortex_bucket_store_cached_postings_fetch_duration_seconds to cortex_bucket_store_postings_fetch_duration_seconds. Add new metric cortex_bucket_store_chunks_fetch_duration_seconds. #5448
idle_timeout, max_conn_age, pool_size, min_idle_conns fields for Redis index cache and caching bucket. #5448
-store-gateway.sharding-ring.zone-stable-shuffle-sharding to enable store gateway to use zone stable shuffle sharding. #5489
series_max_size and chunk_max_size to bucket index. #5489
cortex_bucket_store_chunk_pool_returned_bytes_total and cortex_bucket_store_chunk_pool_requested_bytes_total to cortex_bucket_store_chunk_pool_operation_bytes_total. #5552
api.build-info-enabled to enable it. #5533
dynamodb_kv_read_capacity_total to dynamodb_kv_consumed_capacity_total and include Delete, Put, Batch dimension. #5487
Limit field on RuleGroup. #5528
-admin-limit-message to customize the message contained in limit errors. #5460
blocks-storage.bucket-store.lazy-expanded-postings-enabled and new metrics cortex_bucket_store_lazy_expanded_postings_total, cortex_bucket_store_lazy_expanded_posting_size_bytes_total and cortex_bucket_store_lazy_expanded_posting_series_overfetched_size_bytes_total. #5556
max_downloaded_bytes_per_request to limit max bytes to download per store gateway request. #5179
-alertmanager.alertmanager-client.grpc-max-send-msg-size and -alertmanager.alertmanager-client.grpc-max-recv-msg-size to configure alert manager grpc client message size limits. #5338
-alertmanager.api-concurrency to configure alert manager api concurrency limit. #5412
-store-gateway.sharding-ring.keep-instance-in-the-ring-on-shutdown to skip unregistering instance from the ring in shutdown. #5421
-compactor.ring.tokens-file-path to store generated tokens locally. #5432
-frontend.retry-on-too-many-outstanding-requests to re-enqueue 429 requests if there are multiple query-schedulers available. #5496
-blocks-storage.bucket-store.max-inflight-requests for store gateways to reject further series requests upon reaching the limit. #5553
cortex_bucket_store_block_load_duration_seconds histogram to track time to load blocks. #5580
-alertmanager.alerts-gc-interval to configure alerts garbage collection interval. #5550
cortex_rejected_queries_total metric for throttled queries. #5356
SampleStream. #5349
cortex_alertmanager_dispatcher_aggregation_groups and cortex_alertmanager_dispatcher_alert_processing_duration_seconds metrics for dispatcher. #5592
blocks-storage.bucket-store.series-batch-size to control how many series to fetch per batch in Store Gateway. #5582
cortex_ruler_rule_group_load_duration_seconds and cortex_ruler_rule_group_sync_duration_seconds metrics. #5609
accept-malformed-index to Cortex compactor. #5334
log.Valuer evaluation for disallowed levels. #5297
puller-sync-time to allow different pull time for ring. #5357
max_concurrent as a metric. #5362
cortex_bucket_store_sent_chunk_size_bytes, cortex_bucket_store_postings_size_bytes and cortex_bucket_store_empty_postings_total. #5397
estimated_max_series_size_bytes and estimated_max_chunk_size_bytes to address data overfetch. #5401
-distributor.sign_write_requests flag to sign the write requests. #5430
unregister_on_shutdown to be configurable. #5503
cortex_bucket_store_chunk_refetches_total for number of chunk refetches. #5532
cortex_store_multilevel_index_cache_fetch_duration_seconds and cortex_store_multilevel_index_cache_backfill_duration_seconds to measure fetch and backfill latency. #5596
cortex_ingester_tsdb_head_samples_appended_total, cortex_ingester_tsdb_head_out_of_order_samples_appended_total, cortex_ingester_tsdb_snapshot_replay_error_total, cortex_ingester_tsdb_sample_ooo_delta and cortex_ingester_tsdb_mmap_chunks_total. #5624
wait_instance_time_out to context to avoid waiting forever. #5581
ResourceExhausted status code from store gateway to 422 limit error. #5286
JOINING state to read operation. #5346
DialTimeout period. #5392

This release includes:
This release includes a Go runtime upgrade to 1.20.4 to address a critical CVE.
This release includes:
ResourceExhausted status code from store gateway to 422 limit error. #5286