Releases: cortexproject/cortex
Cortex 1.5.0-rc.0
Changelog
Cortex
- [CHANGE] Blocks storage: update the default HTTP configuration values for the S3 client to the upstream Thanos default values. #3244
  - `-blocks-storage.s3.http.idle-conn-timeout` is set to 90 seconds.
  - `-blocks-storage.s3.http.response-header-timeout` is set to 2 minutes.
- [CHANGE] Improved shuffle sharding support in the write path. This work introduced some config changes: #3090
  - Introduced `-distributor.sharding-strategy` CLI flag (and its respective `sharding_strategy` YAML config option) to explicitly specify which sharding strategy should be used in the write path
  - `-experimental.distributor.user-subring-size` flag renamed to `-distributor.ingestion-tenant-shard-size`
  - `user_subring_size` limit YAML config option renamed to `ingestion_tenant_shard_size`
- [CHANGE] Dropped "blank Alertmanager configuration; using fallback" message from Info to Debug level. #3205
- [CHANGE] Zone-awareness replication for time-series now should be explicitly enabled in the distributor via the `-distributor.zone-awareness-enabled` CLI flag (or its respective YAML config option). Before, zone-aware replication was implicitly enabled if a zone was set on ingesters. #3200
- [CHANGE] Removed the deprecated CLI flag `-config-yaml`. You should use `-schema-config-file` instead. #3225
- [CHANGE] Enforced the HTTP method required by some API endpoints which did (incorrectly) allow any method before that. #3228
  - `GET /`
  - `GET /config`
  - `GET /debug/fgprof`
  - `GET /distributor/all_user_stats`
  - `GET /distributor/ha_tracker`
  - `GET /all_user_stats`
  - `GET /ha-tracker`
  - `GET /api/v1/user_stats`
  - `GET /api/v1/chunks`
  - `GET <legacy-http-prefix>/user_stats`
  - `GET <legacy-http-prefix>/chunks`
  - `GET /services`
  - `GET /multitenant_alertmanager/status`
  - `GET /status` (alertmanager microservice)
  - `GET|POST /ingester/ring`
  - `GET|POST /ring`
  - `GET|POST /store-gateway/ring`
  - `GET|POST /compactor/ring`
  - `GET|POST /ingester/flush`
  - `GET|POST /ingester/shutdown`
  - `GET|POST /flush`
  - `GET|POST /shutdown`
  - `GET|POST /ruler/ring`
  - `POST /api/v1/push`
  - `POST <legacy-http-prefix>/push`
  - `POST /push`
  - `POST /ingester/push`
- [CHANGE] Renamed the CLI flags used to configure the network interface names from which the instance IP is automatically detected. #3295
  - `-compactor.ring.instance-interface` renamed to `-compactor.ring.instance-interface-names`
  - `-store-gateway.sharding-ring.instance-interface` renamed to `-store-gateway.sharding-ring.instance-interface-names`
  - `-distributor.ring.instance-interface` renamed to `-distributor.ring.instance-interface-names`
  - `-ruler.ring.instance-interface` renamed to `-ruler.ring.instance-interface-names`
- [CHANGE] Renamed `-<prefix>.redis.enable-tls` CLI flag to `-<prefix>.redis.tls-enabled`, and its respective YAML config option from `enable_tls` to `tls_enabled`. #3298
- [CHANGE] Increased default `-<prefix>.redis.timeout` from `100ms` to `500ms`. #3301
- [CHANGE] `cortex_alertmanager_config_invalid` has been removed in favor of `cortex_alertmanager_config_last_reload_successful`. #3289
- [CHANGE] Query-frontend: POST requests whose body size exceeds 10MiB will be rejected. The max body size can be customised via `-frontend.max-body-size`. #3276
- [FEATURE] Shuffle sharding: added support for shuffle-sharding queriers in the query-frontend. When configured (`-frontend.max-queriers-per-tenant` globally, or using per-tenant limit `max_queriers_per_tenant`), each tenant's requests will be handled by a different set of queriers. #3113 #3257
- [FEATURE] Shuffle sharding: added support for shuffle-sharding ingesters on the read path. When ingesters shuffle-sharding is enabled and `-querier.shuffle-sharding-ingesters-lookback-period` is set, queriers will fetch in-memory series from the minimum set of required ingesters, selecting only ingesters which may have received series since 'now - lookback period'. #3252
- [FEATURE] Query-frontend: added `compression` config to support results cache with compression. #3217
- [FEATURE] Add OpenStack Swift support to blocks storage. #3303
- [FEATURE] Added support for applying Prometheus relabel configs on series received by the distributor. A `metric_relabel_configs` field has been added to the per-tenant limits configuration. #3329
- [FEATURE] Support for Cassandra client SSL certificates. #3384
- [ENHANCEMENT] Ruler: Introduces two new limits `-ruler.max-rules-per-rule-group` and `-ruler.max-rule-groups-per-tenant` to control the number of rules per rule group and the total number of rule groups for a given user. They are disabled by default. #3366
- [ENHANCEMENT] Allow to specify multiple comma-separated Cortex services to `-target` CLI option (or its respective YAML config option). For example, `-target=all,compactor` can be used to start Cortex single-binary with compactor as well. #3275
- [ENHANCEMENT] Expose additional HTTP configs for the S3 backend client. New flags are listed below: #3244
  - `-blocks-storage.s3.http.idle-conn-timeout`
  - `-blocks-storage.s3.http.response-header-timeout`
  - `-blocks-storage.s3.http.insecure-skip-verify`
- [ENHANCEMENT] Added `cortex_query_frontend_connected_clients` metric to show the number of workers currently connected to the frontend. #3207
- [ENHANCEMENT] Shuffle sharding: improved shuffle sharding in the write path. Shuffle sharding now should be explicitly enabled via `-distributor.sharding-strategy` CLI flag (or its respective YAML config option) and guarantees stability, consistency, shuffling and balanced zone-awareness properties. #3090 #3214
- [ENHANCEMENT] Ingester: added new metric `cortex_ingester_active_series` to track active series more accurately. Also added options to control whether active series tracking is enabled (`-ingester.active-series-enabled`, defaults to false), how often this metric is updated (`-ingester.active-series-update-period`), and the max idle time for series to be considered inactive (`-ingester.active-series-idle-timeout`). #3153
- [ENHANCEMENT] Store-gateway: added zone-aware replication support to blocks replication in the store-gateway. #3200
- [ENHANCEMENT] Store-gateway: exported new metrics. #3231
  - `cortex_bucket_store_cached_series_fetch_duration_seconds`
  - `cortex_bucket_store_cached_postings_fetch_duration_seconds`
  - `cortex_bucket_stores_gate_queries_max`
- [ENHANCEMENT] Added `-version` flag to Cortex. #3233
- [ENHANCEMENT] Hash ring: added instance registered timestamp to the ring. #3248
- [ENHANCEMENT] Reduce tail latency by smoothing out spikes in rate of chunk flush operations. #3191
- [ENHANCEMENT] Use Cortex as User Agent in HTTP requests issued by Configs DB client. #3264
- [ENHANCEMENT] Experimental Ruler API: Fetch rule groups from object storage in parallel. #3218
- [ENHANCEMENT] Chunks GCS object storage client uses the `fields` selector to limit the payload size when listing objects in the bucket. #3218 #3292
- [ENHANCEMENT] Added shuffle sharding support to ruler. Added new metric `cortex_ruler_sync_rules_total`. #3235
- [ENHANCEMENT] Return an explicit error when the store-gateway is explicitly requested without a blocks storage engine. #3287
- [ENHANCEMENT] Ruler: only load rules that belong to the ruler. Improves rules syncing performance when ruler sharding is enabled. #3269
- [ENHANCEMENT] Added `-<prefix>.redis.tls-insecure-skip-verify` flag. #3298
- [ENHANCEMENT] Added `cortex_alertmanager_config_last_reload_successful_seconds` metric to show timestamp of last successful AM config reload. #3289
- [ENHANCEMENT] Blocks storage: reduced number of bucket listing operations to list block content (applies to newly created blocks only). #3363
- [ENHANCEMENT] Ruler: Include the tenant ID on the notifier logs. #3372
- [ENHANCEMENT] Blocks storage Compactor: Added `-compactor.enabled-tenants` and `-compactor.disabled-tenants` to explicitly enable or disable compaction of specific tenants. #3385
- [ENHANCEMENT] Blocks storage ingester: Creating checkpoint only once even when there are multiple Head compactions in a single `Compact()` call. #3373
- [BUGFIX] Blocks storage ingester: Read repair memory-mapped chunks file which can end up being empty on abrupt shutdowns combined with faulty disks. #3373
- [BUGFIX] Blocks storage ingester: Close TSDB resources on failed startup preventing ingester OOMing. #3373
- [BUGFIX] No-longer-needed ingester operations for queries triggered by queriers and rulers are now canceled. #3178
- [BUGFIX] Ruler: directories in the configured `rules-path` will be removed on startup and shutdown in order to ensure they don't persist between runs. #3195
- [BUGFIX] Handle hash-collisions in the query path. #3192
- [BUGFIX] Check for postgres rows errors. #3197
- [BUGFIX] Ruler Experimental API: Don't allow rule groups without names or empty rule groups. #3210
- [BUGFIX] Experimental Alertmanager API: Do not allow empty Alertmanager configurations or bad template filenames to be submitted through the configuration API. #3185
- [BUGFIX] Reduce failures to update heartbeat when using Consul. #3259
- [BUGFIX] When using ruler sharding, moving all user rule groups from ruler to a different one and then back could end up with some user groups not being evaluated at all. #3235
- [BUGFIX] Fixed shuffle sharding consistency when zone-awareness is enabled and the shard size is increased or instances in a new zone are added. #3299
- [BUGFIX] Use a valid grpc header when logging IP addresses. #3307
- [BUGFIX] Fixed the metric `cortex_prometheus_rule_group_duration_seconds` in the Ruler, it wouldn't report any values. #3310
- [BUGFIX] Fixed gRPC connections leaking in rulers when rulers sharding is enabled and APIs called. #3314
- [BUGFIX] Fixed shuffle sharding consistency when zone-awareness is enabled and the shard size is increased or instances...
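Several entries in this release rename or replace CLI flags. As a rough aid for updating launch scripts, the following sketch rewrites an argument list using only the old and new names listed in the changelog above. The function name `migrate_args` and the table name `RENAMED_FLAGS` are hypothetical, and values are passed through unchanged, so it is worth checking whether a new flag expects a different value format.

```python
# Old-to-new CLI flag names, copied from the 1.5.0-rc.0 changelog entries.
# The helper itself is illustrative and not part of Cortex.
RENAMED_FLAGS = {
    "-compactor.ring.instance-interface": "-compactor.ring.instance-interface-names",
    "-store-gateway.sharding-ring.instance-interface": "-store-gateway.sharding-ring.instance-interface-names",
    "-distributor.ring.instance-interface": "-distributor.ring.instance-interface-names",
    "-ruler.ring.instance-interface": "-ruler.ring.instance-interface-names",
    "-experimental.distributor.user-subring-size": "-distributor.ingestion-tenant-shard-size",
    "-config-yaml": "-schema-config-file",
}


def migrate_args(args):
    """Return a copy of args with renamed flags replaced by their new names."""
    migrated = []
    for arg in args:
        # Split "-flag=value" into name and value; a bare flag has no "=".
        name, sep, value = arg.partition("=")
        migrated.append(RENAMED_FLAGS.get(name, name) + sep + value)
    return migrated


old_args = ["-target=all", "-compactor.ring.instance-interface=eth0"]
print(migrate_args(old_args))
```

Flags that are not in the table, and positional values, are left untouched.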
Cortex 1.4.0
This Cortex release features 112 contributions from 32 authors and exciting news!
Highlights
- Cortex blocks storage is now GA.
- Cassandra support for the chunks storage is now GA.
- Redis caching backend now supports Redis sentinel and Redis cluster too.
- Introduced shuffle sharding support to store-gateway blocks sharding (blocks storage).
- The ruler and alertmanager got several improvements.
- Last but not least, many enhancements, optimisations and bug fixes.
Please refer to the changelog for the full list of changes and improvements.
Changelog
- [CHANGE] Cassandra backend support is now GA (stable). #3180
- [CHANGE] Blocks storage is now GA (stable). The `-experimental` prefix has been removed from all CLI flags related to the blocks storage (no YAML config changes). #3180
  - `-experimental.blocks-storage.*` flags renamed to `-blocks-storage.*`
  - `-experimental.store-gateway.*` flags renamed to `-store-gateway.*`
  - `-experimental.querier.store-gateway-client.*` flags renamed to `-querier.store-gateway-client.*`
  - `-experimental.querier.store-gateway-addresses` flag renamed to `-querier.store-gateway-addresses`
  - `-store-gateway.replication-factor` flag renamed to `-store-gateway.sharding-ring.replication-factor`
  - `-store-gateway.tokens-file-path` flag renamed to `-store-gateway.sharding-ring.tokens-file-path`
- [CHANGE] Ingester: Removed deprecated untyped record from chunks WAL. Only if you are running `v1.0` or below, it is recommended to first upgrade to `v1.1`/`v1.2`/`v1.3` and run it for a day before upgrading to `v1.4` to avoid data loss. #3115
- [CHANGE] Distributor API endpoints are no longer served unless target is set to `distributor` or `all`. #3112
- [CHANGE] Increase the default Cassandra client replication factor to 3. #3007
- [CHANGE] Blocks storage: removed the support to transfer blocks between ingesters on shutdown. When running the Cortex blocks storage, ingesters are expected to run with a persistent disk. The following metrics have been removed: #2996
  - `cortex_ingester_sent_files`
  - `cortex_ingester_received_files`
  - `cortex_ingester_received_bytes_total`
  - `cortex_ingester_sent_bytes_total`
- [CHANGE] The buckets for the `cortex_chunk_store_index_lookups_per_query` metric have been changed to 1, 2, 4, 8, 16. #3021
- [CHANGE] Blocks storage: the `operation` label value `getrange` has changed into `get_range` for the metrics `thanos_store_bucket_cache_operation_requests_total` and `thanos_store_bucket_cache_operation_hits_total`. #3000
- [CHANGE] Experimental Delete Series: `/api/v1/admin/tsdb/delete_series` and `/api/v1/admin/tsdb/cancel_delete_request` purger APIs to return status code `204` instead of `200` for success. #2946
- [CHANGE] Histogram `cortex_memcache_request_duration_seconds` `method` label value changes from `Memcached.Get` to `Memcached.GetBatched` for batched lookups, and is not reported for non-batched lookups (label value `Memcached.GetMulti` remains, and had exactly the same value as `Get` in nonbatched lookups). The same change applies to tracing spans. #3046
- [CHANGE] TLS server validation is now enabled by default, a new parameter `tls_insecure_skip_verify` can be set to true to skip validation optionally. #3030
- [CHANGE] `cortex_ruler_config_update_failures_total` has been removed in favor of `cortex_ruler_config_last_reload_successful`. #3056
- [CHANGE] `ruler.evaluation_delay_duration` field in YAML config has been moved and renamed to `limits.ruler_evaluation_delay_duration`. #3098
- [CHANGE] Removed obsolete `results_cache.max_freshness` from YAML config (deprecated since Cortex 1.2). #3145
- [CHANGE] Removed obsolete `-promql.lookback-delta` option (deprecated since Cortex 1.2, replaced with `-querier.lookback-delta`). #3144
- [CHANGE] Cache: added support for Redis Cluster and Redis Sentinel. #2961
  - The following changes have been made in Redis configuration:
    - `-redis.master_name` added
    - `-redis.db` added
    - `-redis.max-active-conns` changed to `-redis.pool-size`
    - `-redis.max-conn-lifetime` changed to `-redis.max-connection-age`
    - `-redis.max-idle-conns` removed
    - `-redis.wait-on-pool-exhaustion` removed
- [CHANGE] TLS configuration for gRPC, HTTP and etcd clients is now marked as experimental. These features are not yet fully baked, and we expect possible small breaking changes in Cortex 1.5. #3198
- [CHANGE] Fixed store-gateway CLI flags inconsistencies. #3201
  - `-store-gateway.replication-factor` flag renamed to `-store-gateway.sharding-ring.replication-factor`
  - `-store-gateway.tokens-file-path` flag renamed to `-store-gateway.sharding-ring.tokens-file-path`
- [FEATURE] Logging of the source IP passed along by a reverse proxy is now supported by setting the `-server.log-source-ips-enabled`. For non standard headers the settings `-server.log-source-ips-header` and `-server.log-source-ips-regex` can be used. #2985
- [FEATURE] Blocks storage: added shuffle sharding support to store-gateway blocks sharding. Added the following additional metrics to store-gateway: #3069
  - `cortex_bucket_stores_tenants_discovered`
  - `cortex_bucket_stores_tenants_synced`
- [FEATURE] Experimental blocksconvert: introduce an experimental tool `blocksconvert` to migrate long-term storage chunks to blocks. #3092 #3122 #3127 #3162
- [ENHANCEMENT] Add support for azure storage in China, German and US Government environments. #2988
- [ENHANCEMENT] Query-tee: added a small tolerance to floating point sample values comparison. #2994
- [ENHANCEMENT] Query-tee: add support for doing a passthrough of requests to preferred backend for unregistered routes #3018
- [ENHANCEMENT] Expose `storage.aws.dynamodb.backoff_config` configuration file field. #3026
- [ENHANCEMENT] Added `cortex_request_message_bytes` and `cortex_response_message_bytes` histograms to track received and sent gRPC message and HTTP request/response sizes. Added `cortex_inflight_requests` gauge to track number of inflight gRPC and HTTP requests. #3064
- [ENHANCEMENT] Publish ruler's ring metrics. #3074
- [ENHANCEMENT] Add config validation to the experimental Alertmanager API. Invalid configs are no longer accepted. #3053
- [ENHANCEMENT] Add "integration" as a label for `cortex_alertmanager_notifications_total` and `cortex_alertmanager_notifications_failed_total` metrics. #3056
- [ENHANCEMENT] Add `cortex_ruler_config_last_reload_successful` and `cortex_ruler_config_last_reload_successful_seconds` to check status of users rule manager. #3056
- [ENHANCEMENT] The configuration validation now fails if an empty YAML node has been set for a root YAML config property. #3080
- [ENHANCEMENT] Memcached dial() calls now have a circuit-breaker to avoid hammering a broken cache. #3051, #3189
- [ENHANCEMENT] `-ruler.evaluation-delay-duration` is now overridable as a per-tenant limit, `ruler_evaluation_delay_duration`. #3098
- [ENHANCEMENT] Add TLS support to etcd client. #3102
- [ENHANCEMENT] When a tenant accesses the Alertmanager UI or its API, if we have valid `-alertmanager.configs.fallback` we'll use that to start the manager and avoid failing the request. #3073
- [ENHANCEMENT] Add `DELETE api/v1/rules/{namespace}` to the Ruler. It allows all the rule groups of a namespace to be deleted. #3120
- [ENHANCEMENT] Experimental Delete Series: Retry processing of Delete requests during failures. #2926
- [ENHANCEMENT] Improve performance of QueryStream() in ingesters. #3177
- [ENHANCEMENT] Modules included in "All" target are now visible in output of `-modules` CLI flag. #3155
- [ENHANCEMENT] Added `/debug/fgprof` endpoint to debug running Cortex process using `fgprof`. This adds up to the existing `/debug/...` endpoints. #3131
- [ENHANCEMENT] Blocks storage: optimised `/api/v1/series` for blocks storage. (#2976)
- [BUGFIX] Ruler: when loading rules from "local" storage, check for directory after resolving symlink. #3137
- [BUGFIX] Query-frontend: Fixed rounding for incoming query timestamps, to be 100% Prometheus compatible. #2990
- [BUGFIX] Querier: Merge results from chunks and blocks ingesters when using streaming of results. #3013
- [BUGFIX] Querier: query /series from ingesters regardless of the `-querier.query-ingesters-within` setting. #3035
- [BUGFIX] Blocks storage: Ingester is less likely to hit gRPC message size limit when streaming data to queriers. #3015
- [BUGFIX] Blocks storage: fixed memberlist support for the store-gateways and compactors ring used when blocks sharding is enabled. #3058 #3095
- [BUGFIX] Fix configuration for TLS server validation, TLS skip verify was hardcoded to true for all TLS configurations and prevented validation of server certificates. #3030
- [BUGFIX] Fixes the Alertmanager panicking when no `-alertmanager.web.external-url` is provided. #3017
- [BUGFIX] Fixes the registration of the Alertmanager API metrics `cortex_alertmanager_alerts_received_total` and `cortex_alertmanager_alerts_invalid_total`. #3065
- [BUGFIX] Fixes `flag needs an argument: -config.expand-env` error. #3087
- [BUGFIX] An index optimisation actually slows things down when using caching. Moved it to the right location. #2973
- [BUGFIX] Ingester: If push request contained both valid and invalid samples, valid samples were ingested but not stored to WAL of the chunks storage. This has been fixed. #3067
- [BUGFIX] Cassandra: fixed consistency setting in the CQL session when creating the keyspace. #3105
- [BUGFIX] Ruler: Config API would return both the `record` and `alert` in `YAML` response keys even when one of them must be empty. #3120
- [BUGFIX] Index page now uses configured HTTP path prefix when creating links. #3126
- [BUGFIX] Purger: fixed deadlock when reloading of tombstones failed. #3182
- [BU...
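The Memcached entry above (#3051, #3189) mentions a circuit-breaker around `dial()` calls. The changelog does not describe how it is implemented, so the following is only a generic sketch of the pattern; the class name, thresholds, and injected clock are all assumptions made for illustration and do not reflect Cortex's code.

```python
import time


class CircuitBreaker:
    """Generic circuit breaker: stop calling a failing dependency for a while."""

    def __init__(self, max_failures=3, cooldown_seconds=10.0, clock=time.monotonic):
        self.max_failures = max_failures          # consecutive failures before opening
        self.cooldown_seconds = cooldown_seconds  # how long the breaker stays open
        self.clock = clock                        # injectable clock, useful for tests
        self.failures = 0
        self.opened_at = None                     # None means the breaker is closed

    def call(self, func):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.cooldown_seconds:
                # Still cooling down: fail fast without touching the dependency.
                raise RuntimeError("circuit open: skipping call")
            # Cooldown elapsed: allow one trial call; a single failure re-opens.
            self.opened_at = None
            self.failures = self.max_failures - 1
        try:
            result = func()
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = self.clock()
            raise
        self.failures = 0  # any success resets the failure count
        return result
```

A caller would wrap each connection attempt, for example `breaker.call(lambda: connect(address))`, where `connect` stands for whatever dial function is in use.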
Cortex 1.4.0-rc.1
This is the second release candidate for Cortex `1.4.0`.
Changelog
- [CHANGE] TLS configuration for gRPC, HTTP and etcd clients is now marked as experimental. These features are not yet fully baked, and we expect possible small breaking changes in Cortex 1.5. #3198
- [CHANGE] Fixed store-gateway CLI flags inconsistencies. #3201
  - `-store-gateway.replication-factor` flag renamed to `-store-gateway.sharding-ring.replication-factor`
  - `-store-gateway.tokens-file-path` flag renamed to `-store-gateway.sharding-ring.tokens-file-path`
- [BUGFIX] Handle hash-collisions in the query path. Before this fix, Cortex could occasionally mix up two different series in a query, leading to invalid results, when `-querier.ingester-streaming` was used. #3192
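To illustrate the kind of problem the hash-collision fix addresses, here is a minimal sketch of grouping samples by series when two different label sets can share a hash. It is not Cortex's implementation: the `SeriesSet` class and its methods are hypothetical, and the technique shown (keeping a bucket per hash and comparing full label sets within it) is one common way to avoid mixing up colliding series.

```python
class SeriesSet:
    """Groups samples by series, tolerating hash collisions between label sets."""

    def __init__(self, hash_func=hash):
        self.hash_func = hash_func  # injectable so collisions can be forced in tests
        self.buckets = {}           # hash value -> list of (label key, samples)

    def add(self, labels, sample):
        # Canonical, hashable form of the label set.
        key = tuple(sorted(labels.items()))
        bucket = self.buckets.setdefault(self.hash_func(key), [])
        for existing_key, samples in bucket:
            if existing_key == key:  # same hash AND same labels: same series
                samples.append(sample)
                return
        # Same hash but different labels (or an empty bucket): a new series.
        bucket.append((key, [sample]))

    def series(self):
        return [(dict(key), samples)
                for bucket in self.buckets.values()
                for key, samples in bucket]
```

Keying the map on the hash alone would merge the samples of two colliding series into one, which is the sort of mix-up the entry describes.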
Cortex 1.4.0-rc.0
This Cortex release features 112 contributions from 32 authors and exciting news!
Highlights
- Cortex blocks storage is now GA.
- Cassandra support for the chunks storage is now GA.
- Redis caching backend now supports Redis sentinel and Redis cluster too.
- Introduced shuffle sharding support to store-gateway blocks sharding (blocks storage).
- The ruler and alertmanager got several improvements.
- Last but not least, many enhancements, optimisations and bug fixes.
Please refer to the changelog for the full list of changes and improvements.
Changelog
- [CHANGE] Cassandra backend support is now GA (stable). #3180
- [CHANGE] Blocks storage is now GA (stable). The `-experimental` prefix has been removed from all CLI flags related to the blocks storage (no YAML config changes). #3180
  - `-experimental.blocks-storage.*` flags renamed to `-blocks-storage.*`
  - `-experimental.store-gateway.*` flags renamed to `-store-gateway.*`
  - `-experimental.querier.store-gateway-client.*` flags renamed to `-querier.store-gateway-client.*`
  - `-experimental.querier.store-gateway-addresses` flag renamed to `-querier.store-gateway-addresses`
- [CHANGE] Ingester: Removed deprecated untyped record from chunks WAL. Only if you are running `v1.0` or below, it is recommended to first upgrade to `v1.1`/`v1.2`/`v1.3` and run it for a day before upgrading to `v1.4` to avoid data loss. #3115
- [CHANGE] Distributor API endpoints are no longer served unless target is set to `distributor` or `all`. #3112
- [CHANGE] Increase the default Cassandra client replication factor to 3. #3007
- [CHANGE] Blocks storage: removed the support to transfer blocks between ingesters on shutdown. When running the Cortex blocks storage, ingesters are expected to run with a persistent disk. The following metrics have been removed: #2996
  - `cortex_ingester_sent_files`
  - `cortex_ingester_received_files`
  - `cortex_ingester_received_bytes_total`
  - `cortex_ingester_sent_bytes_total`
- [CHANGE] The buckets for the `cortex_chunk_store_index_lookups_per_query` metric have been changed to 1, 2, 4, 8, 16. #3021
- [CHANGE] Blocks storage: the `operation` label value `getrange` has changed into `get_range` for the metrics `thanos_store_bucket_cache_operation_requests_total` and `thanos_store_bucket_cache_operation_hits_total`. #3000
- [CHANGE] Experimental Delete Series: `/api/v1/admin/tsdb/delete_series` and `/api/v1/admin/tsdb/cancel_delete_request` purger APIs to return status code `204` instead of `200` for success. #2946
- [CHANGE] Histogram `cortex_memcache_request_duration_seconds` `method` label value changes from `Memcached.Get` to `Memcached.GetBatched` for batched lookups, and is not reported for non-batched lookups (label value `Memcached.GetMulti` remains, and had exactly the same value as `Get` in nonbatched lookups). The same change applies to tracing spans. #3046
- [CHANGE] TLS server validation is now enabled by default, a new parameter `tls_insecure_skip_verify` can be set to true to skip validation optionally. #3030
- [CHANGE] `cortex_ruler_config_update_failures_total` has been removed in favor of `cortex_ruler_config_last_reload_successful`. #3056
- [CHANGE] `ruler.evaluation_delay_duration` field in YAML config has been moved and renamed to `limits.ruler_evaluation_delay_duration`. #3098
- [CHANGE] Removed obsolete `results_cache.max_freshness` from YAML config (deprecated since Cortex 1.2). #3145
- [CHANGE] Removed obsolete `-promql.lookback-delta` option (deprecated since Cortex 1.2, replaced with `-querier.lookback-delta`). #3144
- [CHANGE] Cache: added support for Redis Cluster and Redis Sentinel. #2961
  - The following changes have been made in Redis configuration:
    - `-redis.master_name` added
    - `-redis.db` added
    - `-redis.max-active-conns` changed to `-redis.pool-size`
    - `-redis.max-conn-lifetime` changed to `-redis.max-connection-age`
    - `-redis.max-idle-conns` removed
    - `-redis.wait-on-pool-exhaustion` removed
- [FEATURE] Logging of the source IP passed along by a reverse proxy is now supported by setting the `-server.log-source-ips-enabled`. For non standard headers the settings `-server.log-source-ips-header` and `-server.log-source-ips-regex` can be used. #2985
- [FEATURE] Blocks storage: added shuffle sharding support to store-gateway blocks sharding. Added the following additional metrics to store-gateway: #3069
  - `cortex_bucket_stores_tenants_discovered`
  - `cortex_bucket_stores_tenants_synced`
- [FEATURE] Experimental blocksconvert: introduce an experimental tool `blocksconvert` to migrate long-term storage chunks to blocks. #3092 #3122 #3127 #3162
- [ENHANCEMENT] Add support for azure storage in China, German and US Government environments. #2988
- [ENHANCEMENT] Query-tee: added a small tolerance to floating point sample values comparison. #2994
- [ENHANCEMENT] Query-tee: add support for doing a passthrough of requests to preferred backend for unregistered routes #3018
- [ENHANCEMENT] Expose `storage.aws.dynamodb.backoff_config` configuration file field. #3026
- [ENHANCEMENT] Added `cortex_request_message_bytes` and `cortex_response_message_bytes` histograms to track received and sent gRPC message and HTTP request/response sizes. Added `cortex_inflight_requests` gauge to track number of inflight gRPC and HTTP requests. #3064
- [ENHANCEMENT] Publish ruler's ring metrics. #3074
- [ENHANCEMENT] Add config validation to the experimental Alertmanager API. Invalid configs are no longer accepted. #3053
- [ENHANCEMENT] Add "integration" as a label for `cortex_alertmanager_notifications_total` and `cortex_alertmanager_notifications_failed_total` metrics. #3056
- [ENHANCEMENT] Add `cortex_ruler_config_last_reload_successful` and `cortex_ruler_config_last_reload_successful_seconds` to check status of users rule manager. #3056
- [ENHANCEMENT] The configuration validation now fails if an empty YAML node has been set for a root YAML config property. #3080
- [ENHANCEMENT] Memcached dial() calls now have a circuit-breaker to avoid hammering a broken cache. #3051, #3189
- [ENHANCEMENT] `-ruler.evaluation-delay-duration` is now overridable as a per-tenant limit, `ruler_evaluation_delay_duration`. #3098
- [ENHANCEMENT] Add TLS support to etcd client. #3102
- [ENHANCEMENT] When a tenant accesses the Alertmanager UI or its API, if we have valid `-alertmanager.configs.fallback` we'll use that to start the manager and avoid failing the request. #3073
- [ENHANCEMENT] Add `DELETE api/v1/rules/{namespace}` to the Ruler. It allows all the rule groups of a namespace to be deleted. #3120
- [ENHANCEMENT] Experimental Delete Series: Retry processing of Delete requests during failures. #2926
- [ENHANCEMENT] Improve performance of QueryStream() in ingesters. #3177
- [ENHANCEMENT] Modules included in "All" target are now visible in output of `-modules` CLI flag. #3155
- [ENHANCEMENT] Added `/debug/fgprof` endpoint to debug running Cortex process using `fgprof`. This adds up to the existing `/debug/...` endpoints. #3131
- [ENHANCEMENT] Blocks storage: optimised `/api/v1/series` for blocks storage. (#2976)
- [BUGFIX] Ruler: when loading rules from "local" storage, check for directory after resolving symlink. #3137
- [BUGFIX] Query-frontend: Fixed rounding for incoming query timestamps, to be 100% Prometheus compatible. #2990
- [BUGFIX] Querier: Merge results from chunks and blocks ingesters when using streaming of results. #3013
- [BUGFIX] Querier: query /series from ingesters regardless of the `-querier.query-ingesters-within` setting. #3035
- [BUGFIX] Blocks storage: Ingester is less likely to hit gRPC message size limit when streaming data to queriers. #3015
- [BUGFIX] Blocks storage: fixed memberlist support for the store-gateways and compactors ring used when blocks sharding is enabled. #3058 #3095
- [BUGFIX] Fix configuration for TLS server validation, TLS skip verify was hardcoded to true for all TLS configurations and prevented validation of server certificates. #3030
- [BUGFIX] Fixes the Alertmanager panicking when no `-alertmanager.web.external-url` is provided. #3017
- [BUGFIX] Fixes the registration of the Alertmanager API metrics `cortex_alertmanager_alerts_received_total` and `cortex_alertmanager_alerts_invalid_total`. #3065
- [BUGFIX] Fixes `flag needs an argument: -config.expand-env` error. #3087
- [BUGFIX] An index optimisation actually slows things down when using caching. Moved it to the right location. #2973
- [BUGFIX] Ingester: If push request contained both valid and invalid samples, valid samples were ingested but not stored to WAL of the chunks storage. This has been fixed. #3067
- [BUGFIX] Cassandra: fixed consistency setting in the CQL session when creating the keyspace. #3105
- [BUGFIX] Ruler: Config API would return both the `record` and `alert` in `YAML` response keys even when one of them must be empty. #3120
- [BUGFIX] Index page now uses configured HTTP path prefix when creating links. #3126
- [BUGFIX] Purger: fixed deadlock when reloading of tombstones failed. #3182
- [BUGFIX] Fixed panic in flusher job, when error writing chunks to the store would cause "idle" chunks to be flushed, which triggered panic. #3140
- [BUGFIX] Index page no longer shows links that are not valid for running Cortex instance. #3133
- [BUGFIX] Configs: prevent validation of templates to fail when using template functions. #3157
- [BUGFIX] Configuring the S3 URL with an `@` but without username and password doesn't enable the AWS static credentials anymore. #3170
- [BUGFIX] Limit errors on ranged queries (`api/v1/query_range`) no longer return a status code `500` but `422` instead. #3167
Cortex 1.3.0
This Cortex release features 125 contributions from 37 different authors. It's yet another great milestone we have reached thanks to the amazing support from our community ❤️ Thanks!
Highlights:
- The blocks storage is getting closer to production readiness. In this release we've done several fixes and improvements. In particular, you should be aware of:
  - Some CLI flags and YAML config options have been renamed
  - The store-gateway service is now mandatory when running the blocks storage
  - Introduced support for a live cluster migration from chunks to blocks (and rollback)
  - Introduced support to flush blocks on-demand from ingesters
- The ruler and alertmanager got several improvements, including but not limited to:
  - The ruler now runs in the single binary when Cortex gets started with `-target=all`
  - Introduced new config options to fine-tune the ruler
  - Introduced support to load locally stored rules (eg. loaded via Kubernetes config map)
  - Multiple alertmanager URLs can now be specified in the ruler; each URL is treated as a separate alertmanager group
  - Alertmanager configuration can be persisted to object storage via API
- Other changes worth noting:
  - Added optional `snappy` compression support to internal gRPC connections
  - Starting from this release we're going to publish `.rpm` and `.deb` packages too
Please refer to the full changelog for the full list of changes and improvements.
Changelog
- [CHANGE] Replace the metric `cortex_alertmanager_configs` with `cortex_alertmanager_config_invalid` exposed by Alertmanager. #2960
- [CHANGE] Experimental Delete Series: Change target flag for purger from `data-purger` to `purger`. #2777
- [CHANGE] Experimental blocks storage: The max concurrent queries against the long-term storage, configured via `-experimental.blocks-storage.bucket-store.max-concurrent`, is now a limit shared across all tenants and not a per-tenant limit anymore. The default value has changed from `20` to `100` and the following new metrics have been added: #2797
  - `cortex_bucket_stores_gate_queries_concurrent_max`
  - `cortex_bucket_stores_gate_queries_in_flight`
  - `cortex_bucket_stores_gate_duration_seconds`
- [CHANGE] Metric `cortex_ingester_flush_reasons` has been renamed to `cortex_ingester_flushing_enqueued_series_total`, and new metric `cortex_ingester_flushing_dequeued_series_total` with `outcome` label (superset of reason) has been added. #2802 #2818 #2998
- [CHANGE] Experimental Delete Series: Metric `cortex_purger_oldest_pending_delete_request_age_seconds` would track age of delete requests since they are over their cancellation period instead of their creation time. #2806
- [CHANGE] Experimental blocks storage: the store-gateway service is required in a Cortex cluster running with the experimental blocks storage. Removed the `-experimental.tsdb.store-gateway-enabled` CLI flag and `store_gateway_enabled` YAML config option. The store-gateway is now always enabled when the storage engine is `blocks`. #2822
- [CHANGE] Experimental blocks storage: removed support for `-experimental.blocks-storage.bucket-store.max-sample-count` flag because the implementation was flawed. To limit the number of samples/chunks processed by a single query you can set `-store.query-chunk-limit`, which is now supported by the blocks storage too. #2852
- [CHANGE] Ingester: Chunks flushed via /flush stay in memory until retention period is reached. This affects `cortex_ingester_memory_chunks` metric. #2778
- [CHANGE] Querier: the error message returned when the query time range exceeds `-store.max-query-length` has changed from `invalid query, length > limit (X > Y)` to `the query time range exceeds the limit (query length: X, limit: Y)`. #2826
- [CHANGE] Add `component` label to metrics exposed by chunk, delete and index store clients. #2774
- [CHANGE] Querier: when `-querier.query-ingesters-within` is configured, the time range of the query sent to ingesters is now manipulated to ensure the query start time is not older than 'now - query-ingesters-within'. #2904
- [CHANGE] KV: The `role` label which was a label of `multi` KV store client only has been added to metrics of every KV store client. If KV store client is not `multi`, then the value of `role` label is `primary`. #2837
- [CHANGE] Added the `engine` label to the metrics exposed by the Prometheus query engine, to distinguish between `ruler` and `querier` metrics. #2854
- [CHANGE] Added ruler to the single binary when started with `-target=all` (default). #2854
- [CHANGE] Experimental blocks storage: compact head when opening TSDB. This should only affect ingester startup after it was unable to compact head in previous run. #2870
- [CHANGE] Metric `cortex_overrides_last_reload_successful` has been renamed to `cortex_runtime_config_last_reload_successful`. #2874
- [CHANGE] HipChat support has been removed from the alertmanager (because removed from the Prometheus upstream too). #2902
- [CHANGE] Add constant label `name` to metric `cortex_cache_request_duration_seconds`. #2903
- [CHANGE] Add `user` label to metric `cortex_query_frontend_queue_length`. #2939
- [CHANGE] Experimental blocks storage: cleaned up the config and renamed "TSDB" to "blocks storage". #2937
  - The storage engine setting value has been changed from `tsdb` to `blocks`; this affects `-store.engine` CLI flag and its respective YAML option.
  - The root level YAML config has changed from `tsdb` to `blocks_storage`
  - The prefix of all CLI flags has changed from `-experimental.tsdb.` to `-experimental.blocks-storage.`
  - The following settings have been grouped under `tsdb` property in the YAML config and their CLI flags changed:
    - `-experimental.tsdb.dir` changed to `-experimental.blocks-storage.tsdb.dir`
    - `-experimental.tsdb.block-ranges-period` changed to `-experimental.blocks-storage.tsdb.block-ranges-period`
    - `-experimental.tsdb.retention-period` changed to `-experimental.blocks-storage.tsdb.retention-period`
    - `-experimental.tsdb.ship-interval` changed to `-experimental.blocks-storage.tsdb.ship-interval`
    - `-experimental.tsdb.ship-concurrency` changed to `-experimental.blocks-storage.tsdb.ship-concurrency`
    - `-experimental.tsdb.max-tsdb-opening-concurrency-on-startup` changed to `-experimental.blocks-storage.tsdb.max-tsdb-opening-concurrency-on-startup`
    - `-experimental.tsdb.head-compaction-interval` changed to `-experimental.blocks-storage.tsdb.head-compaction-interval`
    - `-experimental.tsdb.head-compaction-concurrency` changed to `-experimental.blocks-storage.tsdb.head-compaction-concurrency`
    - `-experimental.tsdb.head-compaction-idle-timeout` changed to `-experimental.blocks-storage.tsdb.head-compaction-idle-timeout`
    - `-experimental.tsdb.stripe-size` changed to `-experimental.blocks-storage.tsdb.stripe-size`
    - `-experimental.tsdb.wal-compression-enabled` changed to `-experimental.blocks-storage.tsdb.wal-compression-enabled`
    - `-experimental.tsdb.flush-blocks-on-shutdown` changed to `-experimental.blocks-storage.tsdb.flush-blocks-on-shutdown`
- [CHANGE] Flags `-bigtable.grpc-use-gzip-compression`, `-ingester.client.grpc-use-gzip-compression`, `-querier.frontend-client.grpc-use-gzip-compression` are now deprecated. #2940
- [CHANGE] Limit errors reported by ingester during query-time now return HTTP status code 422. #2941
- [FEATURE] Introduced `ruler.for-outage-tolerance`, Max time to tolerate outage for restoring "for" state of alert. #2783
- [FEATURE] Introduced `ruler.for-grace-period`, Minimum duration between alert and restored "for" state. This is maintained only for alerts with configured "for" time greater than grace period. #2783
- [FEATURE] Introduced `ruler.resend-delay`, Minimum amount of time to wait before resending an alert to Alertmanager. #2783
- [FEATURE] Ruler: added `local` filesystem support to store rules (read-only). #2854
- [ENHANCEMENT] Upgraded Docker base images to `alpine:3.12`. #2862
- [ENHANCEMENT] Experimental: Querier can now optionally query secondary store. This is specified by using `-querier.second-store-engine` option, with values `chunks` or `blocks`. Standard configuration options for this store are used. Additionally, this querying can be configured to happen only for queries that need data older than `-querier.use-second-store-before-time`. Default value of zero will always query secondary store. #2747
- [ENHANCEMENT] Query-tee: increased the `cortex_querytee_request_duration_seconds` metric buckets granularity. #2799
- [ENHANCEMENT] Query-tee: fail to start if the configured `-backend.preferred` is unknown. #2799
- [ENHANCEMENT] Ruler: Added the following metrics: #2786
  - `cortex_prometheus_notifications_latency_seconds`
  - `cortex_prometheus_notifications_errors_total`
  - `cortex_prometheus_notifications_sent_total`
  - `cortex_prometheus_notifications_dropped_total`
  - `cortex_prometheus_notifications_queue_length`
  - `cortex_prometheus_notifications_queue_capacity`
  - `cortex_prometheus_notifications_alertmanagers_discovered`
- [ENHANCEMENT] The behavior of the `/ready` was changed for the query frontend to indicate when it was ready to accept queries. This is intended for use by a read path load balancer that would want to wait for the frontend to have attached queriers before including it in the backend. #2733
- [ENHANCEMENT] Experimental Delete Series: Add support for deletion of chunks for remaining stores. #2801
- [ENHANCEMENT] Add `-modules` command line flag to list possible values for `-target`. Also, log warning if given target is internal component. #2752
- [ENHANCEMENT] Added `-ingester.flush-on-shutdown-with-wal-enabled` option to enable chunks flushing even when WAL is enabled. #2780
- [ENHANCEMENT] Query-tee: Support for custom API prefix by using `-server.path-prefix` option. #2814
- [ENHANCEMENT] Query-tee: Forwar...
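The blocks storage flag renames listed under #2937 are mechanical, so they can be sketched as a small rewrite helper. The following Python snippet is only an illustration built from the entries above, not a tool shipped with Cortex: `migrate_flag` and its constants are hypothetical names, it assumes flags are passed in the single-dash `-name=value` form, and it treats `-experimental.tsdb.store-gateway-enabled` as removed (per #2822) rather than renamed.

```python
# Hypothetical helper (not part of Cortex): rewrites pre-1.3.0 blocks storage
# CLI flags to the names listed in the changelog entry for #2937.

OLD_PREFIX = "-experimental.tsdb."
NEW_PREFIX = "-experimental.blocks-storage."

# Settings that the changelog lists as grouped under the `tsdb` property.
TSDB_GROUPED = {
    "dir",
    "block-ranges-period",
    "retention-period",
    "ship-interval",
    "ship-concurrency",
    "max-tsdb-opening-concurrency-on-startup",
    "head-compaction-interval",
    "head-compaction-concurrency",
    "head-compaction-idle-timeout",
    "stripe-size",
    "wal-compression-enabled",
    "flush-blocks-on-shutdown",
}

# Flags removed in the same release (#2822); there is nothing to rename them to.
REMOVED = {"store-gateway-enabled"}


def migrate_flag(arg: str) -> str:
    """Return the 1.3.0 spelling of a single `-name=value` argument."""
    name, sep, value = arg.partition("=")
    # The storage engine value changed from `tsdb` to `blocks`.
    if name == "-store.engine" and value == "tsdb":
        return "-store.engine=blocks"
    if not name.startswith(OLD_PREFIX):
        return arg  # unrelated flag, leave untouched
    setting = name[len(OLD_PREFIX):]
    if setting in REMOVED:
        raise ValueError(f"{name} was removed and has no replacement")
    if setting in TSDB_GROUPED:
        setting = "tsdb." + setting
    return NEW_PREFIX + setting + sep + value


if __name__ == "__main__":
    for old in ["-store.engine=tsdb", "-experimental.tsdb.dir=/data", "-target=all"]:
        print(old, "->", migrate_flag(old))
```

Flags that only changed prefix (for example the `bucket-store.*` settings) fall through to the generic prefix swap, while the twelve listed settings also gain the `tsdb.` segment.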
Cortex 1.3.0-rc.2
This is the third release candidate for Cortex 1.3.0, including a bug fix:
- [BUGFIX] Querier: query /series from ingesters regardless the `-querier.query-ingesters-within` setting. #3035
Cortex 1.3.0-rc.1
This is the second release candidate for Cortex 1.3.0, including a bug fix and an improvement:
Cortex 1.3.0-rc.0
This Cortex release features 125 contributions from 37 different authors. It's yet another great milestone we have reached thanks to the amazing support from our community ❤️ Thanks!
Highlights:
- The blocks storage is getting closer to production readiness. In this release we've done several fixes and improvements. In particular, you should be aware of:
  - Some CLI flags and YAML config options have been renamed
  - The store-gateway service is now mandatory when running the blocks storage
  - Introduced support for a live cluster migration from chunks to blocks (and rollback)
  - Introduced support to flush blocks on-demand from ingesters
- The ruler and alertmanager got several improvements, including but not limited to:
  - The ruler now runs in the single binary when Cortex gets started with `-target=all`
  - Introduced new config options to fine-tune the ruler
  - Introduced support to load locally stored rules (e.g. loaded via Kubernetes config map)
  - Multiple alertmanager URLs can now be specified in the ruler; each URL is treated as a separate alertmanager group
  - Alertmanager configuration can be persisted to object storage via API
- Other changes worth noting:
  - Added optional `snappy` compression support to internal gRPC connections
  - Starting from this release we're going to publish `.rpm` and `.deb` packages too
Please refer to the full changelog for the full list of changes and improvements.
Changelog
- [CHANGE] Replace the metric `cortex_alertmanager_configs` with `cortex_alertmanager_config_invalid` exposed by Alertmanager. #2960
- [CHANGE] Experimental Delete Series: Change target flag for purger from `data-purger` to `purger`. #2777
- [CHANGE] Experimental blocks storage: The max concurrent queries against the long-term storage, configured via `-experimental.blocks-storage.bucket-store.max-concurrent`, is now a limit shared across all tenants and not a per-tenant limit anymore. The default value has changed from `20` to `100` and the following new metrics have been added: #2797
  - `cortex_bucket_stores_gate_queries_concurrent_max`
  - `cortex_bucket_stores_gate_queries_in_flight`
  - `cortex_bucket_stores_gate_duration_seconds`
- [CHANGE] Metric `cortex_ingester_flush_reasons` has been renamed to `cortex_ingester_flushing_enqueued_series_total`, and new metric `cortex_ingester_flushing_dequeued_series_total` with `outcome` label (superset of reason) has been added. #2802, #2818
- [CHANGE] Experimental Delete Series: Metric `cortex_purger_oldest_pending_delete_request_age_seconds` would track age of delete requests since they are over their cancellation period instead of their creation time. #2806
- [CHANGE] Experimental blocks storage: the store-gateway service is required in a Cortex cluster running with the experimental blocks storage. Removed the `-experimental.tsdb.store-gateway-enabled` CLI flag and `store_gateway_enabled` YAML config option. The store-gateway is now always enabled when the storage engine is `blocks`. #2822
- [CHANGE] Experimental blocks storage: removed support for `-experimental.blocks-storage.bucket-store.max-sample-count` flag because the implementation was flawed. To limit the number of samples/chunks processed by a single query you can set `-store.query-chunk-limit`, which is now supported by the blocks storage too. #2852
- [CHANGE] Ingester: Chunks flushed via /flush stay in memory until retention period is reached. This affects `cortex_ingester_memory_chunks` metric. #2778
- [CHANGE] Querier: the error message returned when the query time range exceeds `-store.max-query-length` has changed from `invalid query, length > limit (X > Y)` to `the query time range exceeds the limit (query length: X, limit: Y)`. #2826
- [CHANGE] Add `component` label to metrics exposed by chunk, delete and index store clients. #2774
- [CHANGE] Querier: when `-querier.query-ingesters-within` is configured, the time range of the query sent to ingesters is now manipulated to ensure the query start time is not older than 'now - query-ingesters-within'. #2904
- [CHANGE] KV: The `role` label which was a label of `multi` KV store client only has been added to metrics of every KV store client. If KV store client is not `multi`, then the value of `role` label is `primary`. #2837
- [CHANGE] Added the `engine` label to the metrics exposed by the Prometheus query engine, to distinguish between `ruler` and `querier` metrics. #2854
- [CHANGE] Added ruler to the single binary when started with `-target=all` (default). #2854
- [CHANGE] Experimental blocks storage: compact head when opening TSDB. This should only affect ingester startup after it was unable to compact head in previous run. #2870
- [CHANGE] Metric `cortex_overrides_last_reload_successful` has been renamed to `cortex_runtime_config_last_reload_successful`. #2874
- [CHANGE] HipChat support has been removed from the alertmanager (because removed from the Prometheus upstream too). #2902
- [CHANGE] Add constant label `name` to metric `cortex_cache_request_duration_seconds`. #2903
- [CHANGE] Add `user` label to metric `cortex_query_frontend_queue_length`. #2939
- [CHANGE] Experimental blocks storage: cleaned up the config and renamed "TSDB" to "blocks storage". #2937
  - The storage engine setting value has been changed from `tsdb` to `blocks`; this affects `-store.engine` CLI flag and its respective YAML option.
  - The root level YAML config has changed from `tsdb` to `blocks_storage`
  - The prefix of all CLI flags has changed from `-experimental.tsdb.` to `-experimental.blocks-storage.`
  - The following settings have been grouped under `tsdb` property in the YAML config and their CLI flags changed:
    - `-experimental.tsdb.dir` changed to `-experimental.blocks-storage.tsdb.dir`
    - `-experimental.tsdb.block-ranges-period` changed to `-experimental.blocks-storage.tsdb.block-ranges-period`
    - `-experimental.tsdb.retention-period` changed to `-experimental.blocks-storage.tsdb.retention-period`
    - `-experimental.tsdb.ship-interval` changed to `-experimental.blocks-storage.tsdb.ship-interval`
    - `-experimental.tsdb.ship-concurrency` changed to `-experimental.blocks-storage.tsdb.ship-concurrency`
    - `-experimental.tsdb.max-tsdb-opening-concurrency-on-startup` changed to `-experimental.blocks-storage.tsdb.max-tsdb-opening-concurrency-on-startup`
    - `-experimental.tsdb.head-compaction-interval` changed to `-experimental.blocks-storage.tsdb.head-compaction-interval`
    - `-experimental.tsdb.head-compaction-concurrency` changed to `-experimental.blocks-storage.tsdb.head-compaction-concurrency`
    - `-experimental.tsdb.head-compaction-idle-timeout` changed to `-experimental.blocks-storage.tsdb.head-compaction-idle-timeout`
    - `-experimental.tsdb.stripe-size` changed to `-experimental.blocks-storage.tsdb.stripe-size`
    - `-experimental.tsdb.wal-compression-enabled` changed to `-experimental.blocks-storage.tsdb.wal-compression-enabled`
    - `-experimental.tsdb.flush-blocks-on-shutdown` changed to `-experimental.blocks-storage.tsdb.flush-blocks-on-shutdown`
- [CHANGE] Flags `-bigtable.grpc-use-gzip-compression`, `-ingester.client.grpc-use-gzip-compression`, `-querier.frontend-client.grpc-use-gzip-compression` are now deprecated. #2940
- [CHANGE] Limit errors reported by ingester during query-time now return HTTP status code 422. #2941
- [FEATURE] Introduced `ruler.for-outage-tolerance`, Max time to tolerate outage for restoring "for" state of alert. #2783
- [FEATURE] Introduced `ruler.for-grace-period`, Minimum duration between alert and restored "for" state. This is maintained only for alerts with configured "for" time greater than grace period. #2783
- [FEATURE] Introduced `ruler.resend-delay`, Minimum amount of time to wait before resending an alert to Alertmanager. #2783
- [FEATURE] Ruler: added `local` filesystem support to store rules (read-only). #2854
- [ENHANCEMENT] Upgraded Docker base images to `alpine:3.12`. #2862
- [ENHANCEMENT] Experimental: Querier can now optionally query secondary store. This is specified by using `-querier.second-store-engine` option, with values `chunks` or `blocks`. Standard configuration options for this store are used. Additionally, this querying can be configured to happen only for queries that need data older than `-querier.use-second-store-before-time`. Default value of zero will always query secondary store. #2747
- [ENHANCEMENT] Query-tee: increased the `cortex_querytee_request_duration_seconds` metric buckets granularity. #2799
- [ENHANCEMENT] Query-tee: fail to start if the configured `-backend.preferred` is unknown. #2799
- [ENHANCEMENT] Ruler: Added the following metrics: #2786
  - `cortex_prometheus_notifications_latency_seconds`
  - `cortex_prometheus_notifications_errors_total`
  - `cortex_prometheus_notifications_sent_total`
  - `cortex_prometheus_notifications_dropped_total`
  - `cortex_prometheus_notifications_queue_length`
  - `cortex_prometheus_notifications_queue_capacity`
  - `cortex_prometheus_notifications_alertmanagers_discovered`
- [ENHANCEMENT] The behavior of the `/ready` was changed for the query frontend to indicate when it was ready to accept queries. This is intended for use by a read path load balancer that would want to wait for the frontend to have attached queriers before including it in the backend. #2733
- [ENHANCEMENT] Experimental Delete Series: Add support for deletion of chunks for remaining stores. #2801
- [ENHANCEMENT] Add `-modules` command line flag to list possible values for `-target`. Also, log warning if given target is internal component. #2752
- [ENHANCEMENT] Added `-ingester.flush-on-shutdown-with-wal-enabled` option to enable chunks flushing even when WAL is enabled. #2780
- [ENHANCEMENT] Query-tee: Support for custom API prefix by using `-server.path-prefix` option. #2814
- [ENHANCEMENT] Query-tee: Forward `X-...
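The `-querier.query-ingesters-within` change (#2904) amounts to clamping the start of the time range sent to ingesters to 'now - query-ingesters-within'. The Python sketch below illustrates that arithmetic only and is not the Cortex implementation: the function name is hypothetical, and two behaviours are assumptions not stated in the entry above, namely that a zero duration disables the clamp and that a range entirely older than the window is not sent to ingesters at all.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional, Tuple


def clamp_ingester_range(
    start: datetime,
    end: datetime,
    now: datetime,
    query_ingesters_within: timedelta,
) -> Optional[Tuple[datetime, datetime]]:
    """Return the (start, end) range to send to ingesters, or None to skip them."""
    # Assumption: zero (or negative) means the setting is not configured.
    if query_ingesters_within <= timedelta(0):
        return start, end
    oldest_allowed = now - query_ingesters_within
    # Assumption: a query that ends before the window does not hit ingesters.
    if end < oldest_allowed:
        return None
    # The change described in #2904: the start is never older than the window.
    return max(start, oldest_allowed), end


if __name__ == "__main__":
    now = datetime(2020, 8, 1, 12, 0, tzinfo=timezone.utc)
    within = timedelta(hours=3)
    print(clamp_ingester_range(now - timedelta(hours=6), now, now, within))
```

With a 3 hour window, a query covering the last 6 hours is narrowed to the last 3 hours for ingesters, while the long-term storage is expected to serve the older part.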
Cortex 1.2.0
This release has a number of bug-fixes and enhancements, particularly:
- Memberlist KV client is no longer considered experimental. #2725
- 3rd-party index and chunk stores using gRPC client/server plugin mechanism (experimental) #2220
- Using an invalid flag no longer causes printing of all available flags. #2691 (my favourite change!)
Many thanks to all contributors.
Detailed list of changes:
- [CHANGE] Metric `cortex_kv_request_duration_seconds` now includes `name` label to denote which client is being used as well as the `backend` label to denote the KV backend implementation in use. #2648
- [CHANGE] Experimental Ruler: Rule groups persisted to object storage using the experimental API have an updated object key encoding to better handle special characters. Rule groups previously-stored using object storage must be renamed to the new format. #2646
- [CHANGE] Query Frontend now uses Round Robin to choose a tenant queue to service next. #2553
- [CHANGE] `-promql.lookback-delta` is now deprecated and has been replaced by `-querier.lookback-delta` along with `lookback_delta` entry under `querier` in the config file. `-promql.lookback-delta` will be removed in v1.4.0. #2604
- [CHANGE] Experimental TSDB: removed `-experimental.tsdb.bucket-store.binary-index-header-enabled` flag. Now the binary index-header is always enabled.
- [CHANGE] Experimental TSDB: Renamed index-cache metrics to use original metric names from Thanos, as Cortex is not aggregating them in any way: #2627
  - `cortex_<service>_blocks_index_cache_items_evicted_total` => `thanos_store_index_cache_items_evicted_total{name="index-cache"}`
  - `cortex_<service>_blocks_index_cache_items_added_total` => `thanos_store_index_cache_items_added_total{name="index-cache"}`
  - `cortex_<service>_blocks_index_cache_requests_total` => `thanos_store_index_cache_requests_total{name="index-cache"}`
  - `cortex_<service>_blocks_index_cache_items_overflowed_total` => `thanos_store_index_cache_items_overflowed_total{name="index-cache"}`
  - `cortex_<service>_blocks_index_cache_hits_total` => `thanos_store_index_cache_hits_total{name="index-cache"}`
  - `cortex_<service>_blocks_index_cache_items` => `thanos_store_index_cache_items{name="index-cache"}`
  - `cortex_<service>_blocks_index_cache_items_size_bytes` => `thanos_store_index_cache_items_size_bytes{name="index-cache"}`
  - `cortex_<service>_blocks_index_cache_total_size_bytes` => `thanos_store_index_cache_total_size_bytes{name="index-cache"}`
  - `cortex_<service>_blocks_index_cache_memcached_operations_total` => `thanos_memcached_operations_total{name="index-cache"}`
  - `cortex_<service>_blocks_index_cache_memcached_operation_failures_total` => `thanos_memcached_operation_failures_total{name="index-cache"}`
  - `cortex_<service>_blocks_index_cache_memcached_operation_duration_seconds` => `thanos_memcached_operation_duration_seconds{name="index-cache"}`
  - `cortex_<service>_blocks_index_cache_memcached_operation_skipped_total` => `thanos_memcached_operation_skipped_total{name="index-cache"}`
- [CHANGE] Experimental TSDB: Renamed metrics in bucket stores: #2627
  - `cortex_<service>_blocks_meta_syncs_total` => `cortex_blocks_meta_syncs_total{component="<service>"}`
  - `cortex_<service>_blocks_meta_sync_failures_total` => `cortex_blocks_meta_sync_failures_total{component="<service>"}`
  - `cortex_<service>_blocks_meta_sync_duration_seconds` => `cortex_blocks_meta_sync_duration_seconds{component="<service>"}`
  - `cortex_<service>_blocks_meta_sync_consistency_delay_seconds` => `cortex_blocks_meta_sync_consistency_delay_seconds{component="<service>"}`
  - `cortex_<service>_blocks_meta_synced` => `cortex_blocks_meta_synced{component="<service>"}`
  - `cortex_<service>_bucket_store_block_loads_total` => `cortex_bucket_store_block_loads_total{component="<service>"}`
  - `cortex_<service>_bucket_store_block_load_failures_total` => `cortex_bucket_store_block_load_failures_total{component="<service>"}`
  - `cortex_<service>_bucket_store_block_drops_total` => `cortex_bucket_store_block_drops_total{component="<service>"}`
  - `cortex_<service>_bucket_store_block_drop_failures_total` => `cortex_bucket_store_block_drop_failures_total{component="<service>"}`
  - `cortex_<service>_bucket_store_blocks_loaded` => `cortex_bucket_store_blocks_loaded{component="<service>"}`
  - `cortex_<service>_bucket_store_series_data_touched` => `cortex_bucket_store_series_data_touched{component="<service>"}`
  - `cortex_<service>_bucket_store_series_data_fetched` => `cortex_bucket_store_series_data_fetched{component="<service>"}`
  - `cortex_<service>_bucket_store_series_data_size_touched_bytes` => `cortex_bucket_store_series_data_size_touched_bytes{component="<service>"}`
  - `cortex_<service>_bucket_store_series_data_size_fetched_bytes` => `cortex_bucket_store_series_data_size_fetched_bytes{component="<service>"}`
  - `cortex_<service>_bucket_store_series_blocks_queried` => `cortex_bucket_store_series_blocks_queried{component="<service>"}`
  - `cortex_<service>_bucket_store_series_get_all_duration_seconds` => `cortex_bucket_store_series_get_all_duration_seconds{component="<service>"}`
  - `cortex_<service>_bucket_store_series_merge_duration_seconds` => `cortex_bucket_store_series_merge_duration_seconds{component="<service>"}`
  - `cortex_<service>_bucket_store_series_refetches_total` => `cortex_bucket_store_series_refetches_total{component="<service>"}`
  - `cortex_<service>_bucket_store_series_result_series` => `cortex_bucket_store_series_result_series{component="<service>"}`
  - `cortex_<service>_bucket_store_cached_postings_compressions_total` => `cortex_bucket_store_cached_postings_compressions_total{component="<service>"}`
  - `cortex_<service>_bucket_store_cached_postings_compression_errors_total` => `cortex_bucket_store_cached_postings_compression_errors_total{component="<service>"}`
  - `cortex_<service>_bucket_store_cached_postings_compression_time_seconds` => `cortex_bucket_store_cached_postings_compression_time_seconds{component="<service>"}`
  - `cortex_<service>_bucket_store_cached_postings_original_size_bytes_total` => `cortex_bucket_store_cached_postings_original_size_bytes_total{component="<service>"}`
  - `cortex_<service>_bucket_store_cached_postings_compressed_size_bytes_total` => `cortex_bucket_store_cached_postings_compressed_size_bytes_total{component="<service>"}`
  - `cortex_<service>_blocks_sync_seconds` => `cortex_bucket_stores_blocks_sync_seconds{component="<service>"}`
  - `cortex_<service>_blocks_last_successful_sync_timestamp_seconds` => `cortex_bucket_stores_blocks_last_successful_sync_timestamp_seconds{component="<service>"}`
- [CHANGE] Available command-line flags are printed to stdout, and only when requested via `-help`. Using invalid flag no longer causes printing of all available flags. #2691
- [CHANGE] Experimental Memberlist ring: randomize gossip node names to avoid conflicts when running multiple clients on the same host, or reusing host names (eg. pods in statefulset). Node name randomization can be disabled by using `-memberlist.randomize-node-name=false`. #2715
- [CHANGE] Memberlist KV client is no longer considered experimental. #2725
- [CHANGE] Experimental Delete Series: Make delete request cancellation duration configurable. #2760
- [CHANGE] Removed `-store.fullsize-chunks` option which was undocumented and unused (it broke ingester hand-overs). #2656
- [CHANGE] Query with no metric name that has previously resulted in HTTP status code 500 now returns status code 422 instead. #2571
- [FEATURE] TLS config options added for GRPC clients in Querier (Query-frontend client & Ingester client), Ruler, Store Gateway, as well as HTTP client in Config store client. #2502
- [FEATURE] The flag `-frontend.max-cache-freshness` is now supported within the limits overrides, to specify per-tenant max cache freshness values. The corresponding YAML config parameter has been changed from `results_cache.max_freshness` to `limits_config.max_cache_freshness`. The legacy YAML config parameter (`results_cache.max_freshness`) will continue to be supported till Cortex release `v1.4.0`. #2609
- [FEATURE] Experimental gRPC Store: Added support to 3rd parties index and chunk stores using gRPC client/server plugin mechanism. #2220
- [FEATURE] Add `-cassandra.table-options` flag to customize table options of Cassandra when creating the index or chunk table. #2575
- [ENHANCEMENT] Propagate GOPROXY value when building `build-image`. This is to help the builders building the code in a Network where default Go proxy is not accessible (e.g. when behind some corporate VPN). #2741
- [ENHANCEMENT] Querier: Added metric `cortex_querier_request_duration_seconds` for all requests to the querier. #2708
- [ENHANCEMENT] Cortex is now built with Go 1.14. #2480 #2749 #2753
- [ENHANCEMENT] Experimental TSDB: added the following metrics to the ingester: #2580 #2583 #2589 #2654
  - `cortex_ingester_tsdb_appender_add_duration_seconds`
  - `cortex_ingester_tsdb_appender_commit_duration_seconds`
  - `cortex_ingester_tsdb_refcache_purge_duration_seconds`
  - `cortex_ingester_tsdb_compactions_total`
  - `cortex_ingester_tsdb_compaction_duration_seconds`
  - `cortex_ingester_tsdb_wal_fsync_duration_seconds`
  - `cortex_ingester_tsdb_wal_page_flushes_total`
  - `cortex_ingester_tsdb_wal_completed_pages_total`
  - `cortex_ingester_tsdb_wal_truncations_failed_total`
  - `cortex_ingester_tsdb_wal_truncations_total`
  - `cortex_ingester_tsdb_wal_writes_failed_total`
  - `cortex_ingester_tsdb_checkpoint_deletions_failed_total`
  - `cortex_ingester_tsdb_checkpoint_deletions_total`
  - `cortex_ingester_tsdb_checkpoint_creations_failed_total`
  - `cortex_ingester_tsdb_checkpoint_creations_total`
  - `cortex_ingester_tsdb_wal_truncate_duration_seconds`
  - `cortex_ingester_tsdb_head_active_appenders`
  - `cortex_ingester_tsdb_head_series_not_found_total`
- `co...
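The two metric rename lists under #2627 follow a regular pattern, which helps when updating dashboards and alerts. The Python sketch below derives the new selector from an old metric name using only the patterns visible in those lists; it is an illustration, not something provided by Cortex. The function name `rename_metric` is hypothetical, and `querier` in the usage example is just a sample value for the `<service>` placeholder.

```python
def rename_metric(old_name: str, service: str) -> str:
    """Map a pre-1.2.0 blocks metric name to its 1.2.0 selector (see #2627)."""
    prefix = f"cortex_{service}_"
    if not old_name.startswith(prefix):
        raise ValueError(f"{old_name!r} does not start with {prefix!r}")
    rest = old_name[len(prefix):]

    memcached = "blocks_index_cache_memcached_"
    index_cache = "blocks_index_cache_"
    # Index-cache metrics now use the original Thanos names with a fixed label.
    if rest.startswith(memcached):
        suffix = rest[len(memcached):]
        return 'thanos_memcached_' + suffix + '{name="index-cache"}'
    if rest.startswith(index_cache):
        suffix = rest[len(index_cache):]
        return 'thanos_store_index_cache_' + suffix + '{name="index-cache"}'

    component = '{component="' + service + '"}'
    # Two sync metrics gained a `bucket_stores_` segment.
    if rest in ("blocks_sync_seconds", "blocks_last_successful_sync_timestamp_seconds"):
        return "cortex_bucket_stores_" + rest + component
    # The remaining metrics only moved the service into the `component` label.
    if rest.startswith(("blocks_meta_", "bucket_store_")):
        return "cortex_" + rest + component
    raise ValueError(f"{old_name!r} is not one of the renamed metrics")


if __name__ == "__main__":
    print(rename_metric("cortex_querier_blocks_meta_synced", "querier"))
    print(rename_metric("cortex_querier_blocks_index_cache_hits_total", "querier"))
```

The memcached prefix is checked before the generic index-cache prefix because the former is a longer match of the latter.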
Cortex 1.2.0-rc.1
RC1 has one bugfix over RC0: #2796
This release has a number of bug-fixes and enhancements, particularly:
- Memberlist KV client is no longer considered experimental. #2725
- 3rd-party index and chunk stores using gRPC client/server plugin mechanism (experimental) #2220
- Using an invalid flag no longer causes printing of all available flags. #2691 (my favourite change!)
Many thanks to all contributors.
Detailed list of changes:
- [CHANGE] Metric `cortex_kv_request_duration_seconds` now includes `name` label to denote which client is being used as well as the `backend` label to denote the KV backend implementation in use. #2648
- [CHANGE] Experimental Ruler: Rule groups persisted to object storage using the experimental API have an updated object key encoding to better handle special characters. Rule groups previously-stored using object storage must be renamed to the new format. #2646
- [CHANGE] Query Frontend now uses Round Robin to choose a tenant queue to service next. #2553
- [CHANGE] `-promql.lookback-delta` is now deprecated and has been replaced by `-querier.lookback-delta` along with `lookback_delta` entry under `querier` in the config file. `-promql.lookback-delta` will be removed in v1.4.0. #2604
- [CHANGE] Experimental TSDB: removed `-experimental.tsdb.bucket-store.binary-index-header-enabled` flag. Now the binary index-header is always enabled.
- [CHANGE] Experimental TSDB: Renamed index-cache metrics to use original metric names from Thanos, as Cortex is not aggregating them in any way: #2627
  - `cortex_<service>_blocks_index_cache_items_evicted_total` => `thanos_store_index_cache_items_evicted_total{name="index-cache"}`
  - `cortex_<service>_blocks_index_cache_items_added_total` => `thanos_store_index_cache_items_added_total{name="index-cache"}`
  - `cortex_<service>_blocks_index_cache_requests_total` => `thanos_store_index_cache_requests_total{name="index-cache"}`
  - `cortex_<service>_blocks_index_cache_items_overflowed_total` => `thanos_store_index_cache_items_overflowed_total{name="index-cache"}`
  - `cortex_<service>_blocks_index_cache_hits_total` => `thanos_store_index_cache_hits_total{name="index-cache"}`
  - `cortex_<service>_blocks_index_cache_items` => `thanos_store_index_cache_items{name="index-cache"}`
  - `cortex_<service>_blocks_index_cache_items_size_bytes` => `thanos_store_index_cache_items_size_bytes{name="index-cache"}`
  - `cortex_<service>_blocks_index_cache_total_size_bytes` => `thanos_store_index_cache_total_size_bytes{name="index-cache"}`
  - `cortex_<service>_blocks_index_cache_memcached_operations_total` => `thanos_memcached_operations_total{name="index-cache"}`
  - `cortex_<service>_blocks_index_cache_memcached_operation_failures_total` => `thanos_memcached_operation_failures_total{name="index-cache"}`
  - `cortex_<service>_blocks_index_cache_memcached_operation_duration_seconds` => `thanos_memcached_operation_duration_seconds{name="index-cache"}`
  - `cortex_<service>_blocks_index_cache_memcached_operation_skipped_total` => `thanos_memcached_operation_skipped_total{name="index-cache"}`
- [CHANGE] Experimental TSDB: Renamed metrics in bucket stores: #2627
cortex_<service>_blocks_meta_syncs_total
=>cortex_blocks_meta_syncs_total{component="<service>"}
cortex_<service>_blocks_meta_sync_failures_total
=>cortex_blocks_meta_sync_failures_total{component="<service>"}
cortex_<service>_blocks_meta_sync_duration_seconds
=>cortex_blocks_meta_sync_duration_seconds{component="<service>"}
cortex_<service>_blocks_meta_sync_consistency_delay_seconds
=>cortex_blocks_meta_sync_consistency_delay_seconds{component="<service>"}
cortex_<service>_blocks_meta_synced
=>cortex_blocks_meta_synced{component="<service>"}
cortex_<service>_bucket_store_block_loads_total
=>cortex_bucket_store_block_loads_total{component="<service>"}
cortex_<service>_bucket_store_block_load_failures_total
=>cortex_bucket_store_block_load_failures_total{component="<service>"}
cortex_<service>_bucket_store_block_drops_total
=>cortex_bucket_store_block_drops_total{component="<service>"}
cortex_<service>_bucket_store_block_drop_failures_total
=>cortex_bucket_store_block_drop_failures_total{component="<service>"}
cortex_<service>_bucket_store_blocks_loaded
=>cortex_bucket_store_blocks_loaded{component="<service>"}
cortex_<service>_bucket_store_series_data_touched
=>cortex_bucket_store_series_data_touched{component="<service>"}
cortex_<service>_bucket_store_series_data_fetched
=>cortex_bucket_store_series_data_fetched{component="<service>"}
cortex_<service>_bucket_store_series_data_size_touched_bytes
=>cortex_bucket_store_series_data_size_touched_bytes{component="<service>"}
cortex_<service>_bucket_store_series_data_size_fetched_bytes
=>cortex_bucket_store_series_data_size_fetched_bytes{component="<service>"}
cortex_<service>_bucket_store_series_blocks_queried
=>cortex_bucket_store_series_blocks_queried{component="<service>"}
cortex_<service>_bucket_store_series_get_all_duration_seconds
=>cortex_bucket_store_series_get_all_duration_seconds{component="<service>"}
cortex_<service>_bucket_store_series_merge_duration_seconds
=>cortex_bucket_store_series_merge_duration_seconds{component="<service>"}
cortex_<service>_bucket_store_series_refetches_total
=>cortex_bucket_store_series_refetches_total{component="<service>"}
cortex_<service>_bucket_store_series_result_series
=>cortex_bucket_store_series_result_series{component="<service>"}
cortex_<service>_bucket_store_cached_postings_compressions_total
=>cortex_bucket_store_cached_postings_compressions_total{component="<service>"}
cortex_<service>_bucket_store_cached_postings_compression_errors_total
=>cortex_bucket_store_cached_postings_compression_errors_total{component="<service>"}
cortex_<service>_bucket_store_cached_postings_compression_time_seconds
=>cortex_bucket_store_cached_postings_compression_time_seconds{component="<service>"}
cortex_<service>_bucket_store_cached_postings_original_size_bytes_total
=>cortex_bucket_store_cached_postings_original_size_bytes_total{component="<service>"}
cortex_<service>_bucket_store_cached_postings_compressed_size_bytes_total
=>cortex_bucket_store_cached_postings_compressed_size_bytes_total{component="<service>"}
cortex_<service>_blocks_sync_seconds
=>cortex_bucket_stores_blocks_sync_seconds{component="<service>"}
cortex_<service>_blocks_last_successful_sync_timestamp_seconds
=>cortex_bucket_stores_blocks_last_successful_sync_timestamp_seconds{component="<service>"}
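
When updating dashboards or alert rules, the renames listed above follow two patterns that can be applied mechanically. The following is a minimal sketch, not part of Cortex: `rename_metric` is a hypothetical helper, and `querier` is only an assumed example value for `<service>`. It covers only the `_bucket_store_` and `_blocks_*sync*` patterns shown in the list.

```python
import re
from typing import Optional

# Old-style names embed the service in the metric name. The patterns below are
# inferred from the rename list above; the helper itself is hypothetical.
_BUCKET_STORE = re.compile(r"^cortex_(?P<service>[a-z_]+?)_bucket_store_(?P<rest>.+)$")
_BLOCKS_SYNC = re.compile(
    r"^cortex_(?P<service>[a-z_]+?)_blocks_"
    r"(?P<rest>(?:last_successful_sync_timestamp|sync)_seconds)$"
)


def rename_metric(old: str) -> Optional[str]:
    """Return the new selector for an old metric name, or None if no rule applies."""
    # Names that already use the new prefixes are left alone.
    if old.startswith("cortex_bucket_store"):
        return None
    m = _BUCKET_STORE.match(old)
    if m:
        return f'cortex_bucket_store_{m["rest"]}{{component="{m["service"]}"}}'
    m = _BLOCKS_SYNC.match(old)
    if m:
        return f'cortex_bucket_stores_blocks_{m["rest"]}{{component="{m["service"]}"}}'
    return None


if __name__ == "__main__":
    # "querier" is an assumed service name used only for illustration.
    for name in (
        "cortex_querier_bucket_store_blocks_loaded",
        "cortex_querier_blocks_sync_seconds",
    ):
        print(name, "=>", rename_metric(name))
```

The helper returns `None` for anything it does not recognize, so unrelated metrics are not rewritten by accident.
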
- [CHANGE] Available command-line flags are printed to stdout, and only when requested via `-help`. Using an invalid flag no longer causes all available flags to be printed. #2691
- [CHANGE] Experimental Memberlist ring: randomize gossip node names to avoid conflicts when running multiple clients on the same host, or reusing host names (e.g. pods in a StatefulSet). Node name randomization can be disabled by using `-memberlist.randomize-node-name=false`. #2715
- [CHANGE] Memberlist KV client is no longer considered experimental. #2725
- [CHANGE] Experimental Delete Series: Make delete request cancellation duration configurable. #2760
- [CHANGE] Removed the `-store.fullsize-chunks` option, which was undocumented and unused (it broke ingester hand-overs). #2656
- [CHANGE] A query with no metric name, which previously resulted in HTTP status code 500, now returns status code 422 instead. #2571
- [FEATURE] TLS config options added for gRPC clients in Querier (Query-frontend client & Ingester client), Ruler, Store Gateway, as well as HTTP client in Config store client. #2502
- [FEATURE] The flag `-frontend.max-cache-freshness` is now supported within the limits overrides, to specify per-tenant max cache freshness values. The corresponding YAML config parameter has been changed from `results_cache.max_freshness` to `limits_config.max_cache_freshness`. The legacy YAML config parameter (`results_cache.max_freshness`) will continue to be supported until Cortex release `v1.4.0`. #2609
- [FEATURE] Experimental gRPC Store: Added support for third-party index and chunk stores using a gRPC client/server plugin mechanism. #2220
- [FEATURE] Add `-cassandra.table-options` flag to customize table options of Cassandra when creating the index or chunk table. #2575
- [ENHANCEMENT] Propagate the GOPROXY value when building `build-image`. This helps builders building the code on a network where the default Go proxy is not accessible (e.g. when behind a corporate VPN). #2741
- [ENHANCEMENT] Querier: Added metric `cortex_querier_request_duration_seconds` for all requests to the querier. #2708
- [ENHANCEMENT] Cortex is now built with Go 1.14. #2480 #2749 #2753
- [ENHANCEMENT] Experimental TSDB: added the following metrics to the ingester: #2580 #2583 #2589 #2654
cortex_ingester_tsdb_appender_add_duration_seconds
cortex_ingester_tsdb_appender_commit_duration_seconds
cortex_ingester_tsdb_refcache_purge_duration_seconds
cortex_ingester_tsdb_compactions_total
cortex_ingester_tsdb_compaction_duration_seconds
cortex_ingester_tsdb_wal_fsync_duration_seconds
cortex_ingester_tsdb_wal_page_flushes_total
cortex_ingester_tsdb_wal_completed_pages_total
cortex_ingester_tsdb_wal_truncations_failed_total
cortex_ingester_tsdb_wal_truncations_total
cortex_ingester_tsdb_wal_writes_failed_total
cortex_ingester_tsdb_checkpoint_deletions_failed_total
cortex_ingester_tsdb_checkpoint_deletions_total
cortex_ingester_tsdb_checkpoint_creations_failed_total
cortex_ingester_tsdb_checkpoint_creations_total
cortex_ingester_tsdb_wal_truncate_duration_seconds
cortex_ingester_tsdb_head_active_appenders
- `cortex_ingester_tsd...