Group | Description |
---|---|
General Settings | Common settings for the Operator. |
Nodes | Common configuration for Cassandra Nodes. |
Liveness checks | Parameters to configure the liveness of a Cassandra Node. When a Node fails liveness checks, it is automatically restarted by Kubernetes. |
Readiness checks | Parameters to configure the readiness of a Cassandra Node. When a Node fails readiness checks, it is automatically removed from the service by Kubernetes. |
Backup and Restore | Configuration related to backup and restore of the Cassandra Cluster. |
Restore | Options only required if a backup should be restored on installation. |
External Cluster Access | Allow access to the Cassandra Cluster from outside Kubernetes. |
Metrics Export | Metrics can be exported with the Prometheus Metrics Exporter. |
Recovery Controller | The Recovery Controller allows the Cluster to autoheal when a Kubernetes node fails. |
Repair | Options to repair a node in the cluster. |
Advanced Configuration | Advanced configuration that is only required for very advanced use cases. |
Advanced Nodes | Advanced configuration options for Cassandra nodes. These settings are not commonly modified. See https://cassandra.apache.org/doc/latest/configuration/cassandra_config_file.html and other Cassandra documentation for details on parameters. |
Security | Security related settings. |
Caches | Cache related settings. |
Racks and Datacenter Awareness | Options related to Rack Awareness and Multi-Datacenter options. |
Optimization Settings | Settings to fine-tune the Cassandra cluster and optimize the performance. |
Safety Thresholds | Warning and Failure Thresholds. |
Network | Network related settings. |
Timeouts | Timeouts. |
Java Virtual Machine Settings | Settings related to the Java JVM and the Garbage Collector. |
Directories | Directory configurations. These should usually not be changed, as changing them probably requires changes to the Docker image in use. |
Common settings for the Operator.
Name | Description | Default |
---|---|---|
NODE_COUNT | The number of Cassandra nodes to create for the cluster. Should be an odd number of nodes and at least 3 to create a useful quorum. | 3 |
NODE_CPU_MC | CPU request for the Cassandra node containers. | 1000 |
NODE_CPU_LIMIT_MC | CPU limit for the Cassandra node containers. | 1000 |
NODE_MEM_MIB | Memory request for the Cassandra node containers. | 4096 |
NODE_MEM_LIMIT_MIB | Memory limit for the Cassandra node containers. | 4096 |
NODE_DISK_SIZE_GIB | Disk size (in GiB) for the Cassandra node containers. | 20 |
NODE_STORAGE_CLASS | The storage class to be used in volumeClaimTemplates. By default, it is not required and the default storage class is used. | |
NODE_DOCKER_IMAGE | Cassandra node Docker image. | mesosphere/cassandra:3.11.7-1.0.3 |
NODE_DOCKER_IMAGE_PULL_POLICY | Cassandra node Docker image pull policy. | Always |
POD_MANAGEMENT_POLICY | The pod management policy for the Cassandra nodes. Parallel startup may decrease the startup time of big clusters, but can lead to failing pods at the beginning when two nodes try to join at the very same time. | OrderedReady |
OVERRIDE_CLUSTER_NAME | Override the name of the Cassandra cluster set by the operator. This shouldn't be set explicitly unless you know what you're doing. |
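
The parameters in these tables are plain name/value pairs. As a rough sketch, assuming the operator is installed through the KUDO CLI with repeated `-p NAME=VALUE` flags (this document does not show the install command, so the command shape, the operator name `cassandra`, and the instance name `cassandra-prod` are assumptions), overrides could be assembled and sanity-checked like this:

```python
from typing import Dict, List


def node_count_warnings(node_count: int) -> List[str]:
    """Check NODE_COUNT against the guidance in the table above."""
    warnings = []
    if node_count < 3:
        warnings.append("NODE_COUNT should be at least 3")
    if node_count % 2 == 0:
        warnings.append("NODE_COUNT should be odd")
    return warnings


def build_install_args(params: Dict[str, object], instance: str) -> List[str]:
    """Assemble an argument list for an assumed `kubectl kudo install` call."""
    args = ["kubectl", "kudo", "install", "cassandra", f"--instance={instance}"]
    for name in sorted(params):  # sorted so the command is stable and easy to review
        args += ["-p", f"{name}={params[name]}"]
    return args


# Hypothetical overrides: a 5-node cluster with larger disks and more memory.
overrides = {
    "NODE_COUNT": 5,
    "NODE_DISK_SIZE_GIB": 50,
    "NODE_MEM_MIB": 8192,
    "NODE_MEM_LIMIT_MIB": 8192,
}
print(node_count_warnings(overrides["NODE_COUNT"]))
print(" ".join(build_install_args(overrides, "cassandra-prod")))
```

The helper only builds the argument list; it does not run anything, so the result can be reviewed before it is passed to a shell or to `subprocess.run`.
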
Common configuration for Cassandra Nodes.
Name | Description | Default |
---|---|---|
NODE_TOLERATIONS | A list of Kubernetes tolerations to let pods get scheduled on tainted nodes. |
Parameters to configure the liveness of a Cassandra Node. When a Node fails liveness checks, it is automatically restarted by Kubernetes.
Name | Description | Default |
---|---|---|
NODE_LIVENESS_PROBE_INITIAL_DELAY_S | Number of seconds after the container has started before the liveness probe is initiated. | 15 |
NODE_LIVENESS_PROBE_PERIOD_S | How often (in seconds) to perform the liveness probe. | 20 |
NODE_LIVENESS_PROBE_TIMEOUT_S | How long (in seconds) to wait for a liveness probe to succeed. | 60 |
NODE_LIVENESS_PROBE_SUCCESS_THRESHOLD | Minimum consecutive successes for the liveness probe to be considered successful after having failed. | 1 |
NODE_LIVENESS_PROBE_FAILURE_THRESHOLD | When a pod starts and the liveness probe fails, failure_threshold attempts will be made before restarting the pod. | 3 |
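
To get a feel for how these values interact, the following sketch estimates how long a continuously failing node runs before Kubernetes restarts it. It is a simplified model, not kubelet's exact behavior: it assumes the first probe starts after the initial delay, probes run one at a time, a probe never takes longer than the timeout, and scheduling jitter is ignored. The function name and the `probe_duration_s` argument are illustrative.

```python
def seconds_until_restart(
    initial_delay_s: float,
    period_s: float,
    timeout_s: float,
    failure_threshold: int,
    probe_duration_s: float = 0.0,
) -> float:
    """Estimate the time from container start to the final failed liveness probe."""
    # A probe is cut off at the timeout, so it never runs longer than that.
    duration = min(probe_duration_s, timeout_s)
    # Assumption: a new probe starts every period, or as soon as the
    # previous one has finished if that takes longer than the period.
    spacing = max(period_s, duration)
    return initial_delay_s + (failure_threshold - 1) * spacing + duration


# Defaults from the table: delay 15 s, period 20 s, timeout 60 s, threshold 3.
fast_fail = seconds_until_restart(15, 20, 60, 3)  # probe fails immediately
hung = seconds_until_restart(15, 20, 60, 3, probe_duration_s=float("inf"))  # probe hangs
print(fast_fail, hung)
```

Under this model a probe that hangs delays the restart considerably compared with one that fails immediately, because every attempt waits for the full timeout.
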
Parameters to configure the readiness of a Cassandra Node. When a Node fails readiness checks, it is automatically removed from the service by Kubernetes.
Name | Description | Default |
---|---|---|
NODE_READINESS_PROBE_INITIAL_DELAY_S | Number of seconds after the container has started before the readiness probe is initiated. | 0 |
NODE_READINESS_PROBE_PERIOD_S | How often (in seconds) to perform the readiness probe. | 5 |
NODE_READINESS_PROBE_TIMEOUT_S | How long (in seconds) to wait for a readiness probe to succeed. | 60 |
NODE_READINESS_PROBE_SUCCESS_THRESHOLD | Minimum consecutive successes for the readiness probe to be considered successful after having failed. | 1 |
NODE_READINESS_PROBE_FAILURE_THRESHOLD | When a pod starts and the readiness probe fails, failure_threshold attempts will be made before marking the pod as 'unready'. | 3 |
Configuration related to backup and restore of the Cassandra Cluster.
Name | Description | Default |
---|---|---|
BACKUP_RESTORE_ENABLED | Global flag that enables the Medusa sidecar for backups. | False |
BACKUP_TRIGGER | Trigger parameter to start a backup. Simply needs to be changed from the current value to start a backup. | |
BACKUP_AWS_CREDENTIALS_SECRET | If set, can be used to provide the access_key, secret_key and security_token with a secret. | |
BACKUP_AWS_S3_BUCKET_NAME | The name of the AWS S3 bucket to store the backups. | |
BACKUP_AWS_S3_STORAGE_PROVIDER | Should be one of the s3_* values from https://github.com/apache/libcloud/blob/trunk/libcloud/storage/types.py . | s3_us_west_oregon |
BACKUP_PREFIX | If a prefix is given, multiple different backups can be stored in the same S3 bucket. | |
BACKUP_MEDUSA_CPU_MC | CPU request for the Medusa backup containers. | 100 |
BACKUP_MEDUSA_CPU_LIMIT_MC | CPU limit for the Medusa backup containers. | 500 |
BACKUP_MEDUSA_MEM_MIB | Memory request for the Medusa backup containers. | 256 |
BACKUP_MEDUSA_MEM_LIMIT_MIB | Memory limit for the Medusa backup containers. | 512 |
BACKUP_MEDUSA_DOCKER_IMAGE | Medusa backup Docker image which is used to make backups. | mesosphere/kudo-cassandra-medusa:0.6.0-1.0.3 |
BACKUP_MEDUSA_DOCKER_IMAGE_PULL_POLICY | The Pull policy for the Medusa Docker Image. | Always |
BACKUP_NAME | The name of the backup to create or restore. |
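
Because BACKUP_TRIGGER only has to differ from its current value, any scheme that produces a new value works. The sketch below uses a simple counter; the function name is hypothetical, and how the resulting parameters are applied to the running instance is outside its scope.

```python
from typing import Dict


def next_backup_params(current_trigger: str, backup_name: str) -> Dict[str, str]:
    """Return parameter overrides that start a new backup.

    The trigger is treated as a counter: a numeric value is incremented,
    anything else (including an empty value) is replaced by "1".
    Either way the new value differs from the current one.
    """
    counter = int(current_trigger) + 1 if current_trigger.isdigit() else 1
    return {"BACKUP_NAME": backup_name, "BACKUP_TRIGGER": str(counter)}


# First backup on a fresh instance (trigger is still empty), then a second one.
first = next_backup_params("", "nightly-1")
second = next_backup_params(first["BACKUP_TRIGGER"], "nightly-2")
print(first, second)
```

A timestamp would serve equally well as a trigger value; the counter is used here only because it is deterministic.
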
Options only required if a backup should be restored on installation.
Name | Description | Default |
---|---|---|
RESTORE_FLAG | When this is true, all backup configuration must point to an existing backup which is restored as a new cluster. | False |
RESTORE_OLD_NAMESPACE | The namespace from the operator that was used to create the backup. | |
RESTORE_OLD_NAME | The instance name from the operator that was used to create the backup. |
Allow access to the Cassandra Cluster from outside Kubernetes.
Name | Description | Default |
---|---|---|
EXTERNAL_SERVICE | Needs to be true for either EXTERNAL_NATIVE_TRANSPORT or EXTERNAL_RPC to work. | False |
EXTERNAL_NATIVE_TRANSPORT | This exposes the Cassandra cluster via an external service so it can be accessed from outside the Kubernetes cluster. | False |
EXTERNAL_RPC | This exposes the Cassandra cluster via an external service so it can be accessed from outside the Kubernetes cluster. Works only if START_RPC is true. | False |
EXTERNAL_NATIVE_TRANSPORT_PORT | The external port to use for Cassandra native transport protocol. | 9042 |
EXTERNAL_RPC_PORT | The external port to use for Cassandra rpc protocol. | 9160 |
EXTERNAL_SERVICE_ANNOTATIONS | Custom annotations for the external service. |
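
The dependencies between these flags are easy to get wrong, so a small pre-flight check can help. This is a minimal sketch of the rules stated in the table; the function name is illustrative and parameter values are modeled as Python booleans.

```python
from typing import Dict, List


def external_access_problems(params: Dict[str, bool]) -> List[str]:
    """Return a list of inconsistencies in the external access parameters."""
    problems = []
    external_service = params.get("EXTERNAL_SERVICE", False)

    if params.get("EXTERNAL_NATIVE_TRANSPORT", False) and not external_service:
        problems.append("EXTERNAL_NATIVE_TRANSPORT requires EXTERNAL_SERVICE=true")

    if params.get("EXTERNAL_RPC", False):
        if not external_service:
            problems.append("EXTERNAL_RPC requires EXTERNAL_SERVICE=true")
        if not params.get("START_RPC", False):
            problems.append("EXTERNAL_RPC requires START_RPC=true")

    return problems


# EXTERNAL_RPC is requested, but Thrift RPC itself is not started.
print(external_access_problems({"EXTERNAL_SERVICE": True, "EXTERNAL_RPC": True}))
```
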
Metrics can be exported with the Prometheus Metrics Exporter.
Name | Description | Default |
---|---|---|
PROMETHEUS_EXPORTER_ENABLED | A toggle to enable the Prometheus metrics exporter. | False |
PROMETHEUS_EXPORTER_CUSTOM_CONFIG_CM_NAME | The properties present in this ConfigMap will be appended to the Prometheus configuration properties. | |
PROMETHEUS_EXPORTER_PORT | Prometheus exporter port. | 7200 |
PROMETHEUS_EXPORTER_CPU_MC | CPU request for the Prometheus exporter containers. | 500 |
PROMETHEUS_EXPORTER_CPU_LIMIT_MC | CPU limit for the Prometheus exporter containers. | 1000 |
PROMETHEUS_EXPORTER_MEM_MIB | Memory request for the Prometheus exporter containers. | 512 |
PROMETHEUS_EXPORTER_MEM_LIMIT_MIB | Memory limit for the Prometheus exporter containers. | 512 |
PROMETHEUS_EXPORTER_DOCKER_IMAGE | The Docker image of the Prometheus exporter. | mesosphere/cassandra-prometheus-exporter:2.3.4-1.0.3 |
PROMETHEUS_EXPORTER_DOCKER_IMAGE_PULL_POLICY | Prometheus exporter Docker image pull policy. | Always |
The Recovery Controller allows the Cluster to autoheal when a Kubernetes node fails.
Name | Description | Default |
---|---|---|
RECOVERY_CONTROLLER | Needs to be true for automatic failure recovery and node eviction. | False |
RECOVERY_CONTROLLER_DOCKER_IMAGE | Docker image for the recovery controller. | mesosphere/kudo-cassandra-recovery:0.0.2-1.0.3 |
RECOVERY_CONTROLLER_DOCKER_IMAGE_PULL_POLICY | Recovery controller Docker image pull policy. | Always |
RECOVERY_CONTROLLER_CPU_MC | CPU request for the Recovery controller container. | 50 |
RECOVERY_CONTROLLER_CPU_LIMIT_MC | CPU limit for the Recovery controller container. | 200 |
RECOVERY_CONTROLLER_MEM_MIB | Memory request for the Recovery controller container. | 50 |
RECOVERY_CONTROLLER_MEM_LIMIT_MIB | Memory limit for the Recovery controller container. | 256 |
Options to repair a node in the cluster.
Name | Description | Default |
---|---|---|
REPAIR_POD | Name of the pod on which 'nodetool repair' should be run. |
Advanced configuration that is only required for very advanced use cases.
Name | Description | Default |
---|---|---|
BOOTSTRAP_TIMEOUT | Timeout for the bootstrap binary to join the cluster with the new IP. | 12h30m |
SHUTDOWN_OLD_REACHABLE_NODE | When a node replace is done, try to connect to the old node and shut it down before starting up the new node. | False |
JOLOKIA_PORT | The internal port for the Jolokia Agent. This port is not exposed, but can be changed if it conflicts with another port. | 7777 |
CUSTOM_CASSANDRA_YAML_BASE64 | Base64-encoded Cassandra properties are appended to cassandra.yaml and overwrite the default values. | |
KUBECTL_VERSION | Version of 'bitnami/kubectl' image. This image is used for some functionality of the operator. | 1.18.4 |
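
CUSTOM_CASSANDRA_YAML_BASE64 expects the extra cassandra.yaml properties as a single Base64 string. A minimal sketch of producing that string with Python's standard library follows; the property and value in the snippet are only an example, and the function names are illustrative.

```python
import base64


def encode_cassandra_overrides(yaml_text: str) -> str:
    """Base64-encode a cassandra.yaml fragment for CUSTOM_CASSANDRA_YAML_BASE64."""
    return base64.b64encode(yaml_text.encode("utf-8")).decode("ascii")


def decode_cassandra_overrides(encoded: str) -> str:
    """Reverse the encoding, useful for checking what is currently configured."""
    return base64.b64decode(encoded).decode("utf-8")


# Example fragment; pick properties that are valid for your Cassandra version.
fragment = "max_value_size_in_mb: 512\n"
encoded = encode_cassandra_overrides(fragment)
print(encoded)
print(decode_cassandra_overrides(encoded) == fragment)
```
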
Advanced configuration options for Cassandra nodes. These settings are not commonly modified. See https://cassandra.apache.org/doc/latest/configuration/cassandra_config_file.html and other Cassandra documentation for details on parameters.
Name | Description | Default |
---|---|---|
STORAGE_PORT | The port for inter-node communication. | 7000 |
SSL_STORAGE_PORT | The port for inter-node communication over SSL. | 7001 |
START_NATIVE_TRANSPORT | If true, CQL is enabled. | True |
NATIVE_TRANSPORT_PORT | The port for CQL communication. | 9042 |
NATIVE_TRANSPORT_MAX_THREADS | The maximum number of threads handling requests. | |
NATIVE_TRANSPORT_MAX_FRAME_SIZE_IN_MB | The maximum allowed size of a frame. If you're changing this parameter, you may want to adjust max_value_size_in_mb accordingly. This should be positive and less than 2048. | |
NATIVE_TRANSPORT_MAX_CONCURRENT_CONNECTIONS | The maximum number of concurrent client connections. Defaults to -1, meaning unlimited. | |
NATIVE_TRANSPORT_MAX_CONCURRENT_CONNECTIONS_PER_IP | The maximum number of concurrent client connections per source IP address. Defaults to -1, meaning unlimited. | |
RPC_PORT | The port for Thrift RPC communication. | 9160 |
JMX_PORT | The JMX port that will be used to interface with the Cassandra application. | 7199 |
RMI_PORT | The RMI port that will be used to interface with the Cassandra application when TRANSPORT_ENCRYPTION_ENABLED is set. | 7299 |
JMX_LOCAL_ONLY | If true, the JMX port will only be opened on localhost and not be available inside the Kubernetes cluster. | True |
START_RPC | If true, Thrift RPC is enabled. This is deprecated but may be necessary for legacy applications. | False |
RPC_SERVER_TYPE | Cassandra provides two options for the RPC server. sync and hsha performance is about the same, but hsha uses less memory. (https://docs.datastax.com/en/dse/5.1/dse-dev/datastax_enterprise/config/configCassandra_yaml.html#configCassandra_yaml__rpc_server_type). | sync |
RPC_KEEPALIVE | Enables or disables keepalive on client connections (RPC or native). | True |
RPC_MIN_THREADS | The minimum thread pool size for remote procedure calls. | |
RPC_MAX_THREADS | The maximum thread pool size for remote procedure calls. | |
RPC_SEND_BUFF_SIZE_IN_BYTES | The sending socket buffer size in bytes for remote procedure calls. | |
RPC_RECV_BUFF_SIZE_IN_BYTES | The receiving socket buffer size for remote procedure calls. | |
PARTITIONER | The partitioner used to distribute rows across the cluster. Murmur3Partitioner is the recommended setting. RandomPartitioner and ByteOrderedPartitioner are supported for legacy applications. | org.apache.cassandra.dht.Murmur3Partitioner |
SEED_PROVIDER_CLASS | The class within Cassandra that handles the seed logic. (https://cassandra.apache.org/doc/latest/configuration/cassandra_config_file.html#seed-provider) | org.apache.cassandra.locator.SimpleSeedProvider |
NUM_TOKENS | The number of tokens assigned to each node. | 256 |
HINTED_HANDOFF_ENABLED | If true, hinted handoff is enabled for the cluster. (https://cassandra.apache.org/doc/latest/operating/hints.html#hinted-handoff) | True |
MAX_HINT_WINDOW_IN_MS | The maximum amount of time, in ms, that hints are generated for an unresponsive node. | 10800000 |
HINTED_HANDOFF_THROTTLE_IN_KB | The maximum throttle per delivery thread in KBs per second. | 1024 |
MAX_HINTS_DELIVERY_THREADS | The maximum number of delivery threads for hinted handoff. | 2 |
BATCHLOG_REPLAY_THROTTLE_IN_KB | The total maximum throttle for replaying failed logged batches in KBs per second. | 1024 |
MAX_HINTS_FILE_SIZE_IN_MB | The maximum size of a single hints file in MB. | 128 |
COMMITLOG_TOTAL_SPACE_IN_MB | The total size of the commit log in MB. | |
AUTO_SNAPSHOT | Take a snapshot of the data before truncating a keyspace or dropping a table. | True |
MEMTABLE_HEAP_SPACE_IN_MB | The amount of on-heap memory allocated for memtables. | |
MEMTABLE_OFFHEAP_SPACE_IN_MB | The total amount of off-heap memory allocated for memtables. | |
STREAMING_KEEP_ALIVE_PERIOD_IN_SECS | Interval to send keep-alive messages. The stream session fails when a keep-alive message is not received for 2 keep-alive cycles. | |
PHI_CONVICT_THRESHOLD | The sensitivity of the failure detector on an exponential scale. | |
REQUEST_SCHEDULER | The scheduler to handle incoming client requests according to a defined policy. This scheduler is useful for throttling client requests in single clusters containing multiple keyspaces. | org.apache.cassandra.scheduler.NoScheduler |
INCREMENTAL_BACKUPS | Backs up data updated since the last snapshot was taken. When enabled, Cassandra creates a hard link to each SSTable flushed or streamed locally in a backups subdirectory of the keyspace data. | False |
SNAPSHOT_BEFORE_COMPACTION | Enables or disables taking a snapshot before each compaction. A snapshot is useful to back up data when there is a data format change. | False |
COMMIT_FAILURE_POLICY | Policy for commit disk failures. (https://cassandra.apache.org/doc/latest/configuration/cassandra_config_file.html#commit-failure-policy). | stop |
INDEX_SUMMARY_CAPACITY_IN_MB | Fixed memory pool size in MB for SSTable index summaries. | |
ENDPOINT_SNITCH | Set to a class that implements the IEndpointSnitch interface. Cassandra uses the snitch to locate nodes and route requests. | SimpleSnitch |
DISK_FAILURE_POLICY | The policy for how Cassandra responds to disk failure. (https://cassandra.apache.org/doc/latest/configuration/cassandra_config_file.html#disk-failure-policy). | stop |
ENABLE_USER_DEFINED_FUNCTIONS | User defined functions (UDFs) present a security risk, since they are executed on the server side. UDFs are executed in a sandbox to contain the execution of malicious code. | False |
ENABLE_SCRIPTED_USER_DEFINED_FUNCTIONS | Java UDFs are always enabled, if enable_user_defined_functions is true. Enable this option to use UDFs with language javascript or any custom JSR-223 provider. This option has no effect if enable_user_defined_functions is false. | False |
ENABLE_MATERIALIZED_VIEWS | Enables materialized view creation on this node. Materialized views are considered experimental and are not recommended for production use. | False |
CDC_ENABLED | Enable / disable CDC functionality on a per-node basis. This modifies the logic used for write path allocation rejection. | False |
CDC_TOTAL_SPACE_IN_MB | Total space to use for change-data-capture (CDC) logs on disk. | |
CDC_FREE_SPACE_CHECK_INTERVAL_MS | Interval between checks for new available space for CDC-tracked tables when the cdc_total_space_in_mb threshold is reached and the CDCCompactor is running behind or experiencing back pressure. | |
COLUMN_INDEX_SIZE_IN_KB | The granularity of the index of rows within a partition. For huge rows, decrease this setting to improve seek time. If you use key cache, be careful not to make this setting too large because key cache will be overwhelmed. | 64 |
ALLOCATE_TOKENS_FOR_KEYSPACE | Triggers automatic allocation of num_tokens tokens for this node. The allocation algorithm attempts to choose tokens in a way that optimizes replicated load over the nodes in the datacenter for the replication strategy used by the specified keyspace. | |
REPAIR_SESSION_MAX_TREE_DEPTH | Limits the maximum Merkle tree depth to avoid consuming too much memory during repairs. | |
ENABLE_SASI_INDEXES | Enables SASI index creation on this node. SASI indexes are considered experimental and are not recommended for production use. | |
JVM_OPT_JOIN_RING | Set to false to start Cassandra on a node but not have the node join the cluster. | |
JVM_OPT_LOAD_RING_STATE | Set to false to clear all gossip state for the node on restart. Use when you have changed node information in cassandra.yaml (such as listen_address). | |
JVM_OPT_REPLAYLIST | Allow restoring specific tables from an archived commit log. | |
JVM_OPT_RING_DELAY_MS | Allows overriding of the default RING_DELAY (30000ms), which is the amount of time a node waits before joining the ring. | |
JVM_OPT_WRITE_SURVEY | For testing new compaction and compression strategies. It allows you to experiment with different strategies and benchmark write performance differences without affecting the production workload. | |
JVM_OPT_FORCE_DEFAULT_INDEXING_PAGE_SIZE | To disable dynamic calculation of the page size used when indexing an entire partition (during initial index build/rebuild). If set to true, the page size will be fixed to the default of 10000 rows per page. | |
JVM_OPT_EXPIRATION_DATE_OVERFLOW_POLICY | Defines how to handle INSERT requests with TTL exceeding the maximum supported expiration date. (https://docs.datastax.com/en/dse/6.0/dse-dev/datastax_enterprise/config/cassandraSystemProperties.html) |
Security related settings.
Name | Description | Default |
---|---|---|
TRANSPORT_ENCRYPTION_ENABLED | Enable node-to-node encryption. | False |
TRANSPORT_ENCRYPTION_CLIENT_ENABLED | Enable client-to-node encryption. | False |
TRANSPORT_ENCRYPTION_CIPHERS | Comma-separated list of JSSE Cipher Suite Names. Might require changes to the installed Java Runtime. | TLS_RSA_WITH_AES_128_CBC_SHA,TLS_RSA_WITH_AES_256_CBC_SHA,TLS_DHE_RSA_WITH_AES_128_CBC_SHA,TLS_DHE_RSA_WITH_AES_256_CBC_SHA,TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA,TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA |
TRANSPORT_ENCRYPTION_CLIENT_ALLOW_PLAINTEXT | Enable Server-Client plaintext communication alongside encrypted traffic. | False |
TRANSPORT_ENCRYPTION_REQUIRE_CLIENT_AUTH | Enable client certificate authentication on node-to-node transport encryption. | True |
TRANSPORT_ENCRYPTION_CLIENT_REQUIRE_CLIENT_AUTH | Enable client certificate authentication on client-to-node transport encryption. | True |
TLS_SECRET_NAME | The TLS secret that contains the self-signed certificate (cassandra.crt) and the private key (cassandra.key). The secret will be mounted as a volume to make the artifacts available. | cassandra-tls |
AUTHENTICATOR | Authentication backend, implementing IAuthenticator; used to identify users. (https://cassandra.apache.org/doc/latest/operating/security.html#authentication) | AllowAllAuthenticator |
AUTHENTICATION_SECRET_NAME | The secret must contain the credentials used by the operator when running 'nodetool' for its functionality. Only relevant if AUTHENTICATOR is set to 'PasswordAuthenticator'. The secret needs to have a 'username' and a 'password' entry. | |
AUTHORIZER | Authorization backend, implementing IAuthorizer; used to limit access/provide permissions. (https://cassandra.apache.org/doc/latest/operating/security.html#authorization) | AllowAllAuthorizer |
ROLE_MANAGER | Part of the Authentication & Authorization backend that implements IRoleManager to maintain grants and memberships between roles. By default, the value is Apache Cassandra's out-of-the-box role manager, CassandraRoleManager. (https://cassandra.apache.org/doc/latest/operating/security.html#roles) | CassandraRoleManager |
ROLES_VALIDITY_IN_MS | Validity period for roles cache; set to 0 to disable. | 2000 |
ROLES_UPDATE_INTERVAL_IN_MS | After this interval, cache entries become eligible for refresh. Upon next access, Cassandra schedules an async reload, and returns the old value until the reload completes. If roles_validity_in_ms is non-zero, then this must be also. | |
CREDENTIALS_VALIDITY_IN_MS | This cache is tightly coupled to the provided PasswordAuthenticator implementation of IAuthenticator. If another IAuthenticator implementation is configured, Cassandra does not use this cache, and these settings have no effect. Set to 0 to disable. | 2000 |
CREDENTIALS_UPDATE_INTERVAL_IN_MS | After this interval, cache entries become eligible for refresh. The next time the cache is accessed, the system schedules an asynchronous reload of the cache. Until this cache reload is complete, the cache returns the old values. If credentials_validity_in_ms is nonzero, this property must also be nonzero. | |
PERMISSIONS_VALIDITY_IN_MS | How many milliseconds permissions in cache remain valid. Fetching permissions can be resource intensive. To disable the cache, set this to 0. | 2000 |
PERMISSIONS_UPDATE_INTERVAL_IN_MS | If enabled, sets refresh interval for the permissions cache. After this interval, cache entries become eligible for refresh. On next access, Cassandra schedules an async reload and returns the old value until the reload completes. If permissions_validity_in_ms is nonzero, permissions_update_interval_in_ms must also be non-zero. | |
INTERNODE_AUTHENTICATOR | The internode authentication backend. | |
JVM_OPT_DISABLE_AUTH_CACHES_REMOTE_CONFIGURATION | To disable configuration via JMX of auth caches (such as those for credentials, permissions and roles). This will mean those config options can only be set (persistently) in cassandra.yaml and will require a restart for new values to take effect. |
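
The roles, credentials and permissions caches share one rule: when the validity period is non-zero, the update interval must be non-zero too. The check below is a small sketch of that rule. It only flags an update interval that is explicitly set to 0 and leaves unset values alone; the function name is illustrative.

```python
from typing import Dict, List, Optional

# (validity parameter, update interval parameter) for each auth cache.
CACHE_PAIRS = [
    ("ROLES_VALIDITY_IN_MS", "ROLES_UPDATE_INTERVAL_IN_MS"),
    ("CREDENTIALS_VALIDITY_IN_MS", "CREDENTIALS_UPDATE_INTERVAL_IN_MS"),
    ("PERMISSIONS_VALIDITY_IN_MS", "PERMISSIONS_UPDATE_INTERVAL_IN_MS"),
]


def cache_interval_problems(params: Dict[str, Optional[int]]) -> List[str]:
    """Flag caches that are enabled but have an update interval of 0."""
    problems = []
    for validity_key, update_key in CACHE_PAIRS:
        validity = params.get(validity_key)
        update = params.get(update_key)
        if validity and update == 0:
            problems.append(f"{update_key} must be non-zero when {validity_key} is non-zero")
    return problems


print(cache_interval_problems({
    "ROLES_VALIDITY_IN_MS": 2000,
    "ROLES_UPDATE_INTERVAL_IN_MS": 0,
}))
```
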
Cache related settings.
Name | Description | Default |
---|---|---|
KEY_CACHE_SIZE_IN_MB | A global cache setting for the maximum size of the key cache in memory (for all tables). | |
KEY_CACHE_SAVE_PERIOD | The duration in seconds that keys are saved in cache. Saved caches greatly improve cold-start speeds and have relatively little effect on I/O. | 14400 |
KEY_CACHE_KEYS_TO_SAVE | The number of keys from the key cache to save. | |
COUNTER_CACHE_SIZE_IN_MB | Maximum size of the counter cache in memory. When no value is set, Cassandra uses the smaller of 2.5% of heap or 50 MB. | |
COUNTER_CACHE_SAVE_PERIOD | The amount of time after which Cassandra saves the counter cache (keys only). | 7200 |
COUNTER_CACHE_KEYS_TO_SAVE | The number of keys from the counter cache to save. | |
ROW_CACHE_SIZE_IN_MB | Maximum size of the row cache in memory. Row cache can save more time than key_cache_size_in_mb, but is space-intensive because it contains the entire row. Use the row cache only for hot rows or static rows. 0 disables the row cache. | 0 |
ROW_CACHE_SAVE_PERIOD | Duration in seconds that rows are saved in cache. 0 disables saving. | 0 |
ROW_CACHE_KEYS_TO_SAVE | The number of keys from the row cache to save. | |
FILE_CACHE_SIZE_IN_MB | The total memory to use for SSTable-reading buffers. Defaults to the smaller of 1/4 of heap or 512MB. | |
ROW_CACHE_CLASS_NAME | Row cache implementation class name. Can be 'org.apache.cassandra.cache.OHCProvider' (default) or 'org.apache.cassandra.cache.SerializingCacheProvider'. | |
BUFFER_POOL_USE_HEAP_IF_EXHAUSTED | Allocate on-heap memory when the SSTable buffer pool is exhausted. | |
PREPARED_STATEMENTS_CACHE_SIZE_MB | Maximum size of the native protocol prepared statement cache. | |
THRIFT_PREPARED_STATEMENTS_CACHE_SIZE_MB | Maximum size of the Thrift prepared statement cache. Leave empty if you do not use Thrift/RPC. | |
COLUMN_INDEX_CACHE_SIZE_IN_KB | A threshold for the total size of all index entries for a partition that the database stores in the partition key cache. | 2 |
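
Two of the caches above derive their default size from the JVM heap when no value is set. The arithmetic from the descriptions of COUNTER_CACHE_SIZE_IN_MB and FILE_CACHE_SIZE_IN_MB can be sketched as follows; the function names are illustrative, and the heap sizes are example inputs.

```python
def default_counter_cache_mb(heap_mb: float) -> float:
    """Smaller of 2.5% of the heap or 50 MB."""
    return min(heap_mb / 40, 50)


def default_file_cache_mb(heap_mb: float) -> float:
    """Smaller of 1/4 of the heap or 512 MB."""
    return min(heap_mb / 4, 512)


# Example heap sizes in MB.
for heap_mb in (1024, 4096):
    print(heap_mb, default_counter_cache_mb(heap_mb), default_file_cache_mb(heap_mb))
```

With a 1024 MB heap both caches stay below their caps; with a 4096 MB heap both defaults hit the cap (50 MB and 512 MB respectively).
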
Options related to Rack Awareness and Multi-Datacenter options.
Name | Description | Default |
---|---|---|
NODE_TOPOLOGY | This describes a multi-datacenter setup. When set it has precedence over NODE_COUNT. See docs/multidatacenter.md for more details. | |
SERVICE_ACCOUNT_INSTALL | If true, the operator automatically installs a cluster role, service account and role binding. This is required for advanced functionality of multi-datacenter setups. | False |
NODE_ANTI_AFFINITY | Ensure that every Cassandra node is deployed on a separate Kubernetes node. (https://kubernetes.io/docs/concepts/scheduling-eviction/assign-pod-node/#inter-pod-affinity-and-anti-affinity) | False |
EXTERNAL_SEED_NODES | List of seed nodes external to this instance to add to the cluster. This allows clusters spanning multiple Kubernetes clusters. | |
INTER_DC_TCP_NODELAY | Enable or disable tcp_nodelay for inter-dc communication. Disabling it will result in larger (but fewer) network packets being sent, reducing overhead from the TCP protocol itself, at the cost of increasing latency if you block for cross-datacenter responses. |
Settings to fine-tune the Cassandra cluster and optimize the performance.
Name | Description | Default |
---|---|---|
COMMITLOG_SYNC | The method Cassandra uses to acknowledge writes, such as periodic or batch. | periodic |
COMMITLOG_SYNC_PERIOD_IN_MS | The number of milliseconds between disk fsync calls. | 5000 |
COMMITLOG_SYNC_BATCH_WINDOW_IN_MS | Time (in ms) to wait between batch fsyncs. If commitlog_sync is set to batch, the default value should be 2. | |
COMMITLOG_SEGMENT_SIZE_IN_MB | The size of each commit log segment in MB. | 32 |
CONCURRENT_READS | For workloads with more data than can fit in memory, the bottleneck is reads fetching data from disk. Setting to (16 times the number of drives) allows operations to queue low enough in the stack so that the OS and drives can reorder them. | 16 |
CONCURRENT_WRITES | Writes in Cassandra are rarely I/O bound, so the ideal number of concurrent writes depends on the number of CPU cores in your system. The recommended value is 8 times the number of CPU cores. | 32 |
CONCURRENT_COUNTER_WRITES | Counter writes read the current values before incrementing and writing them back. The recommended value is 16 times the number of drives. | 16 |
CONCURRENT_MATERIALIZED_VIEW_WRITES | The maximum number of concurrent writes to materialized views. | 32 |
MEMTABLE_ALLOCATION_TYPE | The type of allocations for the Cassandra memtable. heap_buffers keep all data on the JVM heap. offheap_buffers may reduce heap utilization for large string or binary values. offheap_objects may improve heap size for small integers or UUIDs as well. Both off-heap options will increase read latency. (https://cassandra.apache.org/doc/latest/configuration/cassandra_config_file.html#memtable-allocation-type). | heap_buffers |
INDEX_SUMMARY_RESIZE_INTERVAL_IN_MINUTES | How frequently index summaries should be re-sampled in minutes. This is done periodically to redistribute memory from the fixed-size pool to SSTables proportional to their recent read rates. | 60 |
TRICKLE_FSYNC | When set to true, causes fsync to force the operating system to flush the dirty buffers at the set interval. | false |
TRICKLE_FSYNC_INTERVAL_IN_KB | The size of the fsync in kilobytes. | 10240 |
COMPACTION_THROUGHPUT_MB_PER_SEC | Throttles compaction to the specified total throughput across the node. Compaction frequency varies with direct proportion to write throughput and is necessary to limit the SSTable size. The recommended value is 16 to 32 times the rate of write throughput (in MB/second). | 64 |
SSTABLE_PREEMPTIVE_OPEN_INTERVAL_IN_MB | When compacting, the replacement opens SSTables before they are completely written and uses them in place of the prior SSTables for any range previously written. This setting helps to smoothly transfer reads between the SSTables by reducing page cache churn and keeps hot rows hot. | 50 |
DYNAMIC_SNITCH_UPDATE_INTERVAL_IN_MS | The time, in ms, the snitch will wait before updating node scores. | 100 |
DYNAMIC_SNITCH_RESET_INTERVAL_IN_MS | The time, in ms, the snitch will wait before resetting node scores allowing bad nodes to recover. | 600000 |
DYNAMIC_SNITCH_BADNESS_THRESHOLD | Sets the performance threshold for dynamically routing client requests away from a poorly performing node. | 0.1 |
HINTS_FLUSH_PERIOD_IN_MS | The time, in ms, for the period in which hints are flushed to disk. | 10000 |
MEMTABLE_CLEANUP_THRESHOLD | The ratio used for automatic memtable flush. | |
MEMTABLE_FLUSH_WRITERS | The number of memtable flush writer threads. | |
CONCURRENT_COMPACTORS | The number of concurrent compaction processes allowed to run simultaneously on a node. | |
DISK_OPTIMIZATION_STRATEGY | The strategy for optimizing disk reads. (https://cassandra.apache.org/doc/latest/configuration/cassandra_config_file.html#disk-optimization-strategy). | |
TRACETYPE_QUERY_TTL | TTL for different trace types used during logging of the query process. | |
TRACETYPE_REPAIR_TTL | TTL for different trace types used during logging of the repair process. | |
JVM_OPT_AVAILABLE_PROCESSORS | In a multi-instance deployment, each Cassandra instance independently assumes that all CPU processors are available to it. This setting allows you to specify a smaller set of processors and perhaps have affinity. |
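
The sizing rules quoted in this table (16 times the number of drives for reads and counter writes, 8 times the number of CPU cores for writes, and 16 to 32 times the write rate for compaction throughput) are simple multiplications. A small sketch, with illustrative function names and example hardware figures:

```python
from typing import Dict, Tuple


def recommended_concurrency(num_drives: int, num_cpu_cores: int) -> Dict[str, int]:
    """Apply the rules of thumb from the parameter descriptions."""
    return {
        "CONCURRENT_READS": 16 * num_drives,
        "CONCURRENT_COUNTER_WRITES": 16 * num_drives,
        "CONCURRENT_WRITES": 8 * num_cpu_cores,
    }


def recommended_compaction_throughput(write_mb_per_sec: float) -> Tuple[float, float]:
    """Return the (low, high) range for COMPACTION_THROUGHPUT_MB_PER_SEC."""
    return (16 * write_mb_per_sec, 32 * write_mb_per_sec)


# Example: a node with 2 data drives and 8 cores, writing about 2 MB/s.
print(recommended_concurrency(num_drives=2, num_cpu_cores=8))
print(recommended_compaction_throughput(2))
```

Read the other way around, the table defaults (16 reads, 16 counter writes, 32 writes) correspond to one drive and four CPU cores under these rules.
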
Warning and Failure Thresholds.
Name | Description | Default |
---|---|---|
TOMBSTONE_WARN_THRESHOLD | The maximum number of tombstones a query can scan before warning. | 1000 |
TOMBSTONE_FAILURE_THRESHOLD | The maximum number of tombstones a query can scan before aborting. | 100000 |
BATCH_SIZE_WARN_THRESHOLD_IN_KB | Warn the operator on a batch size exceeding this value in kilobytes. Caution should be taken on increasing the size of this threshold as it can lead to node instability. | 5 |
BATCH_SIZE_FAIL_THRESHOLD_IN_KB | Fail batch sizes exceeding this value in kilobytes. Caution should be taken on increasing the size of this threshold as it can lead to node instability. | 50 |
UNLOGGED_BATCH_ACROSS_PARTITIONS_WARN_THRESHOLD | Causes Cassandra to log a WARN message on any batches not of type LOGGED that span across more partitions than this limit. | 10 |
COMPACTION_LARGE_PARTITION_WARNING_THRESHOLD_MB | Cassandra logs a warning when compacting partitions larger than the set value. | 100 |
GC_WARN_THRESHOLD_IN_MS | Any GC pause longer than this interval is logged at the WARN level. | 1000 |
GC_LOG_THRESHOLD_IN_MS | GC Pauses greater than this interval will be logged at INFO level. This threshold can be adjusted to minimize logging if necessary. | 200 |
MAX_VALUE_SIZE_IN_MB | The maximum size of any value in SSTables. | |
SLOW_QUERY_LOG_TIMEOUT_IN_MS | How long before a node logs slow queries. Select queries that exceed this value generate an aggregated log message to identify slow queries. To disable, set to 0. | 500 |
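
The warn and fail thresholds above work in pairs: a value above the warn threshold is logged, and a value above the fail threshold is rejected. The following is a minimal sketch of that relationship using the defaults from the table. The `classify` helper is hypothetical and is not part of Cassandra or the Operator. The strict "greater than" comparison is also an assumption about how the limits are applied.

```python
# Hypothetical helper: illustrates how a warn/fail threshold pair is read.
# It is not part of Cassandra or the Operator.
def classify(observed, warn_threshold, fail_threshold=None):
    """Return 'ok', 'warn', or 'fail' for an observed value."""
    # Assumption: a limit is exceeded only when the value is strictly greater.
    if fail_threshold is not None and observed > fail_threshold:
        return "fail"
    if observed > warn_threshold:
        return "warn"
    return "ok"


# Defaults copied from the table above.
TOMBSTONE_WARN_THRESHOLD = 1000
TOMBSTONE_FAILURE_THRESHOLD = 100000
BATCH_SIZE_WARN_THRESHOLD_IN_KB = 5
BATCH_SIZE_FAIL_THRESHOLD_IN_KB = 50

# A query that scans 1500 tombstones passes the warn limit only.
print(classify(1500, TOMBSTONE_WARN_THRESHOLD, TOMBSTONE_FAILURE_THRESHOLD))

# A 64 KB batch passes both batch size limits.
print(classify(64, BATCH_SIZE_WARN_THRESHOLD_IN_KB, BATCH_SIZE_FAIL_THRESHOLD_IN_KB))
```

Parameters that only have a warn level, such as GC_WARN_THRESHOLD_IN_MS, can be checked by leaving out the third argument.
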
Network related settings.
Name | Description | Default |
---|---|---|
THRIFT_FRAMED_TRANSPORT_SIZE_IN_MB | Frame size (maximum field length) for Thrift. | 15 |
INTERNODE_SEND_BUFF_SIZE_IN_BYTES | Set the socket send buffer size for internode communication. Note that when setting this, the buffer size is limited by net.core.wmem_max, and when not setting it, it is defined by net.ipv4.tcp_wmem. | |
INTERNODE_RECV_BUFF_SIZE_IN_BYTES | Set the socket receive buffer size for internode communication. Note that when setting this, the buffer size is limited by net.core.rmem_max, and when not setting it, it is defined by net.ipv4.tcp_rmem. | |
INTERNODE_COMPRESSION | Controls whether traffic between nodes is compressed. all compresses all traffic. none compresses no traffic. dc compresses between datacenters. | dc |
OTC_COALESCING_STRATEGY | The strategy to use for coalescing network messages. (https://cassandra.apache.org/doc/latest/configuration/cassandra_config_file.html#otc-coalescing-strategy). | |
OTC_COALESCING_WINDOW_US | How many microseconds to wait for coalescing. | |
OTC_COALESCING_ENOUGH_COALESCED_MESSAGES | Do not try to coalesce messages if this many messages have already been received. This should be more than 2 and less than 128. | |
OTC_BACKLOG_EXPIRATION_INTERVAL_MS | How many milliseconds to wait between two expiration runs on the backlog (queue) of the OutboundTcpConnection. | |
LISTEN_ON_BROADCAST_ADDRESS | Listen on the address set in broadcast_address property. | |
STREAM_THROUGHPUT_OUTBOUND_MEGABITS_PER_SEC | The maximum throughput of all outbound streaming file transfers on a node. | |
INTER_DC_STREAM_THROUGHPUT_OUTBOUND_MEGABITS_PER_SEC | The maximum throughput of all streaming file transfers between datacenters. | |
JVM_OPT_PREFER_IPV4_STACK | Prefer binding to IPv4 network interfaces (when net.ipv6.bindv6only=1). See http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6342561 (short version: comment out this entry to enable IPv6 support). | True |
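
The three INTERNODE_COMPRESSION values described above can be read as a simple decision per connection. The sketch below illustrates that reading only. The function name and the datacenter labels are hypothetical and do not correspond to any Cassandra API.

```python
# Hypothetical helper: models the INTERNODE_COMPRESSION values from the table.
def compress_internode_traffic(setting, local_dc, remote_dc):
    """Return True if traffic between two nodes would be compressed."""
    if setting == "all":
        return True  # all traffic is compressed
    if setting == "none":
        return False  # no traffic is compressed
    if setting == "dc":
        # only traffic that crosses a datacenter boundary is compressed
        return local_dc != remote_dc
    raise ValueError(f"unknown INTERNODE_COMPRESSION value: {setting!r}")


# With the default "dc", nodes in the same datacenter exchange uncompressed
# traffic, while nodes in different datacenters exchange compressed traffic.
print(compress_internode_traffic("dc", "dc1", "dc1"))
print(compress_internode_traffic("dc", "dc1", "dc2"))
```
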
Timeouts.
Name | Description | Default |
---|---|---|
READ_REQUEST_TIMEOUT_IN_MS | The time that the coordinator waits for read operations to complete in ms. | 5000 |
RANGE_REQUEST_TIMEOUT_IN_MS | The time that the coordinator waits for range scans to complete in ms. | 10000 |
WRITE_REQUEST_TIMEOUT_IN_MS | The time that the coordinator waits for write operations to complete in ms. | 2000 |
COUNTER_WRITE_REQUEST_TIMEOUT_IN_MS | The time that the coordinator waits for counter write operations to complete in ms. | 5000 |
CAS_CONTENTION_TIMEOUT_IN_MS | The time for which the coordinator will retry CAS operations on the same row in ms. | 1000 |
TRUNCATE_REQUEST_TIMEOUT_IN_MS | The time that the coordinator waits for truncate operations to complete in ms. | 60000 |
REQUEST_TIMEOUT_IN_MS | The default timeout for all other requests in ms. | 10000 |
CROSS_NODE_TIMEOUT | Enables the exchange of operation timeout information between nodes (to accurately measure request timeouts). | False |
Settings related to the Java JVM and the Garbage Collector.
Name | Description | Default |
---|---|---|
NODE_MIN_HEAP_SIZE_MB | The minimum JVM heap size in MB. This has a smart default and doesn't need to be explicitly set. | |
NODE_MAX_HEAP_SIZE_MB | The maximum JVM heap size in MB. This has a smart default and doesn't need to be explicitly set. | |
NODE_NEW_GENERATION_HEAP_SIZE_MB | The JVM new generation heap size in MB. | |
JVM_OPT_THREAD_PRIORITY_POLICY | Allows lowering thread priority without being root on Linux. Probably not necessary on Windows, but doesn't harm anything. | 0 |
JVM_OPT_THREAD_STACK_SIZE | Per-thread stack size. | 256k |
JVM_OPT_STRING_TABLE_SIZE | Larger interned string table, for gossip's benefit (CASSANDRA-6410). | 1000003 |
JVM_OPT_SURVIVOR_RATIO | The ratio between eden and each survivor space in the young generation. A value of 6 sets the ratio between a survivor space and eden to 1:6. In other words, each survivor space is one-sixth the size of eden, and thus one-eighth the size of the young generation (not one-seventh, because there are two survivor spaces). | |
JVM_OPT_MAX_TENURING_THRESHOLD | The maximum number of young-generation collections an object can survive, being copied between the survivor spaces each time, before it is promoted to the tenured (old-gen) space. | |
JVM_OPT_CMS_INITIATING_OCCUPANCY_FRACTION | The percentage of the old generation that must be occupied before a CMS GC cycle is triggered. | |
JVM_OPT_USE_CMS_INITIATING_OCCUPANCY_ONLY | If set to true, the GC does not autodetect when to trigger a GC cycle and relies only on the defined CMSInitiatingOccupancyFraction. | |
JVM_OPT_CMS_WAIT_DURATION | Once CMS collection is triggered, it will wait for next young collection to perform initial mark right after. This parameter specifies how long CMS can wait for young collection. | |
JVM_OPT_PARALLEL_GC_THREADS | For systems with > 8 cores, the default ParallelGCThreads is 5/8 the number of logical cores. Otherwise equal to the number of cores when 8 or less. Machines with > 10 cores should try setting these to <= full cores. | |
JVM_OPT_CONC_GC_THREADS | By default, ConcGCThreads is 1/4 of ParallelGCThreads. Setting both to the same value can reduce STW durations. | |
JVM_OPT_NUMBER_OF_GC_LOG_FILES | GC logging options: NumberOfGCLogFiles. | 10 |
JVM_OPT_GC_LOG_FILE_SIZE | GC logging options: GCLogFileSize. | 10M |
JVM_OPT_GC_LOG_DIRECTORY | GC logging options: the directory in which the GC log files are written. | |
JVM_OPT_PRINT_FLS_STATISTICS | GC logging options: PrintFLSStatistics. | |
CUSTOM_JVM_OPTIONS_BASE64 | Base64-encoded JVM options are appended to the default jvm.options and overwrite existing values there. | |
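
Two rows above lend themselves to a quick check before a value is set. The sketch below first reproduces the JVM_OPT_SURVIVOR_RATIO arithmetic for a given young generation size, then shows one way to produce a value for CUSTOM_JVM_OPTIONS_BASE64. Both helper names are hypothetical. The one-option-per-line layout and the sample flag are assumptions based on the usual jvm.options format, not something this document specifies.

```python
import base64


# Hypothetical helper: splits the young generation into eden and one of the
# two survivor spaces, following the ratio described for JVM_OPT_SURVIVOR_RATIO.
def young_gen_layout(young_gen_mb, survivor_ratio):
    """Return (eden_mb, survivor_mb) for a young generation of the given size."""
    # young generation = eden + 2 survivor spaces, and eden = ratio * survivor
    survivor_mb = young_gen_mb / (survivor_ratio + 2)
    eden_mb = survivor_mb * survivor_ratio
    return eden_mb, survivor_mb


# Hypothetical helper: encodes JVM options for CUSTOM_JVM_OPTIONS_BASE64.
# Assumption: one option per line, as in a regular jvm.options file.
def encode_jvm_options(options):
    text = "\n".join(options) + "\n"
    return base64.b64encode(text.encode("utf-8")).decode("ascii")


# An 800 MB young generation with a ratio of 6: each survivor space gets
# one-eighth of the young generation and eden gets the remaining six-eighths.
eden, survivor = young_gen_layout(800, 6)
print(eden, survivor)

# Sample flag chosen only for illustration.
print(encode_jvm_options(["-XX:+PrintGCDetails"]))
```

Decoding the printed string with any Base64 tool should return the original option lines, which is an easy way to confirm the value before applying it.
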
Directory configurations. These should usually not be changed, as they probably require changes to the used docker image.
Name | Description | Default |
---|---|---|
HINTS_DIRECTORY | Directory where Cassandra should store hints. | |
COMMITLOG_DIRECTORY | Directory where Cassandra should store the commit log. When running on magnetic HDD, this should be on a separate spindle from the data directories. If not set, the default directory is $CASSANDRA_HOME/data/commitlog. | |
CDC_RAW_DIRECTORY | CommitLogSegments are moved to this directory on flush if cdc_enabled: true and the segment contains mutations for a CDC-enabled table. | |
SAVED_CACHES_DIRECTORY | Saved caches. If not set, the default directory is $CASSANDRA_HOME/data/saved_caches. | |
JVM_OPT_TRIGGERS_DIR | Set the default location for the trigger JARs (default: conf/triggers). | |
All parameters that are not assigned to a specific group.
Name | Description | Default |
---|---|---|
PERMISSIONS_CACHE_MAX_ENTRIES | The maximum number of entries that are held by the standard authentication cache and row-level access control (RLAC) cache. | 1000 |