Inter-cluster network bandwidth throttling
Cluster throttlers limit incoming inter-cluster traffic on YTsaurus MR clusters. Throttling applies to RemoteCopy operations and MapReduce operations that read input tables from remote clusters.
For more information about how throttling affects user operations, see Inter-cluster Network Bandwidth Throttling.
Distributed throttler mechanism
Each exec node runs a distributed throttler factory. Each throttler limits the incoming data stream from a specific remote cluster. For example, on a cluster named local, the bandwidth_remote throttler limits the incoming stream from the cluster named remote.
A leader is elected among all exec nodes using a gossip-like protocol. The leader:
- Collects throttler state from all exec nodes.
- Tracks the total inter-cluster bandwidth usage.
- Distributes individual limits to exec nodes.
- Sends controller agents information about inter-cluster bandwidth availability.
If the leader becomes unavailable, a new leader is elected using the gossip-like protocol.
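The text above does not say how the leader splits the total limit among exec nodes. The sketch below shows one possible policy, purely as an illustration: each node gets a share proportional to the rate it last reported, with an equal split when no node reports traffic. The function name `distribute_limits` and the node names are hypothetical, and this is not claimed to be the algorithm YTsaurus uses.

```python
def distribute_limits(total_limit, reported_rates):
    """Split total_limit across nodes in proportion to their reported rates.

    Hypothetical policy for illustration only; the real leader may use a
    different allocation scheme.
    """
    if not reported_rates:
        return {}
    total_rate = sum(reported_rates.values())
    if total_rate <= 0:
        # No node reports traffic: fall back to an equal split.
        share = total_limit / len(reported_rates)
        return {node: share for node in reported_rates}
    return {
        node: total_limit * rate / total_rate
        for node, rate in reported_rates.items()
    }


# Two hypothetical exec nodes share a limit of 1000 bytes per second.
limits = distribute_limits(1000.0, {"node-a": 30.0, "node-b": 10.0})
idle_limits = distribute_limits(100.0, {"node-a": 0.0, "node-b": 0.0})
print(limits)
print(idle_limits)
```

Whatever policy is used, the per-node limits should add up to the configured cluster-wide limit, which is the property this sketch preserves.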
Configuration
The cluster throttlers configuration is stored in the //sys/cluster_throttlers node of the cluster metadata tree. To limit the remote read rate from a specific external cluster, add that cluster to the configuration.
Configuration fields
- enabled: enables or disables throttling. When set to %false, throttlers are not applied.
- update_period: interval (in ms) for polling the //sys/cluster_throttlers node to update the configuration.
- distributed_throttler: distributed throttler settings:
  - limit_update_period: how often (in ms) an exec node sends its state to the leader and receives its local quota.
  - leader_update_period: how often (in ms) an exec node updates information about the current leader.
  - local_throttlers_attribute_update_period: how often (in ms) the local_throttlers attribute in the discovery service is updated (used for introspection).
  - throttler_expiration_time: time (in ms) after which the leader considers a throttler inactive if no updates have been received from it.
- cluster_limits: quotas for each remote cluster, keyed by cluster name:
  - bandwidth: incoming bandwidth limit:
    - limit: maximum throughput in bytes per second.
  - rps: read request rate limit:
    - limit: maximum number of requests per second.
Configuration example
{
    "enabled" = %true;
    "update_period" = 5000;
    "distributed_throttler" = {
        "leader_update_period" = 5000;
        "throttler_expiration_time" = 60000;
        "limit_update_period" = 1000;
        "local_throttlers_attribute_update_period" = 5000;
    };
    "cluster_limits" = {
        "remote_1" = {
            "bandwidth" = {
                "limit" = 549755813888; // 512 GiB/s
            };
        };
        "remote_2" = {
            "bandwidth" = {
                "limit" = 4294967296; // 4 GiB/s
            };
        };
    };
}
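Because the limits are raw byte counts, it is easy to be off by a factor of 1024 when typing them by hand. The following is a minimal sketch that computes them from GiB/s values and assembles the same structure as the example above as a Python dictionary. The helper name `make_cluster_limits` is hypothetical, and the result is printed as JSON only for inspection.

```python
import json

GIB = 2 ** 30  # bytes in one gibibyte


def make_cluster_limits(gib_per_second_by_cluster):
    """Build the cluster_limits map from per-cluster limits given in GiB/s."""
    return {
        cluster: {"bandwidth": {"limit": gib_per_second * GIB}}
        for cluster, gib_per_second in gib_per_second_by_cluster.items()
    }


config = {
    "enabled": True,
    "update_period": 5000,
    "distributed_throttler": {
        "leader_update_period": 5000,
        "throttler_expiration_time": 60000,
        "limit_update_period": 1000,
        "local_throttlers_attribute_update_period": 5000,
    },
    "cluster_limits": make_cluster_limits({"remote_1": 512, "remote_2": 4}),
}

print(json.dumps(config, indent=2))
```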
Managing configuration with the CLI
To manage the cluster throttlers configuration using the CLI:
- Create a configuration file with the required settings (see the example above).
- Create the //sys/cluster_throttlers node in the cluster metadata tree:
  yt create document //sys/cluster_throttlers
- Write the configuration to the created node:
  yt set //sys/cluster_throttlers < config_file
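The same two steps can be scripted. The sketch below assumes a client object that exposes `create` and `set` methods in the style of the YTsaurus Python client (`yt.wrapper.YtClient`). To stay runnable without a cluster, it uses a hypothetical `DryRunClient` that only records the calls; the function name `apply_cluster_throttlers_config` is likewise an illustration.

```python
CONFIG_PATH = "//sys/cluster_throttlers"


def apply_cluster_throttlers_config(client, config):
    """Create the document node if needed, then write the configuration."""
    client.create("document", CONFIG_PATH, ignore_existing=True)
    client.set(CONFIG_PATH, config)


class DryRunClient:
    """Hypothetical stand-in for a real client: records calls, sends nothing."""

    def __init__(self):
        self.calls = []

    def create(self, node_type, path, ignore_existing=False):
        self.calls.append(("create", node_type, path, ignore_existing))

    def set(self, path, value):
        self.calls.append(("set", path, value))


client = DryRunClient()
apply_cluster_throttlers_config(client, {"enabled": True, "cluster_limits": {}})
for call in client.calls:
    print(call)
```

With a real cluster, the dry-run object would be replaced by an actual client instance. Check the client documentation for the exact constructor and method arguments before relying on this.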
Introspection
Throttler state on nodes
Throttler state for exec nodes is published under the remote_cluster_throttlers_group group in the discovery service. Each node publishes a local_throttlers attribute with the following fields:
- rate: current throughput in bytes per second.
- limit: quota in bytes per second assigned to this node by the leader.
- queue_byte_size: queue size in bytes.
- quota_exceeded: whether the quota has been exceeded.
- period: limit update period in milliseconds.
To view throttler state on the local cluster, run the following command on that cluster:
yt --proxy local list \
"//sys/discovery_servers/<discovery-server>/orchid/discovery_server/remote_cluster_throttlers_group/@members" \
--attribute local_throttlers --format json | jq -r '.[] | .["$attributes"]["local_throttlers"]'
The command prints one local_throttlers object for each exec node in the cluster. Example output for a single node:
{
"bandwidth_remote_1": {
"rate": 19324406.5,
"limit": 21496801.3,
"queue_byte_size": 0,
"quota_exceeded": false,
"period": 1000
},
"bandwidth_remote_2": {
"rate": 1832910.1,
"limit": 1935680.6,
"queue_byte_size": 0,
"quota_exceeded": false,
"period": 1000
}
}
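To see how close a node is to its quota, the rate can be compared with the limit for each throttler. The following is a minimal sketch that parses the sample object above; the function name `utilization` is hypothetical.

```python
import json

sample = """
{
  "bandwidth_remote_1": {"rate": 19324406.5, "limit": 21496801.3,
                         "queue_byte_size": 0, "quota_exceeded": false, "period": 1000},
  "bandwidth_remote_2": {"rate": 1832910.1, "limit": 1935680.6,
                         "queue_byte_size": 0, "quota_exceeded": false, "period": 1000}
}
"""


def utilization(local_throttlers):
    """Return rate/limit per throttler, or None when the limit is not positive."""
    result = {}
    for name, state in local_throttlers.items():
        limit = state["limit"]
        result[name] = state["rate"] / limit if limit > 0 else None
    return result


node_state = json.loads(sample)
usage = utilization(node_state)
for name, ratio in usage.items():
    print(f"{name}: {ratio:.1%} of quota, exceeded={node_state[name]['quota_exceeded']}")
```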
To view utilization of the network channel from the remote cluster to the local cluster, run the following command on the local cluster:
yt --proxy local list \
"//sys/discovery_servers/<discovery-server>/orchid/discovery_server/remote_cluster_throttlers_group/@members" \
--attribute local_throttlers --format json | jq '[.[] | ."$attributes"."local_throttlers"."bandwidth_remote"."rate"] | add'
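The same aggregation can be done in Python on the JSON produced by the `yt list` part of the command. This sketch assumes each member is an object with a `$attributes` map, which is what the jq expression above relies on. The member names and rates in the sample are made up, and `total_rate` is a hypothetical helper.

```python
def total_rate(members, throttler_name):
    """Sum the current rate of one throttler across all group members."""
    total = 0.0
    for member in members:
        throttlers = member.get("$attributes", {}).get("local_throttlers") or {}
        # Members that do not report this throttler contribute nothing.
        total += throttlers.get(throttler_name, {}).get("rate", 0.0)
    return total


# Made-up sample in the shape of `yt list ... --format json` output.
members = [
    {"$value": "node-a", "$attributes": {"local_throttlers": {"bandwidth_remote": {"rate": 10.0}}}},
    {"$value": "node-b", "$attributes": {"local_throttlers": {"bandwidth_remote": {"rate": 20.0}}}},
    {"$value": "node-c", "$attributes": {}},
]

channel_rate = total_rate(members, "bandwidth_remote")
print(channel_rate)
```

Comparing the sum with the limit configured for that cluster under cluster_limits shows how much of the channel is in use.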
Verifying the leader
To verify that a single leader has been elected in the throttling group, run:
yt list \
"//sys/discovery_servers/<discovery-server>/orchid/discovery_server/remote_cluster_throttlers_group/@members" \
--attribute leader_id --attribute address --format json \
| jq -r '.[] | [.["$attributes"]["leader_id"], .["$attributes"]["address"]] | @tsv' \
| cut -f 1 | sort -u
If the command outputs more than one unique leader_id, this indicates a split-brain condition. Investigate the root cause, for example, network isolation of some nodes.
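For automated monitoring, the same check can be expressed in Python over the JSON returned by `yt list`. The following is a minimal sketch with made-up member data; the helper name `find_leader_ids` and all identifiers in the sample are hypothetical.

```python
def find_leader_ids(members):
    """Map each reported leader_id to the addresses of members that report it."""
    leaders = {}
    for member in members:
        attributes = member.get("$attributes", {})
        leader_id = attributes.get("leader_id")
        leaders.setdefault(leader_id, []).append(attributes.get("address"))
    return leaders


# Made-up sample: two members agree on one leader, a third reports another.
members = [
    {"$value": "m1", "$attributes": {"leader_id": "id-1", "address": "node-a:9012"}},
    {"$value": "m2", "$attributes": {"leader_id": "id-1", "address": "node-b:9012"}},
    {"$value": "m3", "$attributes": {"leader_id": "id-2", "address": "node-c:9012"}},
]

leaders = find_leader_ids(members)
split_brain = len(leaders) > 1
if split_brain:
    for leader_id, addresses in leaders.items():
        print(f"leader {leader_id} is reported by: {', '.join(addresses)}")
```

Grouping addresses by the leader they report shows which nodes disagree, which is a useful starting point when looking for network isolation.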