Configuring server component logging

The server components of the YTsaurus cluster generate detailed logs that can be used to audit and analyze problems during operation. For production installations, we recommend allocating dedicated storage locations on persistent volumes for these logs. Absence of logs can significantly complicate support.

You can use Prometheus metrics with the yt_logging_* prefix to analyze the logging subsystem.

Debugging logs

Debugging logs are described in the loggers section of the YTsaurus component specification.

Table 1 — YTsaurus debug logger settings

Field Possible values Description
name arbitrary string The logger name (we recommend choosing short and clear names like debug or info).
format plain_text (default), yson, json The format of the log string.
minLogLevel trace, debug, info, error The minimum level for records that reach the log.
categoriesFilter A filter that only lets you write logs from some subsystems (see below).
writerType file, stderr Write logs to a file or stderr. When writing to stderr, the rotation settings are ignored.
compression none (default), gzip, zstd If a value other than none is set, the YTsaurus server will write compressed logs.
useTimestampSuffix true, false (default) If true, a timestamp is appended to the file name when the file is opened and on each rotation; in that case, the numbering mechanism is not applied to old segments. This option is only relevant when writing to a file.
rotationPolicy Log rotation settings (see below). This option is only relevant when writing to a file.

The path to the directory for logs with writerType=file is set in the Logs type location description. If no Logs location is specified, they are written to /var/log.
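
For example, a minimal fragment of a component specification that writes a compressed debug log with a timestamp suffix to a dedicated Logs location might look as follows (a sketch: the dataNodes component, the logger name, and the /yt/node-logs path are chosen purely for illustration):

dataNodes:
  loggers:
    - name: debug
      writerType: file
      minLogLevel: debug
      compression: zstd
      useTimestampSuffix: true
  locations:
    - locationType: Logs
      path: /yt/node-logs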

Log file names follow the format [component].[name].log(.[format])(.[compression])(.[timestamp_suffix]). Examples:

  • controller-agent.error.log
  • master.debug.log.gzip
  • scheduler.info.log.json.zstd.2023-01-01T10:30:00

Debug log entries contain the following fields:

  • instant — the time in the local time zone
  • level — logging level: T — trace, D — debug, I — info, W — warning, E — error
  • category — the name of the subsystem the entry belongs to (for example, ChunkClient, ObjectServer, or RpcServer)
  • message — message body
  • thread_id — the ID or name of the thread that generated the entry (only written in the plain_text format)
  • fiber_id — the ID of the fiber that generated the entry (only written in the plain_text format)
  • trace_id — the ID of the trace context within which the entry was generated (only written in the plain_text format)
Sample entry
2023-09-15 00:00:17,215385      I       ExecNode        Artifacts prepared (JobId: d15d7d5f-164ff08a-3fe0384-128e0, OperationId: cd56ab80-d21ef5ab-3fe03e8-d05edd49, JobType: Map)      Job     fff6d4149ccdf656    2bd5c3c9-600a44f5-de721d58-fb905017

Recommendations for configuring categories

There are two types of category filters (categoriesFilter):

  • inclusive — records are only written for categories that were explicitly listed
  • exclusive — records are written for any categories except those that are listed

In large installations, you often need to exclude the Bus and Concurrency categories.

Sample filters
categoriesFilter:
  type: exclude
  values: ["Bus", "Concurrency"]

categoriesFilter:
  type: include
  values: ["Scheduler", "Strategy"]

Structured logs

Some YTsaurus components can generate structured logs, which you can later use for auditing, analytics, and automatic processing. Structured logs are described in the structured_loggers section of the YTsaurus component specification.

Structured loggers are described using the same fields as debugging logs, except:

  • writerType — not set (structured logs are always written to a file)
  • categoriesFilter — not set; instead, the mandatory category field specifies exactly one category

Structured logs should always be written in a structured format: JSON or YSON. Events in a structured log are usually recorded at the info level. The set of structured log fields varies depending on the specific log type.

The main types of structured logs:

  • master_access_log — data access log (written on the master, Access category)
  • master_security_log — log of security events like adding a user to a group or modifying an ACL (written on the master, SecurityServer category)
  • structured_http_proxy_log — log of requests to http proxy, one line per request (written on http proxy, HttpStructuredProxy category)
  • chyt_log — log of requests to CHYT, one line per request (written on http proxy, ClickHouseProxyStructured category)
  • structured_rpc_proxy_log — log of requests to rpc proxy, one line per request (written on rpc proxy, RpcProxyStructuredMain category)
  • scheduler_event_log — scheduler event log, written by the scheduler (SchedulerEventLog category)
  • controller_event_log — log of controller agent events, written on the controller agent (ControllerEventLog category)
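
For example, the master access log from the list above could be enabled with a structured logger declared roughly as follows (a sketch built from the fields described in this section; the logger name and the choice of yson format are illustrative):

structuredLoggers:
  - name: access
    format: yson
    minLogLevel: info
    category: Access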

Table access log

Sometimes, you may need to know who's using a particular table, for example, to evaluate the consequences of moving or deleting it. This can be difficult if the table is accessed via symbolic links.

For this, there is a special log that records events involving Cypress nodes that might be of interest to users.

Logs are written by master servers. For technical reasons, several servers produce the same sequence of entries, so duplication is to be expected. In addition, some actions (for example, writing to a table) are represented as a sequence of several different events on different master servers (from different shards). This is covered in more detail below.

Each log entry (table row) contains one command applied to a certain Cypress node.

Only the commands applied to the following types of nodes are recorded:

  • Table
  • File
  • Document
  • Journal

It must be noted that directories are not included in this list.

The following commands are recorded:

  • Basic (CRUD):
    • Create
    • Get
    • GetKey
    • Exists
    • List
    • Set
    • Remove
  • Creating a symbolic link:
    • Link
  • Locking:
    • Lock
    • Unlock
  • Copying and moving:
    • Copy
    • Move
    • BeginCopy, EndCopy
  • Reading and writing data:
    • GetBasicAttributes
    • Fetch
    • BeginUpload
    • EndUpload
  • Changing the state of a dynamic table:
    • PrepareMount, CommitMount, AbortMount
    • PrepareUnmount, CommitUnmount, AbortUnmount
    • PrepareRemount, CommitRemount, AbortRemount
    • PrepareFreeze, CommitFreeze, AbortFreeze
    • PrepareUnfreeze, CommitUnfreeze, AbortUnfreeze
    • PrepareReshard, CommitReshard, AbortReshard
  • Other:
    • CheckPermission

Below you can find some comments on the command semantics.

Each log entry has certain fields (table columns), which are described in Table 2.

Table 2 — Description of access log fields

Field Description
instant Event time in the format YYYY-MM-DD hh:mm:ss,sss
cluster Short cluster name
method Command (see the above list)
path (see Note) Path passed to the command as an argument
original_path (see Note) Path passed to the command as an argument
destination_path (see Note) Destination path for the Copy, Move, and Link commands (not applicable to other commands)
original_destination_path (see Note) Destination path for the Copy, Move, and Link commands (not applicable to other commands)
user User who gave the command
type Type of node created with the Create command (not applicable to other commands)
transaction_info Information about the transaction where the command was executed (not applicable to cases where the command was executed outside of a transaction)

Note

The difference between original_path and path (as well as between original_destination_path and destination_path) is as follows:

  • If a link (symbolic link) was specified as a path, original_path will contain the path to the link, whereas path will contain the actual path to the node.
  • If this path leads to a shard, its log will feature the actual path under original_path, while the path relative to the root of the shard will be written in path.

Overall, this means that grepping by the path of a symbolic link returns entries that also contain the actual path to the node, while grepping by the actual path finds all accesses, including those made via symbolic links. The key takeaway here is that you should search both by path and by original_path.

The structure of the transaction_info field is shown in Table 3.

Table 3 — Structure of the transaction_info field

Field Description
transaction_id Transaction ID
transaction_title Human-readable description of the transaction (specified by the client upon transaction start; the field is missing if the description wasn't specified)
operation_id ID of the operation associated with the transaction
operation_title Human-readable description of the operation associated with the transaction
parent For a nested transaction, the description of its parent (for top-level transactions, the field is missing)

Please note that the parent field is structured the same way as transaction_info. Thus, transaction_info contains the full recursive description of the ancestry of the transaction where the command was run.
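
Schematically, for a command executed in a nested transaction of an operation, the transaction_info field might look as follows (the IDs are hypothetical, the optional title fields are omitted, and the value is rendered in YAML-like form for readability rather than in the log's actual structured format):

transaction_info:
  transaction_id: "1234-5678-3fe0001-abcd"
  operation_id: "8765-4321-3fe03e8-dcba"
  parent:
    transaction_id: "1111-2222-3fe0001-3333"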

Notes

  1. The reading and writing of metadata needs to be distinguished from the reading and writing of data (chunks) to tables and files.
  2. From a master's point of view, data reads/writes look like the following sequence of commands:
    • Reading:
      • GetBasicAttributes: Getting some service attributes necessary for reading.
      • Fetch: Getting a list of chunks that make up the file or table.
    • Writing:
      • GetBasicAttributes: Getting some service attributes necessary for writing.
      • BeginUpload: Starting the upload transaction.
      • EndUpload: Completing the upload transaction.
  3. When reading/writing data, the GetBasicAttributes command targets one cell, while Fetch, BeginUpload, and EndUpload target another — this is normal.
  4. In most cases, copying or moving a table appears in the log as a Copy or Move command. The BeginCopy and EndCopy commands are used when copying/moving crosses Cypress sharding boundaries. In practice, such cases are rare.

HTTP Proxy request log

This log contains entries for all requests handled by the HTTP proxy.

Table 4 — Description of HTTP proxy request log fields

Field Description
instant Event time in the format YYYY-MM-DD hh:mm:ss,sss
cluster Short name of the cluster
request_id Identifier of the request
correlation_id Special request identifier generated by the client and unchanged in case of retries
user User making the request
method HTTP request method
http_path HTTP request path
user_agent Contents of the User-Agent header in the request
command Command
parameters Parameters of the command
path Value of the path parameter
error Structured description of the error if the request failed
error_code Error code if the request failed
http_code HTTP response code for the request
start_time Actual start time of the request execution on the proxy
cpu_time Time spent by the proxy for request execution (excluding time spent in other cluster components)
duration Total duration of the request
in_bytes Size of the request data in bytes
out_bytes Size of the response data in bytes
remote_address Address from which the request originated

Configuring log rotation

For debug and structured logs written to a file, you can configure the built-in rotation mechanism (the rotationPolicy field). The rotation settings are detailed in Table 5. If the useTimestampSuffix option isn't enabled, an index number is appended to the file names of old segments on rotation.

Table 5 — Log rotation settings

Field Description
rotationPeriodMilliseconds Rotation period in milliseconds. Can be set together with maxSegmentSize.
maxSegmentSize Log segment size limit in bytes. Can be set together with rotationPeriodMilliseconds.
maxTotalSizeToKeep Total segment size limit in bytes. At the time of rotation, the oldest logs are deleted to meet the limit.
maxSegmentCountToKeep Limit on the number of stored log segments. The oldest segments over the limit are deleted.
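
For example, a file logger might combine a time-based rotation period with a per-segment size limit and a cap on the number of stored segments (a sketch; the specific values are purely illustrative):

loggers:
  - name: debug
    writerType: file
    minLogLevel: debug
    rotationPolicy:
      rotationPeriodMilliseconds: 900000
      maxSegmentSize: 2_000_000_000
      maxSegmentCountToKeep: 100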

Dynamic configuration

Components that support dynamic configuration let you further refine the logging system settings using the dynamic config in Cypress (logging section).

Basic parameters:

  • enable_anchor_profiling — enables Prometheus metrics for individual record prefixes
  • min_logged_message_rate_to_profile — the minimum message frequency for inclusion in a separate metric
  • suppressed_messages — a list of debug log message prefixes to be excluded from logging

Configuration example:

{
  logging = {
    enable_anchor_profiling = %true;
    min_logged_message_rate_to_profile = 100;
    suppressed_messages = [
      "Skipping out of turn block",
      "Request attempt started",
      "Request attempt acknowledged"
    ];
  }
}

Sample logging settings

  primaryMasters:
    ...
    loggers:
      - name: debug
        compression: zstd
        minLogLevel: debug
        writerType: file
        rotationPolicy:
          maxTotalSizeToKeep: 50_000_000_000
          rotationPeriodMilliseconds: 900000
        categoriesFilter:
          type: exclude
          values: ["Bus", "Concurrency", "ReaderMemoryManager"]
      - name: info
        minLogLevel: info
        writerType: file
        rotationPolicy:
          maxTotalSizeToKeep: 10_000_000_000
          rotationPeriodMilliseconds: 900000
      - name: error
        minLogLevel: error
        writerType: stderr
    structuredLoggers:
      - name: access
        minLogLevel: info
        category: Access
        rotationPolicy:
          maxTotalSizeToKeep: 5_000_000_000
          rotationPeriodMilliseconds: 900000
    locations:
      - locationType: Logs
        path: /yt/logs
      - ...

    volumeMounts:
      - name: master-logs
        mountPath: /yt/logs
      - ...

    volumeClaimTemplates:
      - metadata:
          name: master-logs
        spec:
          accessModes: [ "ReadWriteOnce" ]
          resources:
            requests:
              storage: 100Gi
      - ...