Metadata tree
This section describes Cypress, the metainformation tree. Cypress stores system information as well as references to where user data is located. The section has three subsections covering the general tree view, Cypress node attributes, and TTL for Cypress nodes.
General tree view
To a user, Cypress looks like a Linux file system tree, but with several significant differences. First, every tree node has an associated collection of attributes, including user-defined ones. Second, the tree is transactional. Third, Cypress nodes can be not only files and directories but also other kinds of objects. Like a file system, Cypress supports an access control system.
Cypress is rooted at /, which has the map_node type (that is, it is a directory). Cypress nodes are addressed using YPath.
Example paths: //tmp is the temporary directory, //tmp/@ points to the attributes of that directory, and //tmp/table/@type is the path to the type attribute of the //tmp/table node.
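To make the relationship between a node path and an attribute path concrete, here is a minimal Python sketch that splits the example paths above at the first /@ marker. The helper name split_attribute_path is hypothetical, and the sketch deliberately ignores escaping and every other YPath feature, so it illustrates the addressing idea only and is not a YPath parser.

```python
def split_attribute_path(ypath):
    """Split a simple path into (node_path, attribute_path).

    attribute_path is None when the path has no attribute part and an
    empty string when the path ends with "/@" (all attributes).
    Simplifying assumption: no escaping, no other YPath syntax.
    """
    marker = "/@"
    index = ypath.find(marker)
    if index == -1:
        return ypath, None
    return ypath[:index], ypath[index + len(marker):]


print(split_attribute_path("//tmp"))              # ('//tmp', None)
print(split_attribute_path("//tmp/@"))            # ('//tmp', '')
print(split_attribute_path("//tmp/table/@type"))  # ('//tmp/table', 'type')
```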
Using YPath, you can represent Cypress as follows:
/
    /home
        /user1
            /table
                /@id
                /@chunk_ids
                /@type
                ...
            ...
        /user2
            ...
    /tmp
    /sys
        /chunks
            ...
        ...
    ...
You can manipulate Cypress via a CLI.
Cypress node attributes
In addition to the attributes common to all objects, Cypress nodes have the attributes listed in the table below:
Attribute | Type | Value |
---|---|---|
parent_id | string | Parent node ID (none for the root) |
locks | array<Lock> | List of locks taken on the node |
lock_mode | LockMode | Current node lock mode (transaction-dependent) |
path | string | Absolute path of the node |
key | string | Key used to access this node in its parent directory (if the node is nested in one) |
creation_time | DateTime | Node creation time |
modification_time | DateTime | Time of the most recent node modification |
access_time | DateTime | Time of the most recent node access |
expiration_time | DateTime | Time at which the node is automatically deleted. Optional attribute |
expiration_timeout | DateTime | Timeout after which the node is automatically deleted if it has not been accessed. Optional attribute |
access_counter | integer | Number of times the node has been accessed since its creation |
revision | integer | Node revision |
resource_usage | ClusterResources | Cluster resources used by the node |
recursive_resource_usage | ClusterResources | Cluster resources used by the node and its entire subtree |
account | string | Account used to track the resources consumed by the node |
annotation | string | Human-readable description of the object |
Each node also has its own access control settings, which are exposed through the inherit_acl, acl, and owner attributes. For more information, see Access control system.
Time attributes
The creation_time attribute stores the node creation time. The modification_time attribute stores the time of the most recent update of the node or its attributes. modification_time does not track child node updates, that is, modification_time for a map_node does not change when something changes deeper in the tree.
When a node is created and every time it is modified, the system updates its revision attribute, which stores a non-negative integer. The revision number is guaranteed to increase strictly monotonically over time. You can use revisions to verify that a node has not changed. revision is updated together with modification_time.
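The revision check described above can be illustrated with a toy in-memory model. The ToyNode class and the unchanged_since helper are hypothetical and do not correspond to any Cypress API. The model bumps the revision by one on each change, which is an assumption made for simplicity; the text only promises that revisions strictly increase.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


def _now():
    return datetime.now(timezone.utc)


@dataclass
class ToyNode:
    """Toy stand-in for a node that tracks revision and modification_time."""
    value: object = None
    revision: int = 0
    modification_time: datetime = field(default_factory=_now)

    def set_value(self, value):
        # Every modification updates both attributes together.
        self.value = value
        self.revision += 1  # assumption: +1 step; only "strictly increasing" is guaranteed
        self.modification_time = _now()


def unchanged_since(node, seen_revision):
    """Return True if the node still has the revision observed earlier."""
    return node.revision == seen_revision


node = ToyNode(value="draft")
seen = node.revision
print(unchanged_since(node, seen))  # True
node.set_value("final")
print(unchanged_since(node, seen))  # False
```

A reader that caches node contents could store the revision alongside the cached data and re-read only when the comparison fails.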
The access_time attribute stores the time of the most recent node access. Attribute access does not count. In addition, to improve performance, the system does not update this attribute on every access; instead, it accumulates accesses and updates access_time approximately once per second.
Attention
In rare cases, a node may have been accessed without access_time being updated because of a master server fault.
Most read and write commands support the suppress_access_tracking and suppress_modification_tracking options. The former disables access_time updates, and the latter disables modification_time and revision updates. In particular, the web interface uses suppress_access_tracking, so viewing node contents in the web UI does not trigger access_time updates.
Note
If a transaction creates or modifies a node, the attributes above are set at the time of the update within that transaction. As a result, a node may become visible in parent transactions much later than its creation_time, namely only after the relevant transaction is committed.
Cypress node TTL
Cypress can delete nodes automatically at a specified moment in time or when they have not been accessed for a certain length of time. This feature is controlled by the expiration_time and expiration_timeout attributes. By default, these attributes are absent, so the system does not delete a node automatically. To enable TTL, set one or both of them:
- Set expiration_time to the moment in time when the node is to be deleted. If the node is composite, its entire subtree is deleted as well.
- Set expiration_timeout to the time interval during which the node (and its entire subtree if it is a composite node) must not be accessed for it to be deleted.
The moment in time must be specified either as an ISO-format string or as an integer number of milliseconds since the Unix epoch. The following two commands are equivalent:
yt set //home/project/path/table/@expiration_time '"2020-05-16 15:12:34.591+03:00"'
yt set //home/project/path/table/@expiration_time '1589631154591'
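To check that the two values above denote the same moment, the ISO-format string can be converted to milliseconds since the epoch. The following is a small Python sketch using only the standard library; the function name to_epoch_milliseconds is hypothetical and is not part of any yt tooling.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_epoch_milliseconds(iso_string):
    """Convert an ISO-format timestamp with a UTC offset to epoch milliseconds."""
    moment = datetime.fromisoformat(iso_string)
    if moment.tzinfo is None:
        raise ValueError("expected a timestamp with an explicit UTC offset")
    # Integer division of timedeltas avoids floating-point rounding.
    return (moment - EPOCH) // timedelta(milliseconds=1)


print(to_epoch_milliseconds("2020-05-16 15:12:34.591+03:00"))  # 1589631154591
```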
A time interval is specified in milliseconds:
# Delete a node if "left alone" for a week.
yt set //home/project/path/table/@expiration_timeout 604800000
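Because the interval is given in milliseconds, it is easy to miscount zeros. One way to derive the value, sketched here in Python with a hypothetical helper name, is to let timedelta do the arithmetic:

```python
from datetime import timedelta


def timeout_milliseconds(**duration):
    """Return a duration (timedelta keyword arguments) as whole milliseconds."""
    return timedelta(**duration) // timedelta(milliseconds=1)


print(timeout_milliseconds(weeks=1))  # 604800000
print(timeout_milliseconds(hours=12))  # 43200000
```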
Attention
You cannot restore data deleted using this mechanism. Use it with caution.
You can modify these attributes within transactions; however, only their committed values will take effect.
To set these attributes on a node, you need the write permission on the node itself (as for many other attributes) and the remove permission on the node and its entire subtree, because setting them effectively requests a deletion, albeit a deferred one. The write permission is sufficient to remove these attributes.
The system does not guarantee that the deletion will occur exactly at the requested time. In practice, the node is deleted within a few seconds of the specified moment.
A node is not automatically deleted if, at the specified moment, it is subject to locks other than snapshot locks. The system deletes the node once all such locks are released. You can use this property to extend a node's lifetime artificially.
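The interaction between the expiration moment and locks can be summarized as a simple predicate. The sketch below is a toy model, not the actual master server logic: the function name is hypothetical, and the lock mode names "shared" and "exclusive" are assumptions used for illustration (the text above names only snapshot).

```python
def is_due_for_removal(now_ms, expiration_time_ms, lock_modes):
    """Toy check: expired and not held by any lock other than snapshot locks."""
    if expiration_time_ms is None or now_ms < expiration_time_ms:
        return False
    # Snapshot locks do not postpone deletion; any other lock does.
    return all(mode == "snapshot" for mode in lock_modes)


print(is_due_for_removal(2000, 1000, []))             # True
print(is_due_for_removal(2000, 1000, ["snapshot"]))   # True
print(is_due_for_removal(2000, 1000, ["exclusive"]))  # False, waits for the lock
print(is_due_for_removal(500, 1000, []))              # False, not expired yet
```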
When you copy or move a node, expiration_time and expiration_timeout are reset by default, so the copy is not deleted automatically. The copy and move commands provide the preserve-expiration-time and preserve-expiration-timeout options, which change this behavior.
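The default reset and the effect of the two preserve options can be pictured with a toy function over a plain attribute dictionary. This is an illustrative model only: attributes_for_copy is a hypothetical name, its keyword arguments merely mirror the option names mentioned above, and real copies handle many more attributes.

```python
def attributes_for_copy(source_attributes,
                        preserve_expiration_time=False,
                        preserve_expiration_timeout=False):
    """Return the attributes a copy would carry in this toy model."""
    copied = dict(source_attributes)
    if not preserve_expiration_time:
        copied.pop("expiration_time", None)
    if not preserve_expiration_timeout:
        copied.pop("expiration_timeout", None)
    return copied


source = {"expiration_time": 1589631154591, "expiration_timeout": 604800000}
print(attributes_for_copy(source))
# {}
print(attributes_for_copy(source, preserve_expiration_time=True))
# {'expiration_time': 1589631154591}
```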
Attention
A number of API calls that create temporary tables set expiration_time or expiration_timeout on those tables so that they are purged automatically. Keep this in mind and do not store important data in such tables.
Deletion may occur earlier if the node is located in a subtree whose root has a smaller expiration_time or expiration_timeout value. To get the actual deletion time of a node, use the effective_expiration attribute:
$ yt get //home/project/path/table/@effective_expiration
{
    "time": {"value": 42, "path": "//testator/path"},
    "timeout": {"value": 42, "path": "//testator/path"}
}
If no node on the path from the root to this node has expiration_time (or the other relevant attribute) set, a YSON entity is written to the "time" field (or its counterpart) instead.
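The rule that a smaller value higher up in the tree wins can be sketched as a walk over the node and its ancestors. The Python model below is an assumption-laden illustration: the function names are hypothetical, paths are treated as plain slash-separated strings without escaping, the root itself is not considered, and None stands in for the YSON entity.

```python
def ancestors_and_self(path):
    """'//a/b/c' -> ['//a', '//a/b', '//a/b/c'] (no escaping handled)."""
    parts = path[2:].split("/")
    return ["//" + "/".join(parts[:i]) for i in range(1, len(parts) + 1)]


def effective_expiration_time(path, expiration_times):
    """Return the smallest expiration_time on the path and where it is set.

    expiration_times maps node paths to values; nodes without the
    attribute are simply missing from the mapping.
    """
    best = None
    for candidate in ancestors_and_self(path):
        value = expiration_times.get(candidate)
        if value is not None and (best is None or value < best["value"]):
            best = {"value": value, "path": candidate}
    return best


times = {"//home/project": 100, "//home/project/path/table": 500}
print(effective_expiration_time("//home/project/path/table", times))
# {'value': 100, 'path': '//home/project'}
print(effective_expiration_time("//tmp/scratch", times))
# None
```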