The main object types in YTsaurus are described below.
Object descriptions make up most of the metadata, which is stored in replicated form in the memory of the Cypress master servers. Cypress contains various system information as well as references to where user data is stored.
A Cypress node may have several versions, because its state can look different in the context of each transaction.
Files are Cypress nodes of the file type designed to store large binary data in the system.
Files consist of multiple chunks and are chunk owners.
Chunks are organized as a special tree-like data structure in which leaves are chunks and intermediate nodes are lists of chunks.
For more information, see Files.
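
The chunk tree described above can be sketched in a few lines of Python. This is only an illustrative model, not the YTsaurus implementation: the class names `Chunk` and `ChunkList`, their fields, and the helper `iter_chunks` are assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Union


@dataclass
class Chunk:
    """Leaf of the tree: one piece of stored data (hypothetical fields)."""
    chunk_id: str
    size: int


@dataclass
class ChunkList:
    """Intermediate node: an ordered list of chunks and nested chunk lists."""
    children: List[Union["Chunk", "ChunkList"]] = field(default_factory=list)


def iter_chunks(node: Union[Chunk, ChunkList]) -> Iterator[Chunk]:
    """Yield the leaf chunks of the tree in left-to-right order."""
    if isinstance(node, Chunk):
        yield node
    else:
        for child in node.children:
            yield from iter_chunks(child)


# A file (the chunk owner) would point at the root of such a tree.
root = ChunkList([
    ChunkList([Chunk("c1", 100), Chunk("c2", 200)]),
    Chunk("c3", 50),
])
chunk_ids = [chunk.chunk_id for chunk in iter_chunks(root)]
total_size = sum(chunk.size for chunk in iter_chunks(root))
```

Walking the leaves in order recovers the full content of the file, while intermediate lists allow whole groups of chunks to be referenced as a unit.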
Tables are nodes of the table type designed to store large volumes of user data. There are two types of tables: static and dynamic. Logically, a table is a sequence of rows, and each row consists of columns.
Column values are typed and can contain arbitrary structured data supported by the YSON language.
Links are nodes of the link type that refer to another object by the path specified when the link is created.
All accesses to a link node are automatically redirected to the target object.
For more information, see Links.
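
The redirection behavior can be illustrated with a toy path resolver. This is a minimal sketch under stated assumptions, not how Cypress resolves paths internally: the nested-dict tree, the `$link` marker key, the depth limit, and the function `resolve` are all hypothetical.

```python
from typing import Any, Dict

LINK_KEY = "$link"      # hypothetical marker for a link node in this toy model
MAX_LINK_DEPTH = 16     # hypothetical guard against cyclic links

# Toy tree: map nodes are dicts; a link node stores its target path.
tree: Dict[str, Any] = {
    "home": {
        "data": {"value": 42},
        "shortcut": {LINK_KEY: "/home/data"},
    }
}


def resolve(root: Dict[str, Any], path: str, depth: int = 0) -> Any:
    """Walk the path from the root, following link nodes to their targets."""
    if depth > MAX_LINK_DEPTH:
        raise RuntimeError("too many link redirections")
    node: Any = root
    for key in filter(None, path.split("/")):
        node = node[key]
        # Accessing a link is transparently redirected to the target object.
        if isinstance(node, dict) and LINK_KEY in node:
            node = resolve(root, node[LINK_KEY], depth + 1)
    return node


via_link = resolve(tree, "/home/shortcut/value")
direct = resolve(tree, "/home/data/value")
```

Both lookups return the same value, because the access through `shortcut` is redirected to the node at `/home/data`.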
A YSON document is a Cypress node of the document type designed to store arbitrary YSON structures.
Reads and modifications within the document are supported. Addressing within the document is performed using the YPath language.
For more information, see YSON documents.
For more information about the YSON format, see YSON.
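
To make addressing within a document more concrete, here is a simplified Python sketch. It models only slash-separated keys and list indexes, which is a small subset of what YPath supports; the helpers `get_value` and `set_value` and the sample document are hypothetical and are not part of any YTsaurus API.

```python
from typing import Any, List


def _tokens(path: str) -> List[str]:
    """Split a path such as "/limits/cpu" into its tokens."""
    if not path.startswith("/"):
        raise ValueError("path must start with '/'")
    return path.split("/")[1:]


def _step(node: Any, token: str) -> Any:
    """Descend one level: list elements by index, map entries by key."""
    if isinstance(node, list):
        return node[int(token)]
    return node[token]


def get_value(doc: Any, path: str) -> Any:
    """Read the value stored at the given path inside the document."""
    node = doc
    for token in _tokens(path):
        node = _step(node, token)
    return node


def set_value(doc: Any, path: str, value: Any) -> None:
    """Replace the value stored at the given path inside the document."""
    *parents, last = _tokens(path)
    node = doc
    for token in parents:
        node = _step(node, token)
    if isinstance(node, list):
        node[int(last)] = value
    else:
        node[last] = value


document = {"owner": "alice", "limits": {"cpu": 4}, "tags": ["a", "b"]}
cpu_before = get_value(document, "/limits/cpu")
set_value(document, "/limits/cpu", 8)
cpu_after = get_value(document, "/limits/cpu")
second_tag = get_value(document, "/tags/1")
```

The point of the sketch is that a single path both selects a nested value for reading and identifies the place to modify, without rewriting the whole document.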
A Cypress node can contain a value of a primitive type: string, number, Boolean, dict, or list. The table below describes the primitive types of Cypress nodes.
| Type | Description |
| --- | --- |
| `string_node` | A Cypress node containing a string |
| `int64_node` | A Cypress node containing a signed integer |
| `uint64_node` | A Cypress node containing an unsigned integer |
| `double_node` | A Cypress node containing a real number |
| `map_node` | A dict in Cypress (keys are strings, values are other nodes). By analogy with a file system, this is a folder |
| `list_node` | An ordered list in Cypress (values are other nodes) |
| `boolean_node` | A Cypress node containing a Boolean value |
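
As a rough illustration of how in-memory values correspond to primitive node types, the sketch below classifies Python values. It assumes the node type names `string_node`, `int64_node`, `uint64_node`, `double_node`, `map_node`, `list_node`, and `boolean_node`. The rule for choosing between signed and unsigned integers is a convention of this example only, since Python does not distinguish the two.

```python
from typing import Any

INT64_MIN = -(2 ** 63)
INT64_MAX = 2 ** 63 - 1
UINT64_MAX = 2 ** 64 - 1


def node_type(value: Any) -> str:
    """Return the primitive node type name that could hold the value."""
    # bool is checked first: in Python, bool is a subclass of int.
    if isinstance(value, bool):
        return "boolean_node"
    if isinstance(value, str):
        return "string_node"
    if isinstance(value, int):
        # Example convention: prefer signed, fall back to unsigned if needed.
        if INT64_MIN <= value <= INT64_MAX:
            return "int64_node"
        if value <= UINT64_MAX:
            return "uint64_node"
        raise ValueError("integer does not fit into 64 bits")
    if isinstance(value, float):
        return "double_node"
    if isinstance(value, dict):
        return "map_node"
    if isinstance(value, list):
        return "list_node"
    raise TypeError(f"unsupported value type: {type(value).__name__}")


types = [node_type(v) for v in ("text", -5, 2 ** 63, 1.5, {}, [], True)]
```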
In addition to the objects listed above, there are internal objects: transactions, chunks, accounts, users, and groups. Such objects are not subject to versioning.
The internal objects listed in the table:
Each object has a unique ID: a 128-bit number whose format matches that of a GUID. This number can be represented as four 32-bit numbers:
a-b-c-d. When an ID is written out, the components a, b, c, and
d are usually given in hexadecimal form. Each of these components has its own meaning and is described in the table.
| Component | Description |
| --- | --- |
| `a` | The number of the epoch in which the object was created |
| `b` | The mutation number within the epoch in which the object was created |
| `c` | The master group ID (cell id), a unique 16-bit number that unambiguously identifies the cluster. This ID can be found in the cluster web interface, next to the master clusters |
| `d` | The hash, a pseudo-unique random number selected at the time of object creation |
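
The textual form of an ID can be split into its four components with a short Python sketch. The ID string used here is made up for illustration and does not come from a real cluster. The sketch also assumes that the components appear in the same order as the table rows (epoch, mutation number, cell component, hash); the names `ObjectId`, `parse_object_id`, and `format_object_id` are hypothetical.

```python
from typing import NamedTuple


class ObjectId(NamedTuple):
    epoch: int       # first component (assumed order)
    mutation: int    # second component
    cell: int        # third component, which carries the cell id
    hash_value: int  # fourth component


def parse_object_id(text: str) -> ObjectId:
    """Parse "a-b-c-d", where each part is a 32-bit hexadecimal number."""
    parts = text.split("-")
    if len(parts) != 4:
        raise ValueError("expected four dash-separated components")
    values = [int(part, 16) for part in parts]
    if any(value >= 2 ** 32 for value in values):
        raise ValueError("each component must fit into 32 bits")
    return ObjectId(*values)


def format_object_id(object_id: ObjectId) -> str:
    """Render the components back into hexadecimal text form."""
    return "-".join(format(value, "x") for value in object_id)


sample = "1a-2b3c-3e80191-deadbeef"  # hypothetical ID, for illustration only
parsed = parse_object_id(sample)
round_trip = format_object_id(parsed)
```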
Thus, the object ID depends on the cluster on which the object was created, as well as on the point in time (in terms of mutation numbers) when the object was created. More than one object can be created within a single mutation. They will differ in hash value. This difference is guaranteed by the hash generation mechanism. ID example:
All system objects have the attributes listed in the table:
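| Attribute | Description |
| --- | --- |
| `ref_counter` | The number of strong links to the object |
| `supported_permissions` | List of supported access permissions |
| `effective_acl` | Effective object ACL |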
Some system objects have additional attributes, including owner, that affect access to the object. Together, these attributes are called an access control descriptor (ACD). For more information about the ACD, see Managing access.
Reference counters are used to track the lifetime of an object. An object lives in the system as long as there are strong links to it. The number of strong links can be found in the ref_counter attribute. As soon as this number reaches zero, the object turns into a zombie and is subject to removal. It is not removed instantly: there is a special removal queue (GC queue) from which zombies are taken in batches of a bounded size and destroyed. The batch size is set by internal system configuration parameters. Once an object has become a zombie, you can no longer access it, even by explicitly building a path by ID.
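
The lifecycle described above (alive, zombie, destroyed) can be modeled with a small Python sketch. It is a simplified illustration rather than the actual master server logic: the class `ObjectManager`, its methods, and the `gc_batch_size` parameter are assumptions standing in for the internal configuration.

```python
from collections import deque
from typing import Deque, Dict, List


class ObjectManager:
    """Toy model of reference counting with a deferred removal queue."""

    def __init__(self, gc_batch_size: int) -> None:
        self.ref_counters: Dict[str, int] = {}  # alive objects only
        self.gc_queue: Deque[str] = deque()     # zombies awaiting removal
        self.gc_batch_size = gc_batch_size      # hypothetical config value

    def create(self, object_id: str) -> None:
        """Create an object that starts with one strong link."""
        self.ref_counters[object_id] = 1

    def ref(self, object_id: str) -> None:
        """Add a strong link to an alive object."""
        self.ref_counters[object_id] += 1

    def unref(self, object_id: str) -> None:
        """Drop a strong link; at zero the object becomes a zombie."""
        self.ref_counters[object_id] -= 1
        if self.ref_counters[object_id] == 0:
            del self.ref_counters[object_id]
            self.gc_queue.append(object_id)

    def is_accessible(self, object_id: str) -> bool:
        """Zombies and destroyed objects can no longer be accessed."""
        return object_id in self.ref_counters

    def gc_step(self) -> List[str]:
        """Destroy at most gc_batch_size zombies and return their IDs."""
        destroyed: List[str] = []
        while self.gc_queue and len(destroyed) < self.gc_batch_size:
            destroyed.append(self.gc_queue.popleft())
        return destroyed


manager = ObjectManager(gc_batch_size=2)
for name in ("a", "b", "c"):
    manager.create(name)
    manager.unref(name)          # the last strong link is dropped
accessible = [manager.is_accessible(name) for name in ("a", "b", "c")]
first_batch = manager.gc_step()
second_batch = manager.gc_step()
```

Separating the transition to the zombie state from the actual destruction keeps each removal pass bounded in size, at the cost of zombies lingering in the queue for a while.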