YSON

This section contains information about YSON, a JSON-like data format.

The main differences between YSON and JSON are:

  1. Support for binary representation of scalar types (numbers, strings, and boolean types).
  2. Attributes: an arbitrary dict that can additionally be set on a literal of any type, including scalar ones.

Besides that, there are syntactic differences:

  1. A semicolon is used as a separator instead of a comma.
  2. In dicts, a key is separated from its value by an equals sign (=) instead of a colon.
  3. String literals need to be enclosed in quotes only when omitting them would create a parsing ambiguity.

The following set of scalar types is available:

  1. Strings (string).
  2. Signed and unsigned 64-bit integers (int64 and uint64).
  3. Double-precision floating-point numbers (double).
  4. Boolean (logical) type (boolean).
  5. A special entity type with only one value, a literal #.

Scalar types usually have both a textual and a binary representation.

There are two composite types:

  1. List (list).
  2. Dict (map).

Scalar types

Strings

There are three types of string tokens:

  1. Identifiers: strings that match the regular expression [A-Za-z_][A-Za-z0-9_.\-]*, that is, C identifiers extended with the - and . characters. An identifier denotes a string with identical content and is used primarily for brevity (no quotes are needed).

    Examples:

    • abc123;
    • _;
    • a-b.
  2. Text strings: C-escaped strings in double quotes.

    Examples:

    • "abc123";
    • "";
    • "quotation-mark: \", backslash: \\, tab: \t, unicode: \xEA".
  3. Binary strings: \x01 + length (protobuf sint32 wire format) + data (<length> bytes).
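
The identifier rule and the binary string layout above can be sketched in Python. This is a minimal illustration, not the YTsaurus implementation: the helper names are hypothetical, and the length is assumed to be written as a zigzag-encoded varint, which is how protobuf encodes sint32.

```python
import re

# Regular expression for identifiers, taken from the text above.
IDENTIFIER_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_.\-]*")


def is_identifier(s: str) -> bool:
    """Return True if s can be written in YSON text without quotes."""
    return IDENTIFIER_RE.fullmatch(s) is not None


def encode_varint(n: int) -> bytes:
    """Encode a non-negative integer as a protobuf base-128 varint."""
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)  # continuation bit: more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def encode_binary_string(data: bytes) -> bytes:
    """Marker byte 0x01, then the zigzag varint length, then the raw bytes."""
    zigzag_len = len(data) << 1  # zigzag of a non-negative n is 2 * n
    return b"\x01" + encode_varint(zigzag_len) + data


print(is_identifier("a-b"), is_identifier("38 parrots"))
print(encode_binary_string(b"abc"))
```

Because the length goes through zigzag encoding, a 3-byte string carries the length byte 0x06 rather than 0x03.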

Signed 64-bit integers (int64)

Two methods of writing:

  1. Text (0, 123, -123, +123).
  2. Binary: \x02 + value (protobuf sint64 wire format).
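
As a rough sketch of the binary form, the snippet below applies protobuf's sint64 rules (zigzag mapping followed by a base-128 varint) after the \x02 marker. The function names are hypothetical and chosen only for this illustration.

```python
def zigzag64(n: int) -> int:
    """Map a signed 64-bit integer to an unsigned one (protobuf sint64 zigzag)."""
    return ((n << 1) ^ (n >> 63)) & 0xFFFFFFFFFFFFFFFF


def encode_varint(n: int) -> bytes:
    """Encode a non-negative integer as a protobuf base-128 varint."""
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)  # continuation bit: more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def encode_int64(n: int) -> bytes:
    """Marker byte 0x02 followed by the zigzag varint of n."""
    if not -(1 << 63) <= n < (1 << 63):
        raise ValueError("value does not fit into int64")
    return b"\x02" + encode_varint(zigzag64(n))


for value in (0, -1, 1, 123, -123):
    print(value, encode_int64(value).hex())
```

Zigzag keeps small negative numbers short: -1 takes a single payload byte instead of the ten bytes a plain two's-complement varint would need.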

Unsigned 64-bit integers (uint64)

Two methods of writing:

  1. Text (10000000000000, 123u).
  2. Binary: \x06 + value (protobuf uint64 wire format).
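
The unsigned case can be sketched the same way. Here the value is written as a plain base-128 varint after the \x06 marker, with no zigzag step, as protobuf does for uint64. The function names are hypothetical.

```python
def encode_varint(n: int) -> bytes:
    """Encode a non-negative integer as a protobuf base-128 varint."""
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)  # continuation bit: more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def encode_uint64(n: int) -> bytes:
    """Marker byte 0x06 followed by the varint of n."""
    if not 0 <= n < (1 << 64):
        raise ValueError("value does not fit into uint64")
    return b"\x06" + encode_varint(n)


def uint64_text(n: int) -> str:
    """Textual form with the u suffix, as in 123u."""
    return f"{n}u"


print(uint64_text(123), encode_uint64(123).hex())
print(uint64_text(300), encode_uint64(300).hex())
```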

Floating-point numbers (double)

Two methods of writing:

  1. Text: 0.0, -1.0, 1e-9, 1.5E+9, 32E1.
  2. Binary: \x03 + protobuf double wire format.

Attention!

The textual representation of floating-point numbers involves rounding, so a value may change after a round trip of serialization and parsing. To store the exact value, use the binary representation.
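
The difference between the two forms can be illustrated with a short sketch. It assumes the binary payload is the 8-byte little-endian IEEE 754 value that protobuf uses for double, and it imitates a text writer that keeps 6 significant digits. That precision is an assumption made for this example only, not a statement about any particular YSON writer.

```python
import struct


def encode_double(x: float) -> bytes:
    """Marker byte 0x03 followed by 8 bytes of little-endian IEEE 754."""
    return b"\x03" + struct.pack("<d", x)


def decode_double(buf: bytes) -> float:
    """Inverse of encode_double for a single 9-byte token."""
    if len(buf) != 9 or buf[:1] != b"\x03":
        raise ValueError("not a binary YSON double")
    return struct.unpack("<d", buf[1:])[0]


x = 0.1 + 0.2

# Hypothetical text writer that keeps only 6 significant digits.
short_text = f"{x:.6g}"
text_round_trip_exact = float(short_text) == x

# The binary form keeps all 64 bits, so the round trip is exact.
binary_round_trip_exact = decode_double(encode_double(x)) == x

print(short_text, text_round_trip_exact, binary_round_trip_exact)
```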

Boolean literals (boolean)

Two methods of writing:

  1. Text (%false, %true).
  2. Binary (\x04, \x05).

Entity (entity)

Entity is an atomic scalar value with no content of its own. There are various scenarios in which this type can be useful. For example, entity often means null. In addition, when a get request is made to a Cypress subtree, files and tables are returned as entities (actual data is stored in the attributes of that node).

Lexically, entity is encoded by the # symbol.

Special literals

Special tokens:
;, =, #, [, ], {, }, <, >, ), /, @, !, +, ^, :, ,, ~.
Not all of these symbols are used in YSON; some are used in YPath.

Composite types

List (list)

Written as [value; ...; value], where value is a literal of any scalar or composite type.

Example: [1; "hello"; {a=1; b=2}].

Dict (map)

Written as {key = value; ...; key = value}, where key is a string literal and value is a literal of any scalar or composite type.

Example: {a = "hello"; "38 parrots" = [38]}.
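
To make the list and dict syntax concrete, here is a minimal sketch of a text writer for plain Python values. It is not the official client: to_yson_text and quote are hypothetical names, only a small subset of C escapes is handled, and floats and uint64 are left out. Keys are written without quotes when they match the identifier pattern, while string values are always quoted.

```python
import re

# Identifier pattern: keys that match it need no quotes.
IDENTIFIER_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_.\-]*")


def quote(s: str) -> str:
    """Double-quote a string, escaping a small subset of C escapes."""
    escaped = (
        s.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\t", "\\t")
    )
    return f'"{escaped}"'


def to_yson_text(value) -> str:
    """Serialize None, bool, int, str, list, and dict to YSON text."""
    if value is None:
        return "#"  # entity
    if isinstance(value, bool):  # checked before int: bool is an int subclass
        return "%true" if value else "%false"
    if isinstance(value, int):
        return str(value)
    if isinstance(value, str):
        return quote(value)
    if isinstance(value, list):
        return "[" + "; ".join(to_yson_text(v) for v in value) + "]"
    if isinstance(value, dict):
        items = []
        for key, item in value.items():
            key_text = key if IDENTIFIER_RE.fullmatch(key) else quote(key)
            items.append(f"{key_text} = {to_yson_text(item)}")
        return "{" + "; ".join(items) + "}"
    raise TypeError(f"unsupported type: {type(value).__name__}")


print(to_yson_text([1, "hello", {"a": 1, "b": 2}]))
print(to_yson_text({"a": "hello", "38 parrots": [38]}))
```

With these inputs the sketch reproduces the two examples given for lists and dicts.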

Attributes

Attributes can be set on any literal in YSON using the following format: <key = value; ...; key = value> value. Inside the angle brackets, the syntax is the same as for a dict. For example, <a = 10; b = [7; 7; 8]>"some-string" or <"44" = 44>44. Attributes are most often found on entity literals, for example, <id="aaad6921-b5704588-17990259-7b88bad3">#.
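
One way to model attributes in code is a wrapper that pairs a value with a dict, written out as an angle-bracket prefix. The sketch below is an illustration under stated assumptions: Attributed and dump are hypothetical names, not part of any YTsaurus client, and only a few scalar types are covered.

```python
import re
from dataclasses import dataclass, field

IDENTIFIER_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_.\-]*")


@dataclass
class Attributed:
    """Hypothetical wrapper: a value together with its YSON attributes."""
    value: object
    attributes: dict = field(default_factory=dict)


def quote(s: str) -> str:
    """Double-quote a string, escaping backslashes and quotes only."""
    return '"' + s.replace("\\", "\\\\").replace('"', '\\"') + '"'


def pairs(mapping: dict) -> str:
    """Render key = value pairs; shared by dicts and attribute blocks."""
    parts = []
    for key, item in mapping.items():
        key_text = key if IDENTIFIER_RE.fullmatch(key) else quote(key)
        parts.append(f"{key_text} = {dump(item)}")
    return "; ".join(parts)


def dump(node) -> str:
    """Serialize a value, optionally wrapped in Attributed, to YSON text."""
    if isinstance(node, Attributed):
        prefix = f"<{pairs(node.attributes)}>" if node.attributes else ""
        return prefix + dump(node.value)
    if node is None:
        return "#"  # entity
    if isinstance(node, bool):  # checked before int: bool is an int subclass
        return "%true" if node else "%false"
    if isinstance(node, int):
        return str(node)
    if isinstance(node, str):
        return quote(node)
    if isinstance(node, list):
        return "[" + "; ".join(dump(v) for v in node) + "]"
    if isinstance(node, dict):
        return "{" + pairs(node) + "}"
    raise TypeError(f"unsupported type: {type(node).__name__}")


print(dump(Attributed("some-string", {"a": 10, "b": [7, 7, 8]})))
print(dump(Attributed(None, {"id": "aaad6921-b5704588-17990259-7b88bad3"})))
```

Reusing pairs for both dicts and attribute blocks reflects the fact that the syntax inside angle brackets matches the dict syntax.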

Working with YSON from code

Users usually do not have to work with YSON directly. In the official YTsaurus clients, YSON structures are represented as follows:

  1. C++: TNode is a class that provides a dynamic DOM-like representation of a YSON document.
  2. Python: YsonType. YSON types mimic Python types. The YSON attributes of an object x are available as x.attributes, which is a Python dict.
  3. Java: YTreeNode is an interface that provides a dynamic DOM-like representation of a YSON document.