YTsaurus

YTsaurus (IPA: [waɪtiːsɔːrəs], pronounced wai-tee-saw-ruhs) is a distributed platform for storing and processing large amounts of data. It includes a MapReduce computation model, a distributed file system, and a NoSQL key-value store.

Overview

System overview: the purpose of YTsaurus and the key features of the platform.

Data storage

Storing data in YTsaurus: Cypress metadata storage, key system entities, static tables, transactions, data storage formats.

How to try?

Examples of basic actions with YTsaurus in the CLI and web interface.

Dynamic tables

NoSQL key-value database: transactions, query language, replicated dynamic tables.

API and reference

Commands and their parameters, SDK description, and sample code for platform interaction.

Data processing

Processing data with YTsaurus: the scheduler, the MapReduce paradigm, and supported operations.

  • YQL: A declarative SQL-like query language.
  • CHYT: A ClickHouse cluster running in YTsaurus.
  • SPYT: An Apache Spark cluster running in YTsaurus.
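
For readers new to the MapReduce paradigm mentioned above, the following is a minimal sketch of the idea in plain Python. It does not use the YTsaurus SDK or any YTsaurus API; the function names (map_phase, shuffle, reduce_phase, run_map_reduce) and the row layout with a "text" field are assumptions made purely for illustration. A real cluster distributes the same three stages across many machines.

```python
from collections import defaultdict
from itertools import chain


def map_phase(record):
    """Mapper: emit a (word, 1) pair for every word in one input row."""
    for word in record["text"].lower().split():
        yield word, 1


def shuffle(pairs):
    """Group mapper output by key, standing in for the sort/shuffle stage."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reducer: collapse all values for one key into a single output row."""
    return {"word": key, "count": sum(values)}


def run_map_reduce(rows):
    """Run map, shuffle, and reduce in sequence on an in-memory list of rows."""
    mapped = chain.from_iterable(map_phase(row) for row in rows)
    grouped = shuffle(mapped)
    # Sort keys so the output order is deterministic.
    return [reduce_phase(key, grouped[key]) for key in sorted(grouped)]


# Hypothetical input rows, loosely resembling rows of a table.
rows = [{"text": "big data big tables"}, {"text": "data processing"}]
result = run_map_reduce(rows)
for row in result:
    print(row)
```

The mapper and reducer here see only one row or one key at a time, which is the property that allows a scheduler to run them in parallel over separate chunks of data.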