YTsaurus

YTsaurus is a distributed storage and processing platform for large amounts of data. It includes a MapReduce computation model, a distributed file system, and a NoSQL key-value store.

Overview

System overview: the purpose of YTsaurus and the key features of the platform.

Data storage

Storing data in YTsaurus: Cypress metadata storage, key system entities, static tables, transactions, data storage formats.

How to try?

Step-by-step tutorial on how to quickly deploy a YTsaurus instance.

Dynamic tables

NoSQL key-value database: transactions, query language, replicated dynamic tables.

API and reference

Commands and their parameters, SDK description, and sample code for platform interaction.

Data processing

Processing data with YTsaurus: the scheduler, the MapReduce paradigm, and the supported operations.

  • YQL: A declarative SQL-like query language.
  • CHYT: A ClickHouse cluster running in YTsaurus.
  • SPYT: An Apache Spark cluster running in YTsaurus.
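
For readers new to the MapReduce paradigm mentioned above, here is a minimal sketch of the idea in plain Python. It does not use the YTsaurus SDK: the function names (`mapper`, `reducer`, `run_map_reduce`) and the row layout are assumptions made for illustration, and the whole job runs in memory in a single process rather than on a cluster.

```python
from collections import defaultdict


def mapper(row):
    # Map stage: emit one (word, 1) row for every word in the input row.
    for word in row["text"].split():
        yield {"word": word.lower(), "count": 1}


def reducer(key, rows):
    # Reduce stage: collapse all rows that share a key into a single row.
    yield {"word": key, "count": sum(r["count"] for r in rows)}


def run_map_reduce(rows, mapper, reducer, reduce_by):
    # Hypothetical in-memory driver: map, group by key, then reduce.
    mapped = [out for row in rows for out in mapper(row)]

    groups = defaultdict(list)
    for row in mapped:
        groups[row[reduce_by]].append(row)

    result = []
    for key in sorted(groups):  # sorted keys give a deterministic output order
        result.extend(reducer(key, groups[key]))
    return result


rows = [{"text": "to be or not to be"}, {"text": "To do"}]
word_counts = run_map_reduce(rows, mapper, reducer, reduce_by="word")
print(word_counts)
```

In a real deployment the input rows would come from tables and the stages would be distributed across the cluster by the scheduler; see the Data processing and API sections for the actual commands and SDK calls.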