PRAGMA

Definition

Redefinition of settings.

Syntax

PRAGMA x.y = "z"; or PRAGMA x.y("z", "z2", "z3");:

  • x: (optional) The category of the setting.
  • y: The name of the setting.
  • z: (optional for flags) The value of the setting. The following suffixes are acceptable:
    • Kb, Mb, Gb: For the data amounts.
    • sec, min, h, d: For the time values.

Examples

PRAGMA AutoCommit;
PRAGMA TablePathPrefix = "home/yql";
PRAGMA Warning("disable", "1101");

With some exceptions, you can return the settings values to their default states using PRAGMA my_pragma = default;.
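
As a minimal sketch of resetting a setting, the fragment below uses yt.DefaultMemoryLimit, a dynamic pragma described later in this document. The table names are hypothetical, and a default cluster is assumed to be selected already.

```sql
-- Raise the job memory limit for one heavy step (the value uses the G suffix).
PRAGMA yt.DefaultMemoryLimit = "2G";
INSERT INTO `home/yql/heavy_result`      -- hypothetical output table
SELECT * FROM `home/yql/heavy_input`;    -- hypothetical input table

-- Return the setting to its default state for the rest of the query.
PRAGMA yt.DefaultMemoryLimit = default;
SELECT COUNT(*) FROM `home/yql/other_input`;  -- hypothetical table
```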

For the complete list of available settings, see the table below.

Scope

Unless otherwise specified, a pragma affects all the subsequent expressions up to the end of the module where it's used.
If necessary and logically possible, you can change the value of this setting several times within a given query to make it different at different execution steps.
There are also special scoped pragmas with the scope defined by the same rules as the scope of named expressions.
Unlike scoped pragmas, regular pragmas can only be used in the global scope of visibility (not inside lambda functions, ACTION, SUBQUERY, etc.).

Global

AutoCommit

Value type By default
Flag false

Automatically perform COMMIT after each expression.

TablePathPrefix

Value type By default
String

Add the specified prefix to the cluster table paths. It works on the same principle as merging paths in a file system: it supports references to the parent directory (..) and doesn't require a trailing slash. For example:

PRAGMA TablePathPrefix = "home/yql"; SELECT * FROM test;

The prefix is not added if the table name is an absolute path (starts with /).
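
The rules above can be sketched as follows. The table names are hypothetical, and the resolved paths in the comments are what the file-system analogy suggests, not verified output.

```sql
PRAGMA TablePathPrefix = "home/yql";

SELECT * FROM test;                 -- relative name: expected to read home/yql/test
SELECT * FROM `../logs/events`;     -- ".." steps up one level: expected home/logs/events
SELECT * FROM `//home/other/test`;  -- starts with "/": the prefix is not added
```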

UseTablePrefixForEach

Value type By default
Flag false

EACH uses TablePathPrefix for each list item.
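
A short sketch of how the flag might be combined with TablePathPrefix. The table names are hypothetical, and the resolved paths in the comment are an expectation based on the description above.

```sql
PRAGMA TablePathPrefix = "home/yql";
PRAGMA UseTablePrefixForEach;

-- With the flag set, each list item is expected to be resolved
-- as home/yql/events_1 and home/yql/events_2.
$tables = AsList("events_1", "events_2");
SELECT * FROM EACH($tables);
```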

Warning

Value type By default
1. Action
2. Warning code or "*"

Action:

  • disable: Disable.
  • error: Treat as an error.
  • default: Revert to the default behavior.

The warning code is returned with the text itself (it's displayed on the right side of the web interface).

Example:
PRAGMA Warning("error", "*");
PRAGMA Warning("disable", "1101");
PRAGMA Warning("default", "4503");

In this case, all warnings are treated as errors except warning 1101 (which is disabled) and warning 4503 (which is processed by default, that is, remains a warning). Since new warnings may be added in new YQL releases, use PRAGMA Warning("error", "*"); with caution (at least cover such queries with automated tests).

List of warning and error codes

Greetings

Value type By default
Text

Issue the specified text as the query's Info message.

Example:
PRAGMA Greetings("It's a good day!");

WarningMsg

Value type By default
Text

Issue the specified text as the query's Warning message.

Example:
PRAGMA WarningMsg("Attention!");

DqEngine

Value type By default
disable/auto/force string "auto"

When set to "auto", it enables a new compute engine. Computing is made, whenever possible, without creating map/reduce operations. The "force" value unconditionally routes calculations to the new engine.

SimpleColumns

SimpleColumns / DisableSimpleColumns

Value type By default
Flag true

When you use SELECT foo.* FROM ... AS foo, remove the foo. prefix from the names of the resulting columns.

It also works for JOIN, but in this case the query may fail if there's a name conflict (which can be resolved through WITHOUT and renaming columns). For JOIN in SimpleColumns mode, an implicit Coalesce is made for key columns: the query SELECT * FROM T1 AS a JOIN T2 AS b USING(key) in SimpleColumns mode works the same as SELECT a.key ?? b.key AS key, ... FROM T1 AS a JOIN T2 AS b USING(key).

CoalesceJoinKeysOnQualifiedAll

CoalesceJoinKeysOnQualifiedAll / DisableCoalesceJoinKeysOnQualifiedAll

Value type By default
Flag true

Controls implicit Coalesce for the JOIN key columns in SimpleColumns mode. If the flag is set, Coalesce is made for key columns if SELECT contains at least one expression in the format foo.* or *: for example, SELECT a.* FROM T1 AS a JOIN T2 AS b USING(key). If the flag is not set, Coalesce for JOIN keys is made only if there is an asterisk (*) after SELECT.

StrictJoinKeyTypes

StrictJoinKeyTypes / DisableStrictJoinKeyTypes

Value type By default
Flag false

If the flag is set, JOIN will require a strict match of key types.
By default, JOIN preconverts keys to a shared type, which might result in performance degradation.
StrictJoinKeyTypes is a scoped setting.

AnsiInForEmptyOrNullableItemsCollections

Value type By default
Flag false

This pragma brings the behavior of the IN operator into accordance with the standard when there's NULL on the left or right side of IN. The behavior of IN when the right side is a Tuple with elements of different types has also changed. Examples:

1 IN (2, 3, NULL) = NULL (was Just(False))
NULL IN () = Just(False) (was NULL)
(1, null) IN ((2, 2), (3, 3)) = Just(False) (was NULL)
2147483648u IN (1, 2147483648u) = True (was False)

For more information about the IN behavior when operands include NULLs, see the description of the IN operator. You can explicitly select the old behavior by specifying the pragma DisableAnsiInForEmptyOrNullableItemsCollections. If no pragma is set, a warning is issued and the old behavior is used.

AnsiRankForNullableKeys

Value type By default
Flag false

Aligns the RANK/DENSE_RANK behavior with the standard if there are optional types in the window sort keys or in the argument of such window functions. It means that:

  • The result type is always Uint64 rather than Uint64?;
  • NULLs in keys are treated as equal to each other (the current implementation returns NULL).

You can explicitly select the old behavior by using the DisableAnsiRankForNullableKeys pragma. If no pragma is set, a warning is issued and the old behavior is used.

AnsiCurrentRow

Value type By default
Flag false

If ORDER BY is present, the implicit window frame specification is brought into conformity with the standard.
If AnsiCurrentRow is not set, the (ORDER BY key) window is equivalent to (ORDER BY key ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW).
The standard requires that such window behaves as (ORDER BY key RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW).
The difference is in how CURRENT ROW is interpreted. In ROWS mode, CURRENT ROW is interpreted literally: the current row in the partition.
In RANGE mode, a frame ending at CURRENT ROW ends at "the last row in the partition with the sorting key equal to the current row".
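
The difference can be illustrated with a running sum over rows that share a sort key. This is a sketch on three hypothetical in-memory rows; the values in the comments follow from the ROWS and RANGE definitions above rather than from a recorded run.

```sql
PRAGMA AnsiCurrentRow;

-- Three hypothetical rows; the first two share the same sort key.
$rows = AsList(
    AsStruct(1 AS key, 10 AS value),
    AsStruct(1 AS key, 20 AS value),
    AsStruct(2 AS key, 30 AS value)
);

SELECT
    key,
    value,
    SUM(value) OVER w AS running_sum
FROM AS_TABLE($rows)
WINDOW w AS (ORDER BY key);
-- RANGE semantics (pragma set): both key = 1 rows see their whole peer group,
-- so running_sum is 30, 30, 60.
-- ROWS semantics (pragma not set): the frame ends at the row itself, so the
-- key = 1 rows get 10 and 30 (or 20 and 30, depending on tie order), then 60.
```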

DisableAnsiOptionalAs

Value type By default
Flag false

With this pragma, AS becomes mandatory for column aliases, so a query where fields are not separated by commas raises a syntax error.
The following query:

SELECT
    field1 -- no "," here
    field2
FROM (
    select 1 AS field1
);

raises the error "Expecting mandatory AS here. Did you miss comma?".

OrderedColumns

OrderedColumns / DisableOrderedColumns

Infer the column order for SELECT/JOIN/UNION ALL and preserve it when writing the results. The order of columns is undefined by default.

PositionalUnionAll

Enable the standard positional (column-by-column) execution of UNION ALL. Column ordering is enabled automatically in this case.
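
A minimal sketch of positional matching. It assumes that, without the pragma, UNION ALL matches columns by name; the comment describes the expected result, not a recorded one.

```sql
PRAGMA PositionalUnionAll;

SELECT 1 AS x, 2 AS y
UNION ALL
SELECT 3 AS y, 4 AS x;
-- Positional matching: the second row is expected to carry 3 in the first
-- column and 4 in the second, regardless of the aliases in the second SELECT.
```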

RegexUseRe2

Value type By default
Flag false

Use Re2 UDF instead of Pcre for executing the REGEX, MATCH, and RLIKE SQL operators. Re2 UDF correctly processes Unicode characters, unlike Pcre UDF, which is used by default.

ClassicDivision

Value type By default
Flag true

In the classical version, the result of integer division remains integer (by default).
If disabled, the result is always Double.
ClassicDivision is a scoped setting.
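
A short sketch of both modes. It assumes the flag can be turned off with an explicit "false" value, in the same way as the yson.Strict flag described in this document.

```sql
SELECT 7 / 2;                      -- classic mode (default): integer result, 3

PRAGMA ClassicDivision = "false";  -- assumption: the flag accepts an explicit "false"
SELECT 7 / 2;                      -- non-classic mode: Double result, 3.5
```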

CheckedOps

Value type By default
Flag false

When this mode is enabled, NULL is returned if an integer value goes beyond the limits of the target argument or result type in SUM/SUM_IF, in the binary operations +, -, *, /, %, or in the unary operation -.
If disabled, overflow is not checked.
Has no effect on floating point or Decimal numbers.
CheckedOps is a scoped setting.
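
As an illustration, the sketch below triggers an overflow on 32-bit integer literals. It assumes the flag can be enabled with an explicit "true" value; the comment states the result expected from the description above.

```sql
PRAGMA CheckedOps = "true";  -- assumption: the flag accepts an explicit "true"

-- Both operands fit into Int32, but the sum does not:
-- in checked mode the expression is expected to return NULL.
SELECT 2147483647 + 1;
```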

AllowDotInAlias

Value type By default
Flag false

Enable dot in names of result columns. This behavior is disabled by default, since the further use of such columns in JOIN is not fully implemented.

WarnUnnamedColumns

Value type By default
Flag false

Generate a warning if a column name was automatically generated for an unnamed expression in SELECT (in the format column[0-9]+).

GroupByLimit

Value type By default
Positive number 32

Increasing the limit on the number of dimensions in GROUP BY.

GroupByCubeLimit

Value type By default
Positive number 5

Increasing the limit on the number of dimensions in GROUP BY.

Use this option with care, because the computational complexity of the query grows exponentially with the number of dimensions.

Yson

Management of Yson UDF default behavior. To learn more, see documentation, in particular, Yson::Options.

yson.AutoConvert

Value type By default
Flag false

Automatic conversion of values to the required data type in all Yson UDF calls, including implicit calls.

yson.Strict

Value type By default
Flag true

Strict mode control in all Yson UDF calls, including implicit calls. If the value is omitted or is "true", it enables the strict mode. If the value is "false", it disables the strict mode.

yson.DisableStrict

Value type By default
Flag false

An inverted version of yson.Strict. If the value is omitted or is "true", it disables the strict mode. If the value is "false", it enables the strict mode.

Working with files

File

Value type By default Static/
dynamic
Two or three string arguments — alias, URL, and optional token name Static

Attach a file to the query by URL. To use attached files, call the built-in functions FilePath and FileContent. This PRAGMA is a universal alternative to attaching files using built-in mechanisms of web or console clients.

YQL reserves the right to cache files at the URL for an indefinite period. Hence, if the content behind the URL changes significantly, we strongly recommend modifying the URL by adding or editing dummy parameters.

If a token name is specified, the token's value is used to access the target system.
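
A minimal sketch of attaching a file and reading it. The URL and the alias are hypothetical; the "v" query parameter is only a dummy value of the kind recommended above for bypassing a stale cached copy.

```sql
-- Hypothetical URL; bump the dummy "v" parameter when the remote content changes.
PRAGMA File("settings.json", "https://example.com/settings.json?v=2");

SELECT
    FilePath("settings.json") AS local_path,   -- path to the attached file
    FileContent("settings.json") AS content;   -- contents of the attached file
```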

Folder

Value type By default Static/
dynamic
Two or three string arguments — prefix, URL, and optional token name Static

Attach a set of files to the query by URL. It works the same way as adding a set of files using PRAGMA File with direct links to the files, where each alias is obtained by joining the prefix and the file name with /.

When specifying the token name, its value will be used to access the target system.

Library

Value type By default Static/
dynamic
One or two arguments: the file name and an optional URL Static

Treat the specified attached file as a library from which you can do IMPORT. The syntax type for the library is determined from the file extension:

  • .sql: For the YQL dialect of SQL (recommended).
  • .yql: For s-expressions.

Example with a file attached to the query:

PRAGMA library("a.sql");
IMPORT a SYMBOLS $x;
SELECT $x;

If the URL is specified, the library is downloaded from the URL rather than from the pre-attached file as in the following example:

PRAGMA library("a.sql","https://paste.yandex-team.ru/5618566/text");
IMPORT a SYMBOLS $x;
SELECT $x;

In this case, you can use text parameter value substitution in the URL:

DECLARE $_ver AS STRING; -- "5618566"
PRAGMA library("a.sql","https://paste.yandex-team.ru/{$_ver}/text");
IMPORT a SYMBOLS $x;
SELECT $x;

YTsaurus

YTsaurus pragmas are either static or dynamic, depending on their lifetime. Static pragmas are initialized once, at the earliest query processing step. If a static pragma is specified multiple times in a query, the latest value is used. Dynamic pragma values are initialized at the query execution step, after optimization and execution plan construction. The specified value is valid until the next identical pragma is found or until the query is completed. For dynamic pragmas only, you can reset the value to the default by assigning default. All pragmas that affect query optimizers are static because dynamic pragma values haven't yet been calculated at that step.
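
The two lifetimes can be sketched with yt.TmpFolder (static) and yt.Pool (dynamic), both described later in this document. The folder, pool, and table names are hypothetical.

```sql
-- Static pragma set twice: the latest value is used for the whole query.
PRAGMA yt.TmpFolder = "//tmp/yql/a";
PRAGMA yt.TmpFolder = "//tmp/yql/b";

-- Dynamic pragma: each value is valid until the next identical pragma.
PRAGMA yt.Pool = "pool_a";
SELECT COUNT(*) FROM `home/yql/t1`;   -- expected to run in pool_a

PRAGMA yt.Pool = "pool_b";
SELECT COUNT(*) FROM `home/yql/t2`;   -- expected to run in pool_b

PRAGMA yt.Pool = default;             -- reset is available for dynamic pragmas only
SELECT COUNT(*) FROM `home/yql/t3`;
```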


UDF

Value type By default Static/
dynamic
String Static
String — prefix name appended to all modules "" Static

Import all UDFs from the shared library (.so) compiled for Linux x64 that is attached to the query.
If a prefix is set, it's prepended to the names of all loaded modules, e.g. CustomPrefixIp::IsIPv4 instead of Ip::IsIPv4.
Setting the prefix lets you load different versions of the same UDF.

yt.InferSchema / yt.ForceInferSchema

Value type By default Static/
dynamic
A number from 1 to 1,000 Static

Infer the data schema from the contents of the table's first rows. If the PRAGMA is specified without a value, only the first row is used. If multiple rows are specified and the data types of a column differ, the type is extended to Yson.

InferSchema enables schema inference only for tables whose metadata contains no schema at all. With ForceInferSchema, the schema from metadata is ignored except for the list of key columns for sorted tables.

In addition to the detected columns, a dictionary column _other (string to string) is generated, which contains values for the columns that weren't in the first row but were found elsewhere. This lets you use WeakField for such tables.

Due to a wide range of issues that may arise, this mode isn't recommended for use and is disabled by default.
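
If the mode is used anyway, a query might look like the sketch below. The table and column names are hypothetical: user_id is assumed to be present in the sampled rows, while referer may be missing from them and is therefore read through WeakField.

```sql
-- Infer the schema from the first 100 rows (the allowed range is 1 to 1,000).
PRAGMA yt.InferSchema = "100";

SELECT
    user_id,                                -- assumed to be found in the sampled rows
    WeakField(referer, String) AS referer   -- may be present only in later rows
FROM `home/yql/raw_events`;                 -- hypothetical table without a schema
```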

yt.InferSchemaTableCountThreshold

Value type By default Static/
dynamic
Positive number 50 Static

If the number of tables whose schema is inferred from their contents exceeds the specified value, schema inference is run as a separate operation on YTsaurus, which may be much faster.

yt.IgnoreWeakSchema

Value type By default Static/
dynamic
Flag false Static

Ignore the table's weak schema (produced by sorting a non-schematized table based on a set of fields).

Together with yt.InferSchema, this lets you infer data-based schemas for such tables.

yt.IgnoreYamrDsv

Value type By default Static/
dynamic
Flag false Static

Ignore _format=yamred_dsv if it's specified in the input table's metadata.

yt.IgnoreTypeV3

Value type By default Static/
dynamic
Flag false Static

When reading tables with type_v3 schema, all fields containing complex types will be displayed as Yson fields in the query. Complex types include all non-data types and data types with more than one level of optionality.

yt.StaticPool

Value type By default Static/
dynamic
String Current user login Static

Selecting a computing pool in the scheduler for operations performed at the optimization step.

yt.Pool

Value type By default Static/
dynamic
String yt.StaticPool if set, or the current user login Dynamic

Selecting a computing pool in the scheduler for regular query operations.

yt.Owners

Value type By default Static/
dynamic
A string containing the list of logins separated by any of these symbols: comma, semicolon, space or ` `

Lets you grant management permissions for operations created by MapReduce in YTsaurus (cancel, pause, run-job-shell, etc.) to any users other than the YQL operation owner.

yt.OperationReaders

Value type By default Static/
dynamic
A string containing the list of logins separated by any of these symbols: comma, semicolon, space or ` `

Lets you grant read permissions for operations created by MapReduce in YTsaurus to any users other than the YQL operation owner.

yt.Auth

Value type By default Static/
dynamic
String Static

Use authentication data other than the default data.

yt.DefaultMaxJobFails

Value type By default Static/
dynamic
Positive number 5 Static

The number of failed MapReduce jobs after which retries are stopped and the query is considered failed.

yt.DefaultMemoryLimit

Value type By default Static/
dynamic
Bytes 512M Dynamic

Memory limit (in bytes) that is requested for jobs when launching MapReduce operations.

You can use K, M, and G suffixes to specify values in kilobytes, megabytes, and gigabytes, respectively.

yt.DataSizePerJob / yt.DataSizePerSortJob / yt.DataSizePerMapJob

Value type By default Static/
dynamic
Bytes 1G Dynamic

Manages the splitting of MapReduce operations into jobs: the larger the value, the fewer jobs. Use a lower value for computing-intensive jobs. Use a higher value for jobs that scan through a large amount of data (for example, user_sessions).

You can use K, M, and G suffixes to specify values in kilobytes, megabytes, and gigabytes, respectively.

yt.DataSizePerPartition

Value type By default Static/
dynamic
Bytes Dynamic

Management of partition sizes in MapReduce operations.

You can use K, M, and G suffixes to specify values in kilobytes, megabytes, and gigabytes, respectively.

yt.MaxJobCount

Value type By default Static/
dynamic
Positive number, less than 100 thousand. 16384 Dynamic

Maximum number of jobs within a single MapReduce operation. It is applied after DataSizePerJob: if splitting by size results in too many jobs, the number of jobs actually run is equal to MaxJobCount.
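
The interaction between the two settings can be sketched with rough arithmetic. The input sizes and the table name are hypothetical, and the estimate ignores everything except the data size.

```sql
PRAGMA yt.DataSizePerJob = "1G";
PRAGMA yt.MaxJobCount = "16384";

-- Rough estimate only:
--   1 TB input  / 1G per job = about 1,024 jobs   (below the cap)
--   64 TB input / 1G per job = about 65,536 jobs  (capped at 16,384)
SELECT COUNT(*) FROM `home/yql/large_input`;  -- hypothetical table
```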

yt.UserSlots

Value type By default Static/
dynamic
Positive number No limits Dynamic

Upper limit on the number of concurrent jobs within a MapReduce operation.

yt.DefaultOperationWeight

Value type By default Static/
dynamic
Floating-point number 1.0 Dynamic

Weight of all launched MapReduce operations in a selected computing pool.

yt.TmpFolder

Value type By default Static/
dynamic
String //tmp/yql/<login> Static

Directory for storing temporary tables and files.

yt.TablesTmpFolder

Value type By default Static/
dynamic
String //tmp/yql/<login> Static

Directory for storing temporary tables. Takes priority over yt.TmpFolder.

yt.TempTablesTtl

Value type By default Static/
dynamic
Time interval supporting s/m/h/d suffixes Static

Allows management of TTL for temporary tables. Effective for tables containing a full result, while the other temporary tables are unconditionally removed upon completion of the query regardless of this pragma.

yt.FileCacheTtl

Value type By default Static/
dynamic
Time interval supporting s/m/h/d suffixes 7d Static

Allows management of TTL for the YTsaurus file cache. A value of 0 disables TTL for the file cache.

yt.IntermediateAccount

Value type By default Static/
dynamic
Account name in YTsaurus intermediate Dynamic

Allows use of your account for intermediate data in a unified MapReduce operation.

The common account, which can overflow at an unfortunate time, is the default.

If PRAGMA yt.TmpFolder is set, then instead of the common account you can use the one specified in the temporary directory.

yt.IntermediateReplicationFactor

Value type By default Static/
dynamic
A number from 1 to 10 Dynamic

Intermediate data replication factor.

yt.PublishedReplicationFactor / yt.TemporaryReplicationFactor

Value type By default Static/
dynamic
A number from 1 to 10 Dynamic

Replication factor for tables created through YQL.

Tables specified in INSERT INTO are Published. All other tables are Temporary.

yt.ExternalTx

Value type By default Static/
dynamic
String Static

Running an operation in a transaction that has already been launched outside YQL. Its identifier is passed to the value.

All directories required for running the query are created in a specified transaction. This may cause conflicts when attempting to write data from two queries with different ExternalTx into a previously non-existent directory.

yt.OptimizeFor

Value type By default Static/
dynamic
String: lookup/scan scan Dynamic

Management of optimize_for for the tables being created.

yt.PublishedCompressionCodec / yt.TemporaryCompressionCodec

Value type By default Static/
dynamic
String zstd_5 Dynamic

Compression settings for tables created through YQL.

Tables specified in INSERT INTO are Published. All other tables are Temporary. Also, a codec specified as Temporary is used for intermediate data in a single YTsaurus operation (e.g. unified MapReduce).

yt.PublishedErasureCodec / yt.TemporaryErasureCodec

Value type By default Static/
dynamic
String none Dynamic

Erasure coding is always disabled by default. To enable it, you should use a value of lrc_12_2_2.

The difference between Published and Temporary is similar to CompressionCodec.
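
A sketch that sets the codecs separately for published and temporary tables, using only the values mentioned above. The table names are hypothetical.

```sql
PRAGMA yt.PublishedCompressionCodec = "zstd_5";  -- the documented default, set explicitly
PRAGMA yt.PublishedErasureCodec = "lrc_12_2_2";  -- enable erasure coding for published tables
PRAGMA yt.TemporaryErasureCodec = "none";        -- keep temporary tables without erasure coding

-- The INSERT INTO target is a Published table; intermediate tables are Temporary.
INSERT INTO `home/yql/archive`
SELECT * FROM `home/yql/source`;
```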

yt.NightlyCompress

Value type By default Static/
dynamic
Flag false Dynamic

Set the @force_nightly_compress attribute for newly created tables. A background process (if any) in YTsaurus re-compresses such tables at night so that they take less space.

yt.ExpirationDeadline / yt.ExpirationInterval

Value type By default Static/
dynamic
ExpirationDeadline: point in time in ISO 8601 format. ExpirationInterval: time interval supporting s/m/h/d suffixes that counts from the transaction commit time. Dynamic

Allows management of TTL for tables created by the operation.
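
A minimal sketch of both variants. The table names and the deadline are hypothetical.

```sql
-- Variant 1: the table expires 7 days after the transaction commit.
PRAGMA yt.ExpirationInterval = "7d";

-- Variant 2 (alternative): the table expires at a fixed point in time (ISO 8601).
-- PRAGMA yt.ExpirationDeadline = "2030-01-01T00:00:00Z";

INSERT INTO `home/yql/weekly_report`
SELECT * FROM `home/yql/source`;
```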

yt.MaxRowWeight

Value type By default Static/
dynamic
Bytes, up to 128M 16M Dynamic

Increase the maximum table row length limit in YTsaurus.

yt.MaxKeyWeight

Value type By default Static/
dynamic
Bytes, up to 256K 16K Dynamic

Increase the maximum table key length limit in YTsaurus, based on which the table is sorted.

yt.UseTmpfs

Value type By default Static/
dynamic
Flag false Dynamic

Connects tmpfs to the _yql_tmpfs folder in the sandbox with MapReduce jobs. Its use is not recommended.

yt.ExtraTmpfsSize

Value type By default Static/
dynamic
Bytes Dynamic

Increases the size of tmpfs beyond the total size of all explicitly used files (specified in megabytes). It can be useful if your UDF creates new files locally. Ignored unless yt.UseTmpfs is set.

yt.SchedulingTag / yt.SchedulingTagFilter

Value type By default Static/
dynamic
String Dynamic

Ability to enable "YTsaurus in clouds" by specifying external in a value, or to set any other valid value for this yt setting.

yt.PoolTrees

Value type By default Static/
dynamic
String containing a list of tree names separated by any of the following symbols: comma, semicolon, space, or ` `

yt.TentativePoolTrees

Value type By default Static/
dynamic
String containing a list of tree names separated by any of the following symbols: comma, semicolon, space, or ` `

Ability to "gently" spread operations over "pool trees" other than standard ones. To learn more, see documentation for YTsaurus.

yt.TentativeTreeEligibilitySampleJobCount

Value type By default Static/
dynamic
Positive number Dynamic

Effective only when the yt.TentativePoolTrees pragma is present. Sets the number of jobs in a sample. To learn more, see documentation for YTsaurus.

yt.TentativeTreeEligibilityMaxJobDurationRatio

Value type By default Static/
dynamic
Floating-point number Dynamic

Effective only when the yt.TentativePoolTrees pragma is present. Sets the permissible job slowdown factor in an alternative "pool tree". To learn more, see documentation for YTsaurus.

yt.TentativeTreeEligibilityMinJobDuration

Value type By default Static/
dynamic
Milliseconds Dynamic

Effective only when the yt.TentativePoolTrees pragma is present. Sets the minimum average job duration in an alternative "pool tree". To learn more, see documentation for YTsaurus.

yt.UseDefaultTentativePoolTrees

Value type By default Static/
dynamic
Flag Dynamic

Sets the value for use_default_tentative_pool_trees option in the operation spec.

yt.QueryCacheMode

Value type By default Static/
dynamic
String: disable / readonly / refresh / normal normal Static

The cache operates at the level of MapReduce operations:

  • disable: The cache is disabled.
  • readonly: Read-only. No writes are allowed.
  • refresh: Write-only. No reads are allowed. A query error is generated if an error occurs during a parallel write to the cache from another transaction.
  • normal: Read and write. If an error occurs during a parallel write to the cache from another transaction, it is assumed that the same data was written, and the work continues.

In normal and refresh mode, the output of each operation is additionally stored in //<tmp_folder>/query_cache/<hash>, where:

  • tmp_folder: Defaults to tmp/<login> or the PRAGMA yt.TmpFolder value.
  • hash: A hash of the input tables' meaningful metadata and the logical program that ran in the operation.

In normal and readonly mode, this path is calculated for the MapReduce operation just before its launch. Depending on the selected caching mode, the operation may either be launched or instantly marked as successful, using the prepared table instead of its outcome. If an expression contains nondeterministic functions like Random/RandomNumber/RandomUuid/CurrentUtcDate/CurrentUtcDatetime/CurrentUtcTimestamp, the cache for this operation is disabled.

All UDFs are currently considered deterministic, meaning they don't interfere with caching. If a non-deterministic UDF must be used, specify an additional Uint64-type argument and pass CurrentUtcTimestamp() to it. The UDF doesn't have to use this argument.
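
The workaround for a non-deterministic UDF can be sketched as follows. MyUdf::Sample is a hypothetical UDF assumed to be declared with an extra Uint64 argument that it ignores; the table name is hypothetical as well.

```sql
PRAGMA yt.QueryCacheMode = "normal";

-- The nondeterministic CurrentUtcTimestamp() call in the expression is what
-- disables the cache for this operation; the UDF itself may ignore the value.
SELECT MyUdf::Sample(value, CurrentUtcTimestamp()) AS sampled
FROM `home/yql/source`;
```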

yt.QueryCacheIgnoreTableRevision

Value type By default Static/
dynamic
Flag false Static

If the flag is set, YTsaurus revision is excluded from metadata during hash calculation. Therefore, QueryCache is not invalidated when modifying input table contents.
The mode is primarily intended for speeding up the complex queries debugging process in large, modifiable tables where query logic can't ignore these modifications.

yt.QueryCacheSalt

Value type By default Static/
dynamic
Random string Static

Salt to be mixed into the hash calculation for the query cache.

yt.QueryCacheTtl

Value type By default Static/
dynamic
Time interval supporting s/m/h/d suffixes that counts from table creation time in the query cache or from the time of last table use. 7d Static

Allows management of TTL for tables created by the operation in the query cache.

yt.AutoMerge / yt.TemporaryAutoMerge / yt.PublishedAutoMerge

Value type By default Static/
dynamic
String: relaxed/economy/disabled relaxed Dynamic

yt.PublishedAutoMerge is only valid for merge inside YtPublish node (if it's launched there). yt.AutoMerge sets the value for this setting simultaneously for all YTsaurus query operations.

yt.ScriptCpu

Value type By default Static/
dynamic
Floating point number, minimum 1.0 1.0 Dynamic

Multiplier for estimating the CPU usage of script UDFs (including Python UDF and JavaScript UDF). Affects how MapReduce operations are split into jobs. It can be overridden for a specific UDF type with the special-purpose yt.PythonCpu / yt.JavascriptCpu pragmas.

yt.PythonCpu / yt.JavascriptCpu

Value type By default Static/
dynamic
Floating point number, minimum 1.0 4.0 Dynamic

Multiplier for estimating the CPU usage of Python UDF and JavaScript UDF, respectively. Affects how MapReduce operations are split into jobs.

yt.ErasureCodecCpu

Value type By default Static/
dynamic
Floating point number, minimum 1.0 5.0 Dynamic

Multiplier for estimating the CPU usage when processing tables stored with an erasure codec. Affects how MapReduce operations are split into jobs.

yt.ReleaseTempData

Value type By default Static/
dynamic
String: immediate/finish/never immediate Static

Allows management of the removal time of temporary objects (e.g. tables) created when running the query:

  • immediate — remove objects as soon as they're no longer required.
  • finish — remove after running the entire YQL query.
  • never — never remove.

yt.CoreDumpPath

Value type By default Static/
dynamic
Path on cluster Static

Allows the core dumps of crashed MapReduce jobs to be saved to a separate table.

yt.MaxInputTables

Value type By default Static/
dynamic
Positive number 1000 Static

Limit of the number of delivered input tables for each specific MapReduce operation.

yt.MaxInputTablesForSortedMerge

Value type By default Static/
dynamic
Positive number 100 Static

Limit of the number of delivered input tables for a sorted merge operation.

yt.MaxOutputTables

Value type By default Static/
dynamic
Number from 1 to 100 50 Static

Limit of the number of output tables for each specific MapReduce operation.

yt.JoinCollectColumnarStatistics

Value type By default Static/
dynamic
String: disable/sync/async async Static

Manages the use of columnar statistics to precisely evaluate JOIN inputs and select the optimal strategy. The async value enables asynchronous collection of columnar statistics.

yt.JoinColumnarStatisticsFetcherMode

Value type By default Static/
dynamic
String: from_nodes/from_master/fallback fallback Static

Manages the mode of requesting columnar statistics from YTsaurus to precisely evaluate JOIN inputs. The from_nodes mode ensures precise evaluation but may fail to fit into timeouts for large tables. The from_master mode works very fast but provides simplified statistics. The fallback mode works as a combination of the previous two.

yt.MapJoinLimit

Value type By default Static/
dynamic
Bytes 2,048M Static

Size limit for the smaller table in JOIN under which the Map-side strategy is used (a dictionary is built in memory from the smaller table and used in the Map over the larger one).

You can disable the strategy completely by specifying 0 as the value.

yt.MapJoinShardCount

Value type By default Static/
dynamic
A number from 1 to 10 4 Static

The Map-side JOIN strategy may run in a sharded manner. The smaller side is split into N shards (where N is less than or equal to the value of this PRAGMA), and all shards are independently and simultaneously joined with the larger side. The JOIN result is the concatenation of the per-shard JOIN results.
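
A sketch of tuning the Map-side strategy for a single query. The table and column names are hypothetical, and the chosen values are illustrative rather than recommended.

```sql
PRAGMA yt.MapJoinLimit = "1G";       -- size limit for the smaller side
PRAGMA yt.MapJoinShardCount = "8";   -- within the documented range of 1 to 10

SELECT b.*, s.name
FROM `home/yql/big` AS b
JOIN `home/yql/small` AS s USING (key);
```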

yt.MapJoinShardMinRows

Value type By default Static/
dynamic
Positive number 1 Static

Minimum number of rows per shard in the map-side JOIN strategy.

yt.JoinMergeTablesLimit

Value type By default Static/
dynamic
Positive number 64 Static

Total permissible number of tables in the left and right sides for enabling Ordered JOIN strategy.

You can disable the strategy completely by specifying 0 as the value.

yt.JoinMergeUseSmallAsPrimary

Value type By default Static/
dynamic
Flag - Static

Explicit management in selecting the primary table in a Reduce operation with the Ordered JOIN strategy. If the value is set as True, then the smaller side will always be selected as the primary table. If the flag value is False, the larger side will be selected, except when unique keys are available on the larger side. Selecting a larger table as the primary table is safe, even if the table contains monster keys. However, it will run slower. If this pragma isn't set, the primary table is selected automatically based on the maximum size of the resulting jobs (see yt.JoinMergeReduceJobMaxSize).

yt.JoinMergeReduceJobMaxSize

Value type By default Static/
dynamic
Bytes 8G Static

Maximum acceptable size of Reduce job when selecting a small table as the primary table with the Ordered JOIN strategy. If the resulting size exceeds the specified value, the Reduce operation is repeated for the larger table as the primary table.

yt.JoinMergeUnsortedFactor

Value type By default Static/
dynamic
Positive floating point number 0.2 Static

Minimum ratio of an unsorted JOIN side to a sorted one for additional sorting of it and selection of the Ordered JOIN strategy.

yt.JoinMergeForce

Value type By default Static/
dynamic
Flag - Static

Forces selection of the Ordered JOIN strategy. If the flag is set to True, the Ordered JOIN strategy is selected even if a single JOIN side or both JOIN sides are unsorted. In this case, unsorted sides are pre-sorted. The maximum size of the unsorted table (see yt.JoinMergeUnsortedFactor) is unlimited in this case.

yt.JoinAllowColumnRenames

Value type By default Static/
dynamic
Flag true Static

Enables column renaming when executing the Ordered JOIN strategy (the rename_columns attribute is used). If the option is disabled, the Ordered JOIN strategy is selected only when the left and right column names match.

yt.UseColumnarStatistics

Value type By default Static/
dynamic
String: disable/auto/force/0 (=disable)/1 (=force) force Dynamic

Enables the use of columnar statistics to precisely estimate job sizes when launching operations on tables that are read with column selections. The auto mode automatically disables the use of statistics for operations whose input tables have optimize_for=lookup.

yt.MinPublishedAvgChunkSize

Value type By default Static/
dynamic
Bytes Static

If the average chunk size in the resulting output table is smaller than the specified value, an additional YTsaurus Merge operation is launched to enlarge the chunks to the specified size. The value 0 has a special meaning: Merge is always launched to enlarge the chunks up to 1G.
If the table uses a compression codec, the output chunk size may differ from the specified one by the compression factor. This pragma sets the data size per merge job, so the output size may be significantly smaller after compression. In this case, increase the pragma value by the expected compression factor.
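
The compression adjustment can be sketched as follows. The target chunk size of 512M on disk and the compression factor of 4 are assumptions made up for this example, as are the table paths. The pragma value is scaled by the factor: 512M * 4 = 2G.

```yql
-- Assumed target chunk size on disk: 512M; assumed compression factor: 4.
-- Data size per merge job: 512M * 4 = 2G.
PRAGMA yt.MinPublishedAvgChunkSize = "2G";

INSERT INTO `home/yql/result` WITH TRUNCATE  -- hypothetical path
SELECT * FROM `home/yql/input`;              -- hypothetical path
```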

yt.MinTempAvgChunkSize

Value type By default Static/
dynamic
Bytes Static

The setting is similar to yt.MinPublishedAvgChunkSize, but it works for intermediate temporary tables.

yt.TableContentDeliveryMode

Value type By default Static/
dynamic
String: native/file native Dynamic

If the native value is set, then the table contents are delivered to jobs via native YTsaurus mechanisms. If the file value is used, the table contents are first downloaded from the YQL server and then delivered to jobs as a regular file.

yt.TableContentMaxChunksForNativeDelivery

Value type By default Static/
dynamic
Positive number, up to and including 1,000 1000 Static

Maximum number of chunks in the table for it to be delivered to jobs via native YTsaurus mechanisms. If this number is exceeded, the table is delivered via file.

yt.TableContentCompressLevel

Value type By default Static/
dynamic
Positive number, up to and including 11 8 Dynamic

Sets the compression level for table contents delivered via file (if yt.TableContentDeliveryMode="file").

yt.TableContentTmpFolder

Value type By default Static/
dynamic
Path on cluster Dynamic

Directory where temporary files are placed for tables delivered via file (if yt.TableContentDeliveryMode="file"). If not set, the standard YTsaurus file cache is used.
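
A sketch of switching to file delivery with the related settings. The compression level and the cluster path are assumptions chosen for illustration only.

```yql
-- Deliver table contents to jobs as a regular file instead of native mechanisms.
PRAGMA yt.TableContentDeliveryMode = "file";
-- Compression level for the delivered file (allowed values: up to and including 11).
PRAGMA yt.TableContentCompressLevel = "6";
-- Hypothetical directory for the temporary files; if omitted,
-- the standard YTsaurus file cache is used.
PRAGMA yt.TableContentTmpFolder = "//tmp/yql_table_content";
```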

yt.TableContentMinAvgChunkSize

Value type By default Static/
dynamic
Bytes 1GB Static

Minimum average size of chunks in the table for it to be delivered to jobs via native YTsaurus mechanisms. If the chunks are not large enough, a preliminary merge is inserted.

yt.TableContentMaxInputTables

Value type By default Static/
dynamic
Positive number, up to and including 1,000 1000 Static

Maximum number of tables for delivery to jobs via native YTsaurus mechanisms. If this number is exceeded, a preliminary merge is inserted.

yt.TableContentUseSkiff

Value type By default Static/
dynamic
Flag true Dynamic

Enables the skiff format for delivering the table to operation jobs.

yt.LayerPaths

Value type By default Static/
dynamic
String containing the list of paths to porto layers separated by any of the following symbols: comma, semicolon, space, or ` `

Specifies the sequence of porto layers that make up the environment in which custom jobs are run.

yt.UseSkiff

Value type By default Static/
dynamic
Flag true Dynamic

Enables the skiff format for input and output in operation jobs.

yt.DefaultCalcMemoryLimit

Value type By default Static/
dynamic
Bytes 1G Static

Limit on memory utilization for calculations that aren't related to table access.

You can use K, M, and G suffixes to specify values in kilobytes, megabytes, and gigabytes, respectively.
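
For instance, raising the limit to 2 gigabytes might look like this (the value is an arbitrary example, not a recommendation):

```yql
-- "2G" uses the G suffix for gigabytes.
PRAGMA yt.DefaultCalcMemoryLimit = "2G";
```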

yt.ParallelOperationsLimit

Value type By default Static/
dynamic
Number, minimum 1 16 Static

Sets the maximum number of parallel YTsaurus operations inside the query.

yt.DefaultCluster

Value type By default Static/
dynamic
String hahn Static

Sets the default cluster where calculations that aren't related to table access are performed.

yt.DefaultMemoryReserveFactor

Value type By default Static/
dynamic
Floating point number from 0.0 to 1.0 Dynamic

Sets the memory reservation factor for jobs. See the YTsaurus documentation.

yt.DefaultMemoryDigestLowerBound

Value type By default Static/
dynamic
Floating point number from 0.0 to 1.0 Dynamic

Sets the user_job_memory_digest_lower_bound setting in the operation spec (not documented in YTsaurus)

yt.BufferRowCount

Value type By default Static/
dynamic
Number, minimum 1 Dynamic

Limit on the number of records that JobProxy can buffer. See the YTsaurus documentation.

yt.DisableJobSplitting

Value type By default Static/
dynamic
Flag false Dynamic

Prohibits the YTsaurus scheduler from adaptively splitting long-running custom jobs.

yt.DefaultLocalityTimeout

Value type By default Static/
dynamic
Time interval supporting s/m/h/d suffixes Dynamic

Sets the locality_timeout setting in the operation spec (not documented in YTsaurus)

yt.MapLocalityTimeout

Value type By default Static/
dynamic
Time interval supporting s/m/h/d suffixes Dynamic

Sets the map_locality_timeout setting in the operation spec (not documented in YTsaurus)

yt.ReduceLocalityTimeout

Value type By default Static/
dynamic
Time interval supporting s/m/h/d suffixes Dynamic

Sets the reduce_locality_timeout setting in the operation spec (not documented in YTsaurus)

yt.SortLocalityTimeout

Value type By default Static/
dynamic
Time interval supporting s/m/h/d suffixes Dynamic

Sets the sort_locality_timeout setting in the operation spec (not documented in YTsaurus)

yt.MinLocalityInputDataWeight

Value type By default Static/
dynamic
Bytes Dynamic

Sets the min_locality_input_data_weight setting in the operation spec (not documented in YTsaurus)

yt.DefaultMapSelectivityFactor

Value type By default Static/
dynamic
Positive floating point number Dynamic

Sets the approximate output-input ratio for the map stage in the joint MapReduce operation.

yt.SuspendIfAccountLimitExceeded

Value type By default Static/
dynamic
Flag false Dynamic

Pauses the operation if the "Account limit exceeded" error occurs in jobs.

yt.CommonJoinCoreLimit

Value type By default Static/
dynamic
Bytes 128M Static

Sets the memory buffer size for CommonJoinCore node execution (executed in the job when the common JOIN strategy is selected)

yt.CombineCoreLimit

Value type By default Static/
dynamic
Bytes, minimum 1M 128M Static

Sets the memory buffer size for CombineCore node execution

yt.SwitchLimit

Value type By default Static/
dynamic
Bytes, minimum 1M 128M Static

Sets the memory buffer size for Switch node execution

yt.EvaluationTableSizeLimit

Value type By default Static/
dynamic
Bytes, maximum 10M 1M Static

Sets the maximum total volume of tables used at the evaluation step

yt.LookupJoinLimit

Value type By default Static/
dynamic
Bytes, maximum 10M 1M Static

A table may be used as a dictionary in the Lookup JOIN strategy if its size doesn't exceed the smaller of the values specified in yt.LookupJoinLimit and yt.EvaluationTableSizeLimit.

yt.LookupJoinMaxRows

Value type By default Static/
dynamic
Number, maximum 1,000 900 Static

Maximum number of table rows at which the table may be used as a dictionary in the Lookup JOIN strategy.
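
The interplay of the Lookup JOIN limits can be sketched as follows. The values are arbitrary examples within the documented maximums. As we read the descriptions, the size limit that applies is the smaller of yt.LookupJoinLimit and yt.EvaluationTableSizeLimit.

```yql
PRAGMA yt.LookupJoinLimit = "4M";           -- maximum allowed: 10M
PRAGMA yt.EvaluationTableSizeLimit = "2M";  -- maximum allowed: 10M
-- Size limit for the dictionary table: min(4M, 2M) = 2M.
PRAGMA yt.LookupJoinMaxRows = "500";        -- maximum allowed: 1,000
```

With these settings, a table can serve as a Lookup JOIN dictionary only if it is at most 2M in size and has at most 500 rows.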

yt.MaxExtraJobMemoryToFuseOperations

Value type By default Static/
dynamic
Bytes 2G Static

Maximum memory utilization for jobs permitted after operations are merged by optimizers

yt.MaxReplicationFactorToFuseOperations

Value type By default Static/
dynamic
Floating point number, minimum 1.0 20.0 Static

Maximum data replication factor permitted after operations are merged by optimizers

yt.TopSortMaxLimit

Value type By default Static/
dynamic
Positive number 1000 Static

Maximum LIMIT value used together with ORDER BY at which the TopSort optimization is applied

yt.TopSortSizePerJob

Value type By default Static/
dynamic
Bytes, minimum 1 128M Static

Sets the expected data volume per job in a TopSort operation

yt.TopSortRowMultiplierPerJob

Value type By default Static/
dynamic
Number, minimum 1 10 Static

Sets the expected number of records per job in a TopSort operation, calculated as LIMIT * yt.TopSortRowMultiplierPerJob
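
A worked example of the formula above, with made-up values and a hypothetical table path. With LIMIT 2000 and a multiplier of 20, the expected number of records per job is 2000 * 20 = 40000.

```yql
-- Raise the threshold so that LIMIT 2000 still qualifies for TopSort.
PRAGMA yt.TopSortMaxLimit = "5000";
PRAGMA yt.TopSortRowMultiplierPerJob = "20";

-- Expected records per job: LIMIT * multiplier = 2000 * 20 = 40000.
SELECT *
FROM `home/yql/events`  -- hypothetical path
ORDER BY ts DESC
LIMIT 2000;
```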

yt.DisableOptimizers

Value type By default Static/
dynamic
String containing the list of optimizers separated by any of the following symbols: comma, semicolon, space, or ` `

Disables the specified optimizers

yt.JobEnv

Value type By default Static/
dynamic
String representation of a yson dictionary Dynamic

Sets environment variables for map and reduce jobs in operations. Keys in the dictionary set the environment variable names, and values in the dictionary set the values for these variables.
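
A minimal sketch of passing the dictionary as a yson string. The variable names and values are hypothetical and serve only to show the shape of the dictionary.

```yql
-- Keys are environment variable names, values are their contents.
PRAGMA yt.JobEnv = '{"TZ"="UTC";"MY_APP_MODE"="batch"}';
```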

yt.OperationSpec

Value type By default Static/
dynamic
String representation of a yson dictionary Dynamic

Sets the operation settings dictionary. Lets you specify settings that have no counterparts in the form of pragmas. Settings specified via special-purpose pragmas take priority and override the values in this dictionary.
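
As a sketch, the dictionary is passed as a yson string. The max_failed_job_count key is used here on the assumption that it is a valid YTsaurus operation option; check the YTsaurus operation options reference before relying on it.

```yql
-- Assumed operation option; there is no dedicated pragma for it in this list.
PRAGMA yt.OperationSpec = '{"max_failed_job_count"=10}';
```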

yt.Annotations

Value type By default Static/
dynamic
String representation of a yson dictionary Dynamic

Sets arbitrary structured information related to the operation. See the YTsaurus documentation.

yt.GeobaseDownloadUrl

Value type By default Static/
dynamic
String Dynamic

Sets the URL for downloading the geobase (geodata6.bin file) if the query uses Geo UDF

yt.MaxSpeculativeJobCountPerTask

Value type By default Static/
dynamic
Positive number Dynamic

Sets the number of speculatively performed jobs in YTsaurus operations. YTsaurus cluster settings are used by default.

yt.LLVMMemSize

Value type By default Static/
dynamic
Bytes 256M Dynamic

Sets the fixed memory size required for compiling the LLVM code in jobs

yt.LLVMPerNodeMemSize

Value type By default Static/
dynamic
Bytes 10K Dynamic

Sets the required memory size per calculation graph node for compiling the LLVM code in jobs

yt.SamplingIoBlockSize

Value type By default Static/
dynamic
Bytes Dynamic

Sets the minimum size of a block for coarse-grain sampling.

yt.BinaryTmpFolder

Value type By default Static/
dynamic
Path on cluster Static

Sets a separate path on the cluster where binary query artifacts (UDFs and the job binary) are cached. Artifacts are saved in the root of this directory under names that match their md5. Artifacts are saved to and used from this directory outside of transactions, even if the yt.ExternalTx pragma is set in the query.

yt.BinaryExpirationInterval

Value type By default Static/
dynamic
Time interval supporting s/m/h/d suffixes Static

Manages the TTL of cached binary artifacts. Only works together with yt.BinaryTmpFolder. Each use of a binary artifact in a query extends its TTL.
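
The two pragmas are meant to be used together, roughly as in the sketch below. The cluster path and the 7-day interval are assumptions chosen for illustration.

```yql
-- Hypothetical cache directory for UDFs and the job binary.
PRAGMA yt.BinaryTmpFolder = "//home/my_project/yql_binary_cache";
-- Cached artifacts expire after 7 days unless a query uses them again.
PRAGMA yt.BinaryExpirationInterval = "7d";
```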

yt.FolderInlineDataLimit

Value type By default Static/
dynamic
Bytes 100K Static

Sets the maximum amount of data for the inline list obtained as a result of the Folder calculation. If this limit is exceeded, a temporary file is used.

yt.FolderInlineItemsLimit

Value type By default Static/
dynamic
Positive number 100 Static

Sets the maximum number of elements in the inline list obtained as a result of the Folder calculation. If this limit is exceeded, a temporary file is used.

yt.UseNativeYtTypes

Value type By default Static/
dynamic
Flag false Static

Allows complex-type values to be recorded in tables through native support of complex types in YTsaurus

yt.PublishedMedia / yt.TemporaryMedia

Value type By default Static/
dynamic
String representation of a yson dictionary Dynamic

Set the @media attribute for newly created tables. If available, assigns mediums in YTsaurus, where table chunks will be stored.

Tables specified in INSERT INTO are Published. All other tables are Temporary.

yt.PublishedPrimaryMedium / yt.TemporaryPrimaryMedium

Value type By default Static/
dynamic
String Dynamic

Set the @primary_medium attribute for newly created tables. If available, assigns the primary medium in YTsaurus, where chunks will be recorded. By default, YTsaurus sets the primary medium to "default".

Tables specified in INSERT INTO are Published. All other tables are Temporary.

yt.IntermediateDataMedium

Value type By default Static/
dynamic
String Dynamic

Sets the medium used for intermediate data in operations (Sort, MapReduce). To learn more, see the YTsaurus documentation.

yt.PrimaryMedium

Value type By default Static/
dynamic
String Dynamic

Sets the primary medium in YTsaurus for Published and Temporary tables and for intermediate data in operations. Equivalent to setting the yt.IntermediateDataMedium, yt.PublishedPrimaryMedium, and yt.TemporaryPrimaryMedium pragmas simultaneously.
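
The equivalence can be illustrated as follows. The medium name ssd_blobs is a hypothetical example; use a medium that exists on your cluster.

```yql
-- Single pragma covering all three kinds of data:
PRAGMA yt.PrimaryMedium = "ssd_blobs";

-- Per the description above, this has the same effect as:
-- PRAGMA yt.IntermediateDataMedium = "ssd_blobs";
-- PRAGMA yt.PublishedPrimaryMedium = "ssd_blobs";
-- PRAGMA yt.TemporaryPrimaryMedium = "ssd_blobs";
```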

yt.HybridDqExecution

Value type By default Static/
dynamic
Flag true Static

Enables hybrid query execution via DQ

yt.NetworkProject

Value type By default Static/
dynamic
String - Dynamic

Sets the use of a specified network project in jobs.
