Subquery templates
- DEFINE SUBQUERY
- Merging subquery templates SubqueryExtend, SubqueryUnionAll, SubqueryMerge, SubqueryUnionMerge
- Merging subquery templates after substituting a list element from SubqueryExtendFor, SubqueryUnionAllFor, SubqueryMergeFor, or SubqueryUnionMergeFor
- Adding sorting to the SubqueryOrderBy template or indicating the presence of this SubqueryAssumeOrderBy
DEFINE SUBQUERY
DEFINE SUBQUERY
lets you declare a subquery template as a parameterizable block consisting of several upper-tier statements and then use it repeatedly through application in the FROM
section of the SELECT statement or in input data in PROCESS/REDUCE, with parameters indicated.
Unlike actions, a subquery template must end with the SELECT
/PROCESS
/REDUCE
statement. Its result is the returnable subquery value. Here, the upper-tier SELECT
/PROCESS
/REDUCE
statement cannot be used more than once, similarly to modifying statements (for instance, INSERT
).
After DEFINE SUBQUERY
, specify:
- A named expression that will be used to access the declared template further in the query.
- The round brackets contain a list of named expressions you can use to access parameters inside the subquery template.
AS
keyword.- The list of top-level expressions.
END DEFINE
acts as a marker of the last expression inside the subquery template.
One or more of the last subquery parameters can be marked with a question as optional: unless they haven't been specified when calling subquery, they will be assigned the NULL
value.
Note
In large queries, you can use separate files for action definition and include them to the main query using EXPORT + IMPORT so that instead of one long text you can have several logical parts that are easier to navigate. An important nuance: the USE my_cluster
; directive in the import query doesn't affect behavior of actions declared in other subquery files.
Even if the list of parameters in the subquery template definition is empty, when using it in FROM
, specify the parentheses ()
. This may help to limit the scope of named expressions used in only one subquery.
In certain cases, instead of DEFINE SUBQUERY
, it's better to use an equivalent form appearing as a lambda function.
In this case, the lambda function must accept, as the first argument, the special object called world
that passes dependencies to make certain PRAGMA or COMMIT statements visible at the query template's point of use. Also, make sure to pass this object as the first argument along with the other arguments (if any) to other query templates, if you use them in your lambda function.
The return value of the lambda function must have the structure list type (output table) or a list of variants over a tuple of structures (multiple output tables). In the latter case, the following unpacking is usually used at the query template's point of use:
$out1, $out2 = PROCESS $mySubquery($myParam1, $myParam2);
-- $out1 and $out2 are further used as individual tables.
Examples
DEFINE SUBQUERY $hello_world($name, $suffix?) AS
$name = $name ?? ($suffix ?? "world");
SELECT "Hello, " || $name || "!";
END DEFINE;
SELECT * FROM $hello_world(NULL); -- Hello, world!
SELECT * FROM $hello_world("John"); -- Hello, John!
SELECT * FROM $hello_world(NULL, "Earth"); -- Hello, Earth!
DEFINE SUBQUERY $dup($x) AS
SELECT * FROM $x(1) -- apply transmitted query template with one argument
UNION ALL
SELECT * FROM $x(2); -- ... and with another argument
END DEFINE;
DEFINE SUBQUERY $sub($n) AS
SELECT $n * 10;
END DEFINE;
SELECT * FROM $dup($sub); -- transmit $sub query template as a parameter
-- Result:
-- 10
-- 20
/* Move the utilized named expressions $a and $b to a separate area of visibility*/
DEFINE SUBQUERY $clean() AS
$a = 10;
$b = $a * $a;
SELECT $a AS a, $b AS b;
END DEFINE;
SELECT * FROM $clean(); -- a: 10, b: 100
USE hahn;
DEFINE SUBQUERY $input() as
SELECT * FROM `home/yql/tutorial/users`;
END DEFINE;
DEFINE SUBQUERY $myProcess1($nestedQuery, $lambda) AS
PROCESS $nestedQuery() -- parentheses () must be used here
USING $lambda(TableRow());
END DEFINE;
$myProcess2 = ($world, $nestedQuery, $lambda) -> {
-- If ListFlatMap or YQL::OrderedFlatMap is used, this results in Ordered YTsaurus Map operation
return YQL::FlatMap($nestedQuery($world), $lambda);
};
-- With such use, $myProcess1 and $myProcess2 implementations are identical
SELECT * FROM $myProcess1($input, ($x) -> { RETURN AsList($x, $x) });
SELECT * FROM $myProcess2($input, ($x) -> { RETURN AsList($x, $x) });
USE hahn;
DEFINE SUBQUERY $runPartition($table) AS
$paritionByAge = ($row) -> {
$recordType = TypeOf($row);
$varType = VariantType(TupleType($recordType, $recordType));
RETURN If($row.age % 2 == 0,
Variant($row, "0", $varType),
Variant($row, "1", $varType),
);
};
PROCESS $table USING $paritionByAge(TableRow());
END DEFINE;
-- Unpacking two results
$i, $j = (PROCESS $runPartition("home/yql/tutorial/users"));
SELECT * FROM $i;
SELECT * FROM $j;
Merging subquery templates SubqueryExtend, SubqueryUnionAll, SubqueryMerge, SubqueryUnionMerge
These functions combine the results of one or more subquery templates passed by arguments. The number of parameters in such subquery templates must be the same.
SubqueryExtend
requires matching of subquery schemas.SubqueryUnionAll
follows the same rules asListUnionAll
.SubqueryMerge
uses the same constraints asSubqueryExtend
and also outputs a sorted result if all subqueries have the same sort order.SubqueryUnionMerge
uses the same constraints asSubqueryUnionAll
and also outputs a sorted result if all subqueries have the same sort order.
Examples:
DEFINE SUBQUERY $sub1() as
SELECT 1 as x;
END DEFINE;
DEFINE SUBQUERY $sub2() as
SELECT 2 as x;
END DEFINE;
$s = SubqueryExtend($sub1,$sub2);
PROCESS $s();
Merging subquery templates after substituting a list element from SubqueryExtendFor, SubqueryUnionAllFor, SubqueryMergeFor, or SubqueryUnionMergeFor
The functions take the following arguments:
- A non-empty list of values.
- A subquery template that must have exactly one parameter.
They substitute each item from the list into the subquery template as a parameter and then combine the obtained subqueries.
SubqueryExtendFor
requires matching of subquery schemas.SubqueryUnionAllFor
follows the same rules asListUnionAll
.SubqueryMergeFor
uses the same constraints asSubqueryExtendFor
and also outputs a sorted result if all subqueries have the same sort order.SubqueryUnionMergeFor
uses the same constraints asSubqueryUnionAllFor
and also outputs a sorted result if all subqueries have the same sort order.
Examples:
DEFINE SUBQUERY $sub($i) as
SELECT $i as x;
END DEFINE;
$s = SubqueryExtendFor([1,2,3],$sub);
PROCESS $s();
Adding sorting to the SubqueryOrderBy template or indicating the presence of this SubqueryAssumeOrderBy
The functions take the following arguments:
- A subquery template without parameters.
- A list of pairs (string indicating the column name and Boolean value: True for sorting in ascending order or False for sorting in descending order).
And they build a new query template without parameters where sorting is performed or a comment on the use of sorting is added to the result. To use the resulting query template, call the PROCESS
function, since, when using a SELECT
, sorting is ignored.
Examples:
DEFINE SUBQUERY $sub() as
SELECT * FROM (VALUES (1,'c'), (1,'a'), (3,'b')) AS a(x,y);
end define;
$sub2 = SubqueryOrderBy($sub, [('x',false), ('y',true)]);
PROCESS $sub2();