Debugging MapReduce programs

Emulating job startup locally

You can emulate job startup on any computer, for example, on your development server. To learn more about local job emulation, see Debugging jobs locally and Slow job troubleshooting.

Getting the current stderr of a running job

While an operation is running, you may find that one job is still incomplete while the rest have already finished. In YTsaurus, you can get the stderr that such a job has written up to the current moment.

You can do that with a special mode implemented in the yt command: yt get-job-stderr.

yt get-job-stderr --operation-id 35060d89-f9328e09-3f403e8-6f4eb4b5 --job-id fe270b54-a938652-3fc0384-2144

Stderr in the output is truncated according to the same rules that apply when stderr is saved to Cypress: only the first few megabytes and the last few megabytes are kept, and the middle is cut out.
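
If you work from Python, the yt.wrapper client exposes a get_job_stderr call that mirrors the CLI command above. A minimal sketch, reusing the IDs from the example; depending on the wrapper version the result is raw bytes or a readable stream:

import sys

import yt.wrapper as yt

# Fetch the stderr collected so far for a single running job.
data = yt.get_job_stderr(
    operation_id="35060d89-f9328e09-3f403e8-6f4eb4b5",
    job_id="fe270b54-a938652-3fc0384-2144",
)
# Some wrapper versions return a stream instead of bytes.
if hasattr(data, "read"):
    data = data.read()
sys.stdout.buffer.write(data)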

Getting full stderr of all jobs of an operation

In YTsaurus, you can save the full stderr of all jobs of an operation to a table. Stderr is exported only for jobs that were not aborted.

To enable this behavior, use one of the following options:

Use the stderr_table parameter of the Python API. For example:

yt.wrapper.run_map_reduce(
    mapper,
    reducer,
    '//path/to/input',
    '//path/to/output',
    reduce_by=['some_key'],
    stderr_table='//path/to/stderr/table',
)

Use the StderrTablePath setting of the C++ API.

Alternatively, you can pass the stderr_table_path option directly in the operation specification. For a description of this option, see Operation options.
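
A minimal sketch of the spec-based variant, assuming the same hypothetical paths and job functions as in the example above; the wrapper merges the spec dict into the operation specification:

import yt.wrapper

# mapper and reducer are the same job functions as in the example above.
yt.wrapper.run_map_reduce(
    mapper,
    reducer,
    '//path/to/input',
    '//path/to/output',
    reduce_by=['some_key'],
    spec={'stderr_table_path': '//path/to/stderr/table'},
)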

A stderr table has the following columns:

  1. job_id: Job ID.
  2. part_index: If a job's stderr is too large, it is split into several parts. The part_index value is the index of a specific part.
  3. data: Stderr data itself.

The table is sorted by job_id, part_index.
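
For example, to see which jobs wrote stderr and into how many parts each one was split, it is enough to read the job_id key column (a sketch assuming the hypothetical table path from above):

import collections

import yt.wrapper as yt

# Count the stderr parts written by each job.
parts_per_job = collections.Counter()
for row in yt.read_table('//path/to/stderr/table{job_id}'):
    parts_per_job[row['job_id']] += 1

for job_id, parts in sorted(parts_per_job.items()):
    print(job_id, parts)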

You can read such a table using the read_blob_table command.
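
A minimal sketch of reading the table back with the Python wrapper's read_blob_table, assuming the column names described above; depending on the version, you may also need to specify the part size explicitly:

import sys

import yt.wrapper as yt

# Reassemble the stderr parts of all jobs into a single byte stream.
stream = yt.read_blob_table(
    '//path/to/stderr/table',
    part_index_column_name='part_index',
    data_column_name='data',
)
sys.stdout.buffer.write(stream.read())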