Crash Diagnostic Layer

The Crash Diagnostic Layer (CDL) is a Vulkan layer to help track down and identify the cause of GPU hangs and crashes. It works by instrumenting command buffers with completion checkpoints. When an error is detected a dump file containing incomplete command buffers is written. Often the last complete or incomplete commands are responsible for the crash.

Building

See the associated BUILD.md file for details on how to build the Crash Diagnostic Layer for various platforms.

Runtime requirements

CDL uses the following extensions. If an extension is not present, some functionality might be disabled.

Running

CDL can be used as an explicit or implicit layer. The loader's documentation describes the difference between implicit and explicit layers, but the relevant bit here is that implicit layers are meant to be available to all applications on the system, even if the application doesn't explicitly enable the layer. On the other hand, explicit layers are easier to use when doing application development.

Explicit Layer

To use CDL as an explicit layer, enable it with vkconfig or do the following:

  1. The directory containing the file VkLayer_crash_diagnostic.json is included in the layer search path, by including it in either the VK_LAYER_PATH or VK_ADD_LAYER_PATH environment variable or by using vkconfig.
  2. Include the layer name in the VK_INSTANCE_LAYERS environment variable or the ppEnabledLayerNames field of VkInstanceCreateInfo.

Implicit Layer

When using CDL as an implicit layer, it is disabled by default. To enable the layer's functionality, set the CDL_ENABLE environment variable to 1.

Registering the Layer

In order to be discovered by the Vulkan loader at runtime, implicit layers must be registered. The registration process is platform-specific and is discussed in detail in the Vulkan-Loader documentation. In all cases, it is the layer manifest (the .json file) that is registered; the manifest contains a relative path to the layer library, which can be in a separate directory.

Basic Usage

Once the layer is enabled, if vkQueueSubmit() or other Vulkan functions returns a fatal error code (usually VK_ERROR_DEVICE_LOST), a dump file of the command buffers (and other state) that failed to execute are written to disk.

Logging

CDL outputs messages at runtime to indicate it is enabled, to trace certain commands, and to report when a gpu crash has occurred. The message_severity, log_file, trace_on and trace_all_semaphores configuration settings control which messages are output and where they are sent.

VK_EXT_debug_utils and VK_EXT_debug_report extensions can also be used to receive log messages directly in the application.

Dump files

A unique log file directory is created every time an application is run with CDL enabled. The log file directories are named with the timestamp of the format YYYY-MM-DD-HHMMSS. This prevents subsequent runs from overwriting old data, in case the application is non-deterministic and producing different crashes on each run. The location of these files is platform-specific:

The output directory for dump files can be changed using the output_path configuration setting, described below.

The name of the dump file is cdl_dump.yaml, since the file is in the YAML format. Other files, such as dumped shaders, may be present in the dump directory.

Configuration

This layer implements the VK_EXT_layer_settings extension, so it can be configured with vkconfig, programmatically, or with environment variables. See the manifest for this layer and the layer manifest schema for full details. The discussion below uses the key field to identify each setting, the env field defines the corresponding environment variable, and the label field defines the text you will see when using vkconfig.

Layer Details

Layer Properties

Layer Settings Overview

Setting Type Default Value vk_layer_settings.txt Variable Environment Variable Platforms
Watchdog timeout (ms) INT 30000 lunarg_crash_diagnostic.watchdog_timeout_ms CDL_WATCHDOG_TIMEOUT_MS WINDOWS, LINUX, MACOS, ANDROID
Output Path STRING lunarg_crash_diagnostic.output_path CDL_OUTPUT_PATH WINDOWS, LINUX, MACOS, ANDROID
Dump queue submissions ENUM running lunarg_crash_diagnostic.dump_queue_submits CDL_DUMP_QUEUE_SUBMITS WINDOWS, LINUX, MACOS, ANDROID
Dump command buffers ENUM running lunarg_crash_diagnostic.dump_command_buffers CDL_DUMP_COMMAND_BUFFERS WINDOWS, LINUX, MACOS, ANDROID
Dump commands ENUM running lunarg_crash_diagnostic.dump_commands CDL_DUMP_COMMANDS WINDOWS, LINUX, MACOS, ANDROID
Dump shaders ENUM off lunarg_crash_diagnostic.dump_shaders CDL_DUMP_SHADERS WINDOWS, LINUX, MACOS, ANDROID
Message Severity FLAGS error lunarg_crash_diagnostic.message_severity VK_LUNARG_CRASH_DIAGNOSTIC_MESSAGE_SEVERITY WINDOWS, LINUX, MACOS, ANDROID
Log file name STRING stderr lunarg_crash_diagnostic.log_file CDL_LOG_FILE WINDOWS, LINUX, MACOS
Enable Tracing BOOL false lunarg_crash_diagnostic.trace_on CDL_TRACE_ON WINDOWS, LINUX, MACOS, ANDROID
Enable semaphore log tracing. BOOL false lunarg_crash_diagnostic.trace_all_semaphores CDL_TRACE_ALL_SEMAPHORES WINDOWS, LINUX, MACOS, ANDROID
Synchronize commands BOOL false lunarg_crash_diagnostic.sync_after_commands CDL_SYNC_AFTER_COMMANDS WINDOWS, LINUX, MACOS, ANDROID
Instrument all commands BOOL false lunarg_crash_diagnostic.instrument_all_commands CDL_INSTRUMENT_ALL_COMMANDS WINDOWS, LINUX, MACOS, ANDROID
Track semaphores BOOL true lunarg_crash_diagnostic.track_semaphores CDL_TRACK_SEMAPHORES WINDOWS, LINUX, MACOS, ANDROID

Layer Settings Details

Watchdog timeout (ms)(ALPHA)

If set to a non-zero number, a watchdog thread will be created. This will trigger if the application fails to submit new commands within a set time (in milliseconds) and a log will be created as if the a lost device error was encountered.

Setting Properties:

Output Path(ALPHA)

The directory where dump files and shader binaries are written.

Setting Properties:

Dump queue submissions(ALPHA)

Control which queue submissions are included in the dump.

Setting Properties:
Enum Value Label Description Platforms
running Running Submissions that were executing at the time of the dump. WINDOWS, LINUX, MACOS, ANDROID
pending Pending Submissions executing or pending at the time of the dump. WINDOWS, LINUX, MACOS, ANDROID

Dump command buffers(ALPHA)

Control which command buffers are included in the dump.

Setting Properties:
Enum Value Label Description Platforms
running Running Command Buffers that were executing at the time of the dump. WINDOWS, LINUX, MACOS, ANDROID
pending Pending Command Buffers executing or pending at the time of the dump. WINDOWS, LINUX, MACOS, ANDROID
all All All known Command Buffers. WINDOWS, LINUX, MACOS, ANDROID

Dump commands(ALPHA)

Control which commands are included in the dump.

Setting Properties:
Enum Value Label Description Platforms
running Running Command Buffers that were executing at the time of the dump. WINDOWS, LINUX, MACOS, ANDROID
pending Pending Commands executing or pending at the time of the dump. WINDOWS, LINUX, MACOS, ANDROID
all All All known Commands. WINDOWS, LINUX, MACOS, ANDROID

Dump shaders(ALPHA)

Control of shader dumping.

Setting Properties:
Enum Value Label Description Platforms
off Off Never dump shaders. WINDOWS, LINUX, MACOS, ANDROID
on_crash On Crash Dump currently bound shaders after a crash is detected. WINDOWS, LINUX, MACOS, ANDROID
on_bind On Bind Dump only bound shaders. WINDOWS, LINUX, MACOS, ANDROID
all All Dump all shaders. WINDOWS, LINUX, MACOS, ANDROID

Message Severity(ALPHA)

Comma-delineated list of options specifying the types of log messages to be reported

Setting Properties:
Flags Label Description Platforms
error Error Report errors such as device lost or setup problems in the layer. WINDOWS, LINUX, MACOS, ANDROID
warn Warning Report non-fatal problems that may interfere with operation of the layer WINDOWS, LINUX, MACOS, ANDROID
info Info Report informational messages. WINDOWS, LINUX, MACOS, ANDROID
verbose Verbose For layer development. Report messages for debugging layer behavior. WINDOWS, LINUX, MACOS, ANDROID

Log file name(ALPHA)

none = no logging, stderr or stdout = to the console, otherwise an absolute or relative path

Setting Properties:

Enable Tracing(ALPHA)

All Vulkan API calls intercepted by the layer will be logged to the console.

Setting Properties:

Enable semaphore log tracing.(ALPHA)

Semaphore events will be logged to console.

Setting Properties:

Synchronize commands(ALPHA)

Add pipeline barriers after instrumented commands.

Setting Properties:

Instrument all commands(ALPHA)

All commands will be instrumented.

Setting Properties:

Track semaphores(ALPHA)

Enable semaphore tracking.

Setting Properties: