Code file for this section is 10-init_render_pass.cpp
A render pass describes the scope of a rendering operation by specifying the collection of attachments, subpasses, and dependencies used during the rendering operation. A render pass consists of at least one subpass. The communication of this information to the driver allows the driver to know what to expect when rendering begins and to set up the hardware optimally for the rendering operation.
You begin by using vkCreateRenderPass() to define the render pass, and then later insert a render pass instance into a command buffer using vkCmdBeginRenderPass() and vkCmdEndRenderPass().
In this section, you will only be creating and defining the render pass and not using it in a command buffer until later.
Examples of render pass attachments in the context of this sample include the color attachment, which is the image from the swapchain, and the depth/stencil attachment, which is the depth buffer you allocated in a previous sample.
Image attachments must be prepared for use when they are used as attachments in a render pass instance that is executed in a command buffer. This preparation involves transitioning image layouts from their initial undefined states to states that are optimal for their use in the render pass. Since these layout transitions are rather involved, you will learn about them here, before continuing on with creating the render pass.
The layout of an image refers to how image texels are mapped from a grid coordinate representation to an offset in the image memory. Typically, image data is mapped in this way linearly, which for a 2D image might mean that a row of texels is stored in contiguous memory, with the next row stored contiguously after that row, and so on.
Put another way:
offset = rowCoord * pitch + colCoord
where pitch is the size of a row. The pitch is often the same as the width of the image, but may include some additional bytes of padding to ensure that each row of the image begins at a memory address that meets the alignment requirements of the GPU.
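The formula above can be sketched in code. This is a minimal illustration and not part of the sample: it works in bytes rather than texels, and the helper names (align_up, row_pitch_bytes, linear_offset_bytes) and the power-of-two row alignment are assumptions made for the example. In a real application you would normally query the row pitch of a linear image from the driver (for example with vkGetImageSubresourceLayout()) rather than compute it yourself.

```cpp
#include <cassert>
#include <cstddef>

// Round value up to the next multiple of alignment.
// Assumption: alignment is a nonzero power of two.
inline std::size_t align_up(std::size_t value, std::size_t alignment) {
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (value + alignment - 1) & ~(alignment - 1);
}

// Size of one row in bytes, including any padding needed so that
// every row starts on a row_alignment boundary.
inline std::size_t row_pitch_bytes(std::size_t width_texels,
                                   std::size_t bytes_per_texel,
                                   std::size_t row_alignment) {
    return align_up(width_texels * bytes_per_texel, row_alignment);
}

// Byte offset of the texel at (row, col) in a linear layout:
// offset = rowCoord * pitch + colCoord, expressed in bytes.
inline std::size_t linear_offset_bytes(std::size_t row, std::size_t col,
                                       std::size_t pitch_bytes,
                                       std::size_t bytes_per_texel) {
    assert(col * bytes_per_texel < pitch_bytes);
    return row * pitch_bytes + col * bytes_per_texel;
}
```

For a 100-texel-wide image with 3 bytes per texel and a 64-byte row alignment, row_pitch_bytes(100, 3, 64) gives 320 rather than 300, so each row carries 20 bytes of padding.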
A linear layout is fine for successive texel reads or writes along a single row, by changing the colCoord. But most graphics operations involve accessing texels across multiple adjacent rows as well, by changing the rowCoord. If the image is fairly wide, then these adjacent row accesses introduce rather large jumps across the linear memory address space. This can cause performance problems such as slower memory address translation due to TLB misses, and cache misses in multilevel cache memory systems.
To combat these inefficiencies, many GPU hardware implementations support an "optimal" or tiled memory access scheme. In an optimal layout, a rectangle of texels appearing in the middle of an image is stored in memory such that all the texels are in one continuous stretch of memory. For example, the texels that make up a rectangle with the upper-left corner at [16,32] and lower-right corner at [31,47] might be stored as a 16 x 16 block of texels that is contiguous in memory, starting at one address. There are no long gaps between rows.
If the GPU wants to fill this block, with a solid color for example, it can write to this 256-texel block with little memory system overhead.
Here's an example of a simple 2x2 tiling scheme. Note that the blue texels can be far apart from each other in the linear scheme, while they are adjacent in the tiled scheme.
Most implementations use a more complex tiling pattern and tile sizes that are larger than 2x2.
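To make the difference concrete, here is a small sketch that compares texel indices in a linear layout with those in a simple square-tile layout. It is an illustration only: real optimal layouts are implementation-specific and usually opaque, and the function names and the requirement that the width be a multiple of the tile size are assumptions made for the example.

```cpp
#include <cassert>
#include <cstddef>

// Texel index in a plain row-major (linear) layout.
inline std::size_t linear_index(std::size_t row, std::size_t col,
                                std::size_t width) {
    assert(col < width);
    return row * width + col;
}

// Texel index in a simple tiled layout made of tile x tile blocks.
// Tiles are stored one after another in row-major order, and the texels
// inside each tile are also stored in row-major order.
// Assumption: width is a nonzero multiple of tile.
inline std::size_t tiled_index(std::size_t row, std::size_t col,
                               std::size_t width, std::size_t tile) {
    assert(tile != 0 && width % tile == 0 && col < width);
    const std::size_t tiles_per_row = width / tile;
    const std::size_t tile_number = (row / tile) * tiles_per_row + (col / tile);
    const std::size_t within_tile = (row % tile) * tile + (col % tile);
    return tile_number * tile * tile + within_tile;
}
```

With a width of 8 and 2x2 tiles, the vertically adjacent texels (0,0) and (1,0) are 8 texels apart in the linear layout (indices 0 and 8) but only 2 apart in the tiled one (indices 0 and 2). The gap in the linear case grows with the image width, while in the tiled case it depends only on the tile size.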
GPU hardware often favors optimal layouts for more efficient rendering, as explained just above. Optimal layouts are often "opaque", which means that the details of the optimal layout format are not published or made known to other components that need to read or write image data.
For example, you may want the GPU to render to an image using an optimal layout. But if you wish to copy the resulting image to a buffer that you can read and understand using the CPU, you change the layout from optimal to general before trying to read it.
Conversions from one layout to another are called layout transitions in Vulkan. You have control over these transitions and can invoke them in one of three ways:
1. Memory barrier command (via vkCmdPipelineBarrier)
2. Render pass final layout specification
3. Render pass subpass layout specification
The memory barrier command is an explicit layout transition command that you place in a command buffer. You would use this command, for example, to synchronize memory accesses in more complex situations. Since you will be using the other two methods to perform layout transitions here, you won't be using this barrier command in this tutorial.
The more common situation is the need to perform layout transitions on images used for rendering before rendering begins and after rendering is complete. The first transition prepares the image for rendering by the GPU and the last one prepares the image for presentation to the display. In these cases, you specify the layout transitions as part of the render pass definition. You will see how to do this later in this section.
A layout transition may or may not trigger an actual GPU layout conversion operation. For example, if the old layout is undefined and the new layout is optimal, the GPU would have to do no work other than perhaps program the GPU hardware to access memory in the optimal pattern. This is because the contents of the image are undefined and do not need to be preserved by converting them. On the other hand, if the old layout was general (non-optimal) and there is an indication that the data in the image needs to be preserved, the transition to optimal probably involves some work by the GPU to shuffle the texels around.
Even if you know or think that a layout transition won't actually do any work, it is always best practice to perform it anyway, since it gives the driver more information and helps ensure that your application works on more devices.
The sample code uses subpass definitions and render pass definitions to specify the required image layout transitions, instead of using memory barrier commands.
The initial render pass layout is undefined, which means "don't care": when the render pass starts, you don't care what is already in the image because you are going to draw over it anyway. Here, you are just telling the driver what layout is in effect for the image at the start of the render pass. The driver won't perform any transition until the subpass begins or the render pass ends.
The subpass layout is set to optimal for the color buffer, which indicates that the driver should transition the layout to optimal for the duration of the rendering operations that occur during the subpass. The sample code sets a similar setting for the depth buffer.
The final render pass layout of present tells the driver to transition the layout to the best layout for presenting to the display.
Now that you know how to get the image layouts into the correct state, you can proceed with defining the rest of the render pass.
There are two attachments, one for color and one for depth:
VkAttachmentDescription attachments[2];
attachments[0].format = info.format;
attachments[0].samples = NUM_SAMPLES;
attachments[0].loadOp = VK_ATTACHMENT_LOAD_OP_CLEAR;
attachments[0].storeOp = VK_ATTACHMENT_STORE_OP_STORE;
attachments[0].stencilLoadOp = VK_ATTACHMENT_LOAD_OP_DONT_CARE;
attachments[0].stencilStoreOp = VK_ATTACHMENT_STORE_OP_DONT_CARE;
attachments[0].initialLayout = VK_IMAGE_LAYOUT_UNDEFINED;
attachments[0].finalLayout = VK_IMAGE_LAYOUT_PRESENT_SRC_KHR;
attachments[0].flags = 0;
attachments[1].format = info.depth.format;
attachments[1].samples = NUM_SAMPLES;
attachments[1].loadOp = VK_ATTACHMENT_LOAD_OP_CLEAR;
attachments[1].storeOp = VK_ATTACHMENT_STORE_OP_DONT_CARE;
attachments[1].stencilLoadOp = VK_ATTACHMENT_LOAD_OP_DONT_CARE;
attachments[1].stencilStoreOp = VK_ATTACHMENT_STORE_OP_DONT_CARE;
attachments[1].initialLayout = VK_IMAGE_LAYOUT_UNDEFINED;
attachments[1].finalLayout = VK_IMAGE_LAYOUT_DEPTH_STENCIL_ATTACHMENT_OPTIMAL;
attachments[1].flags = 0;
Setting the loadOp member to CLEAR in both attachments indicates that you want the buffer to be cleared at the start of the render pass instance.
Setting the storeOp member to STORE for the color attachment means that you want to leave the rendering result in this buffer, so it can be presented to the display.
Setting the storeOp member to DONT_CARE for the depth attachment means that you don't need the contents of the buffer when the render pass instance is complete. Telling the driver that you don't care about the contents of a buffer after it is used can be useful because it allows the driver to discard or page out that memory if it needs to without saving the contents.
For image layouts, you specify both the color and depth buffers to start with undefined layouts, as discussed earlier.
The subpass, which occurs between the initial and final layouts, takes the color attachment to VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL and the depth attachment to VK_IMAGE_LAYOUT_DEPTH_STENCIL_ATTACHMENT_OPTIMAL.
For the color attachment, you specify the final layout to be the VK_IMAGE_LAYOUT_PRESENT_SRC_KHR layout, which is the one that is appropriate for the present operation that occurs after the render pass is complete. You can leave the depth layout the same as the layout set by the subpass since the depth buffer is not used as part of the present operation.
The subpass definition is straightforward and would be more interesting if you were doing multiple subpasses. And you might be interested in doing multiple subpasses if you are doing some pre-processing or post-processing of your graphics data, perhaps for ambient occlusion or some other effect. But here, the subpass definition is useful for indicating which attachments are active during the subpass and also for specifying the layouts to be used while rendering during the subpass.
VkAttachmentReference color_reference = {};
color_reference.attachment = 0;
color_reference.layout = VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL;
VkAttachmentReference depth_reference = {};
depth_reference.attachment = 1;
depth_reference.layout = VK_IMAGE_LAYOUT_DEPTH_STENCIL_ATTACHMENT_OPTIMAL;
The attachment member is the index of the attachment in the attachments array you just defined above for the render pass.
VkSubpassDescription subpass = {};
subpass.pipelineBindPoint = VK_PIPELINE_BIND_POINT_GRAPHICS;
subpass.flags = 0;
subpass.inputAttachmentCount = 0;
subpass.pInputAttachments = NULL;
subpass.colorAttachmentCount = 1;
subpass.pColorAttachments = &color_reference;
subpass.pResolveAttachments = NULL;
subpass.pDepthStencilAttachment = &depth_reference;
subpass.preserveAttachmentCount = 0;
subpass.pPreserveAttachments = NULL;
The pipelineBindPoint member is meant to indicate if this is a graphics or a compute subpass. Currently, only the graphics subpass is valid.
You are asking the render pass to transition the image layout from UNDEFINED to COLOR_ATTACHMENT_OPTIMAL. You will be using a semaphore that signals at the COLOR_ATTACHMENT_OUTPUT stage to indicate that the presentation engine has made the swapchain image available, so you want to create a dependency on that stage before beginning the subpass.
VkSubpassDependency subpass_dependency = {};
subpass_dependency.srcSubpass = VK_SUBPASS_EXTERNAL;
subpass_dependency.dstSubpass = 0;
subpass_dependency.srcStageMask = VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT;
subpass_dependency.dstStageMask = VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT;
subpass_dependency.srcAccessMask = 0;
subpass_dependency.dstAccessMask = VK_ACCESS_COLOR_ATTACHMENT_WRITE_BIT;
subpass_dependency.dependencyFlags = 0;
Now you have all you need to define the render pass:
VkRenderPassCreateInfo rp_info = {};
rp_info.sType = VK_STRUCTURE_TYPE_RENDER_PASS_CREATE_INFO;
rp_info.pNext = NULL;
rp_info.attachmentCount = 2;
rp_info.pAttachments = attachments;
rp_info.subpassCount = 1;
rp_info.pSubpasses = &subpass;
rp_info.dependencyCount = 1;
rp_info.pDependencies = &subpass_dependency;
res = vkCreateRenderPass(info.device, &rp_info, NULL, &info.render_pass);
You'll be using the render pass in several of the upcoming samples.