Queues
An application submits work to a VkQueue, normally in the form of VkCommandBuffer objects or sparse bindings.
Command buffers submitted to a VkQueue start in order, but are allowed to proceed independently after that and complete out of order.
Command buffers submitted to different queues are unordered relative to each other unless you explicitly synchronize them with a VkSemaphore.
Only one thread at a time may submit work to a given VkQueue, but different threads can submit work to different VkQueues simultaneously.
How a VkQueue is mapped to the underlying hardware is implementation-defined. Some implementations have multiple hardware queues, so work submitted to multiple VkQueues proceeds independently and concurrently. Others do scheduling at the kernel driver level before submitting work to the hardware. Vulkan currently has no way to expose the exact details of how each VkQueue is mapped.
Not all applications will require or benefit from multiple queues. It is reasonable for an application to have a single “universal” graphics supported queue to submit all the work to the GPU.
Queue Family
There are various types of operations a VkQueue can support. A “Queue Family” just describes a set of VkQueues that have common properties and support the same functionality, as advertised in VkQueueFamilyProperties.
The following are the queue operations found in VkQueueFlagBits:
- VK_QUEUE_GRAPHICS_BIT used for vkCmdDraw* and graphics pipeline commands.
- VK_QUEUE_COMPUTE_BIT used for vkCmdDispatch* and vkCmdTraceRays* and compute pipeline related commands.
- VK_QUEUE_TRANSFER_BIT used for all transfer commands.
  - VK_PIPELINE_STAGE_TRANSFER_BIT in the Spec describes “transfer commands”.
  - Queue Families with only VK_QUEUE_TRANSFER_BIT are usually for using DMA to asynchronously transfer data between host and device memory on discrete GPUs, so transfers can be done concurrently with independent graphics/compute operations.
  - VK_QUEUE_GRAPHICS_BIT and VK_QUEUE_COMPUTE_BIT can always implicitly accept VK_QUEUE_TRANSFER_BIT commands.
- VK_QUEUE_SPARSE_BINDING_BIT used for binding sparse resources to memory with vkQueueBindSparse.
- VK_QUEUE_PROTECTED_BIT used for protected memory.
- VK_QUEUE_VIDEO_DECODE_BIT_KHR and VK_QUEUE_VIDEO_ENCODE_BIT_KHR used with Vulkan Video.
Knowing which Queue Family is needed
Each operation in the Vulkan Spec has a “Supported Queue Types” section generated from the vk.xml file. The following are 3 different examples of what it looks like in the Spec:
Querying for Queue Family
The following is the simplest logic needed if an application only wants a single graphics VkQueue:
uint32_t count = 0;
vkGetPhysicalDeviceQueueFamilyProperties(physicalDevice, &count, nullptr);
std::vector<VkQueueFamilyProperties> properties(count);
vkGetPhysicalDeviceQueueFamilyProperties(physicalDevice, &count, properties.data());
// Vulkan requires an implementation to expose at least 1 queue family with graphics
uint32_t graphicsQueueFamilyIndex = 0;
for (uint32_t i = 0; i < count; i++) {
    if ((properties[i].queueFlags & VK_QUEUE_GRAPHICS_BIT) != 0) {
        // This Queue Family supports graphics
        graphicsQueueFamilyIndex = i;
        break;
    }
}
Creating and getting a Queue
Unlike other handles such as VkDevice, VkBuffer, VkDeviceMemory, there is no vkCreateQueue or vkAllocateQueue. Instead, the driver is in charge of creating and destroying the VkQueue handles during vkCreateDevice/vkDestroyDevice time.
The following examples use a hypothetical implementation that supports 3 VkQueues from 2 Queue Families: 1 transfer queue in family 0 and 2 graphics queues in family 1.
The following is an example of how to create all 3 VkQueues with the logical device:
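// Each queue needs a priority in the range [0.0, 1.0]
const float queuePriorities[2] = {1.0f, 1.0f};
VkDeviceQueueCreateInfo queueCreateInfo[2] = {};
queueCreateInfo[0].sType = VK_STRUCTURE_TYPE_DEVICE_QUEUE_CREATE_INFO;
queueCreateInfo[0].queueFamilyIndex = 0; // Transfer
queueCreateInfo[0].queueCount = 1;
queueCreateInfo[0].pQueuePriorities = queuePriorities;
queueCreateInfo[1].sType = VK_STRUCTURE_TYPE_DEVICE_QUEUE_CREATE_INFO;
queueCreateInfo[1].queueFamilyIndex = 1; // Graphics
queueCreateInfo[1].queueCount = 2;
queueCreateInfo[1].pQueuePriorities = queuePriorities;
VkDeviceCreateInfo deviceCreateInfo = {};
deviceCreateInfo.sType = VK_STRUCTURE_TYPE_DEVICE_CREATE_INFO;
deviceCreateInfo.pQueueCreateInfos = queueCreateInfo;
deviceCreateInfo.queueCreateInfoCount = 2;
vkCreateDevice(physicalDevice, &deviceCreateInfo, nullptr, &device);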
After creating the VkDevice, the application can use vkGetDeviceQueue to get the VkQueue handles:
VkQueue graphicsQueue0 = VK_NULL_HANDLE;
VkQueue graphicsQueue1 = VK_NULL_HANDLE;
VkQueue transferQueue0 = VK_NULL_HANDLE;
// Can be obtained in any order
vkGetDeviceQueue(device, 0, 0, &transferQueue0); // family 0 - queue 0
vkGetDeviceQueue(device, 1, 1, &graphicsQueue1); // family 1 - queue 1
vkGetDeviceQueue(device, 1, 0, &graphicsQueue0); // family 1 - queue 0