In my previous post I provided an introduction to the heap data structure and described how one of its main uses is in building a priority queue, such as the .NET type `PriorityQueue`. In this post, I look at the implementation of `PriorityQueue`, explore how it uses the underlying array as a heap data structure, and discuss the complexity of the implementation.

## Background: The heap data structure and the priority queue

In my previous post I described the heap data structure. The heap is typically represented as a tree that satisfies the *heap property*. For a min-heap, this property states: for every node in the tree, the node's value is less than (or equal to) the value of its children.

The heap data structure is often implemented using an array, in which the relationship between nodes is encoded in the *location* of nodes in the array. For example, the following shows a min-heap tree, and how this could be represented in memory as an array.

The heap data structure is the common implementation approach for a *priority queue*. A priority queue allows you to enqueue elements in any order, but dequeue them in order of minimum priority (or maximum priority, if using a max-heap).
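
To make that behaviour concrete, here's a quick sketch using Python's built-in `heapq` module (a binary min-heap rather than the quaternary heap .NET uses, but the enqueue-in-any-order, dequeue-in-priority-order behaviour is the same):

```python
import heapq

# Enqueue (priority, element) pairs in arbitrary order...
heap = []
for priority, element in [(5, "e"), (1, "a"), (3, "c"), (2, "b")]:
    heapq.heappush(heap, (priority, element))

# ...but dequeuing always yields the smallest priority first.
dequeued = [heapq.heappop(heap)[1] for _ in range(len(heap))]
print(dequeued)  # ['a', 'b', 'c', 'e']
```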

The `PriorityQueue` type introduced in .NET 6 uses an array-backed min-heap in its implementation. However, rather than the binary heap data structure shown in the image above, `PriorityQueue` uses a d-ary heap with arity 4 (i.e. a quaternary heap), representing a structure similar to the following:

At the end of the previous post, I described the various methods available on the `PriorityQueue` type. For the rest of this post we'll look at how some of these methods are implemented, how they map to standard heap operations, and the big *O* complexity of their implementation.

## Looking inside the `PriorityQueue` implementation

The `PriorityQueue` type has over 1,000 lines of code in the latest .NET 9 preview, so I'm not going to be reviewing *all* that code in this post. Instead I focus on some of the most interesting methods that relate to standard heap operations. We'll start by understanding the basics of the `PriorityQueue`, then we'll look closer at the underlying array, and finally we'll look at how `PriorityQueue` implements some of the standard heap operations.

### The basics of `PriorityQueue`

The first thing to appreciate is that `PriorityQueue<TElement, TPriority>` is a generic type with two generic parameters:

- `TPriority`—The priority of an element. The priority is the only part the heap data structure "cares" about; ordering and all operations are based on this value.
- `TElement`—The element associated with each priority. This is generally going to be the data *you* care about.
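
In other words, ordering depends only on `TPriority`; the element is opaque payload and never needs to be comparable. A small Python sketch of the same split (the tie-breaking counter and the dictionary elements are illustrative choices, not part of the .NET API):

```python
import heapq

# Each entry is (priority, counter, element): only the priority (plus a
# tie-breaking counter) participates in ordering, so the element can be
# any type at all, even one that doesn't support comparison.
heap = []
items = [({"id": 7}, 2), ({"id": 3}, 1)]
for counter, (element, priority) in enumerate(items):
    heapq.heappush(heap, (priority, counter, element))

first = heapq.heappop(heap)[2]
print(first)  # {'id': 3} -- dequeued first because its priority (1) is smallest
```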

With that in mind, at the heart of the `PriorityQueue` are three fields:

```
/// <summary>
/// Represents an implicit heap-ordered complete d-ary tree, stored as an array.
/// </summary>
private (TElement Element, TPriority Priority)[] _nodes;
/// <summary>
/// Custom comparer used to order the heap.
/// </summary>
private readonly IComparer<TPriority>? _comparer;
/// <summary>
/// The number of nodes in the heap.
/// </summary>
private int _size;
```

- `_nodes` is an array of element-priority pairs. This is the heap data structure.
- `_comparer` is an optional custom comparer for comparing `TPriority` values. If you don't provide a custom comparer (and `TPriority` is a value type), then `PriorityQueue` uses an optimised code path.
- `_size` is the number of elements stored in the `_nodes` heap. Note that `_nodes.Length >= _size`, because the `_nodes` array includes "space" for adding new elements, to avoid needing to resize the array too often (which is relatively expensive).

In addition to these fields, there are a couple of fields (and nested types) that are used to provide access to the underlying heap without going through the standard `Peek` or `Dequeue` methods. For most "normal" operations you won't need to access the nodes directly like this, so I'm going to largely ignore them in this post:

```
/// <summary>
/// Lazily-initialized collection used to expose the contents of the queue.
/// </summary>
private UnorderedItemsCollection? _unorderedItems;
/// <summary>
/// Version updated on mutation to help validate enumerators operate on a consistent state.
/// </summary>
private int _version;
```

Finally, we have a couple of constants that define the key characteristics of the heap:

```
/// <summary>
/// Specifies the arity of the d-ary heap, which here is quaternary.
/// It is assumed that this value is a power of 2.
/// </summary>
private const int Arity = 4;
/// <summary>
/// The binary logarithm of <see cref="Arity" />.
/// </summary>
private const int Log2Arity = 2;
```

In case it's not clear, `Log2Arity` is $\log_2$ of `Arity`.

That gives us all the basics of the `PriorityQueue`. Before we start looking at the operations available, we'll look a little closer at the heap data structure.

### Understanding element relationships based on array index

In the previous post I showed a diagram of how a quaternary heap is laid out in the heap array. The following shows a similar diagram, reproduced below with array indices:

Looking at the diagram above, you may be able to get a sense of the relationship between a child node and its parent node. First of all, you can see that the root node is always at index `0`. You can find the parent index, $p$, of a node at index $i$ using the formula $p=\lfloor\frac{i-1}{d}\rfloor$ (for any node where $i>0$, i.e. any node except the root node). In the `PriorityQueue` type you can see that formula encoded in the `GetParentIndex()` method:

```
/// <summary>
/// Gets the index of an element's parent.
/// </summary>
private static int GetParentIndex(int index) => (index - 1) >> Log2Arity;
```

This code uses the fact that the `PriorityQueue` heap is optimised for arity values that are powers of 2. Bit-shifting to the right by 1 (`>> 1`) is the same as dividing by 2, so bit-shifting to the right by $\log_2 d$ is the same as dividing by $d$.

Similarly, you can find the index of the first child node, $c$, using the formula $c=di+1$. This formula is again encoded in `PriorityQueue` as the `GetFirstChildIndex()` method:

```
/// <summary>
/// Gets the index of the first child of an element.
/// </summary>
private static int GetFirstChildIndex(int index) => (index << Log2Arity) + 1;
```

Analogous to the dividing code, this code uses a left bit-shift (`<<`) by $\log_2 d$ to multiply by $d$.
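
As a sanity check, both formulas and their bit-shift equivalents can be verified in a few lines of Python (`parent_index` and `first_child_index` are stand-ins for the .NET methods, not the real implementation):

```python
ARITY = 4
LOG2_ARITY = 2  # log2 of the arity

def parent_index(i):
    return (i - 1) >> LOG2_ARITY  # equivalent to (i - 1) // ARITY

def first_child_index(i):
    return (i << LOG2_ARITY) + 1  # equivalent to ARITY * i + 1

# The shift forms agree with plain integer division/multiplication...
assert all(parent_index(i) == (i - 1) // ARITY for i in range(1, 1000))
assert all(first_child_index(i) == ARITY * i + 1 for i in range(1000))

# ...and every child index maps back to its parent.
for p in range(50):
    first = first_child_index(p)
    assert all(parent_index(c) == p for c in range(first, first + ARITY))

print(parent_index(7), first_child_index(2))  # 1 9
```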

With the node positions in the array in mind, we'll now take a look at some of the common priority queue operations.

### Dequeuing nodes

One of the simplest operations on a min-heap is `find-min`, in which you retrieve the minimum value from the heap but don't remove it. In `PriorityQueue`, that's implemented in the `Peek` method, and it's about as simple as you might expect:

```
public TElement Peek()
{
    if (_size == 0)
    {
        // if the queue is empty, there is no root element
        throw new InvalidOperationException();
    }

    return _nodes[0].Element; // return the root element
}
```

The complexity of this method is evidently $O\left(1\right)$: it doesn't depend on the size of the array, so the root element is always returned in constant time.

That was the easy one; now let's look at actually *removing* the root node. When you add or remove a node you need to "rebalance" the heap to make sure it still satisfies the heap property. The following code shows the outline of the `Dequeue()` method (commonly called `delete-min` on a heap). I've inlined some methods, removed some minutiae, and added some comments below:

```
public TElement Dequeue()
{
    if (_size == 0)
    {
        // Can't dequeue if the queue is empty
        throw new InvalidOperationException();
    }

    // The element to return once the heap is rebalanced
    TElement element = _nodes[0].Element;

    // Decrease the size by one, and grab the index of the last node
    int lastNodeIndex = --_size;

    // If lastNodeIndex == 0, then there's only one node left, so
    // there's no more work to do
    if (lastNodeIndex > 0)
    {
        // Grab the last node, effectively swap it into the "root"
        // node location, and then do a "sift-down" operation to
        // rebalance the tree and move the node to the correct location
        (TElement Element, TPriority Priority) lastNode = _nodes[lastNodeIndex];
        if (_comparer == null)
        {
            // Use the default comparer to do a "sift-down" (shown shortly)
            MoveDownDefaultComparer(lastNode, 0);
        }
        else
        {
            // Use the custom comparer to do a "sift-down"
            MoveDownCustomComparer(lastNode, 0);
        }
    }

    return element;
}
```

The above code shows that the root node is removed and then the *final* node is moved into its place, before rebalancing by moving the node back to the correct location, as you'll see shortly.

You can also see that there are two methods for performing the "sift-down" operation: one that uses the default comparer (when `_comparer == null`) and one that uses a custom comparer. The code paths are essentially identical, but the JIT can apply some extra optimizations with the default comparer. For simplicity, I'm only going to show the methods that use the default comparer for the rest of the post.

The following code shows how the `MoveDownDefaultComparer` method effectively swaps nodes until it finds the correct location for the new node.

```
private void MoveDownDefaultComparer((TElement Element, TPriority Priority) node, int nodeIndex)
{
    // The node to move down will not actually be swapped every time.
    // Rather, values on the affected path will be moved up, thus leaving a free spot
    // for this value to drop in. Similar optimization as in the insertion sort.
    (TElement Element, TPriority Priority)[] nodes = _nodes;
    int size = _size;

    int i;
    // using the formula described previously, step down to the first child
    while ((i = GetFirstChildIndex(nodeIndex)) < size)
    {
        // loop through all the "Arity" number of potential children
        // and find the node with the smallest priority
        (TElement Element, TPriority Priority) minChild = nodes[i];
        int minChildIndex = i;

        int childIndexUpperBound = Math.Min(i + Arity, size);
        while (++i < childIndexUpperBound)
        {
            (TElement Element, TPriority Priority) nextChild = nodes[i];
            if (Comparer<TPriority>.Default.Compare(nextChild.Priority, minChild.Priority) < 0)
            {
                // this is the smallest of the child nodes under the parent
                minChild = nextChild;
                minChildIndex = i;
            }
        }

        // If the node we're trying to insert has a smaller priority than
        // the priority of the smallest child node, then the
        // heap property is satisfied; insert the node in this location.
        if (Comparer<TPriority>.Default.Compare(node.Priority, minChild.Priority) <= 0)
        {
            // we don't need to search any more
            break;
        }

        // Move the minimal child up by one node and
        // continue recursively from its location.
        nodes[nodeIndex] = minChild;
        nodeIndex = minChildIndex;
    }

    // we've found the final location for the node, insert it here
    nodes[nodeIndex] = node;
}
```

The execution of this method may be tricky to visualize, so the following diagram shows the execution of the algorithm and how the nodes are moved.

If you work through this algorithm, working out the number of comparisons given the number of nodes, $n$, and the arity of the heap, $d$, you can find that the complexity of `delete-min` is $O\left(d\frac{\log n}{\log d}\right)$. For the `PriorityQueue`, where $d=4$, this simplifies to $O\left(2\log n\right)$.
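
To see the whole flow end to end, here's a compact Python model of the same sift-down-based `delete-min` (priorities only, no elements; `sift_down` and `delete_min` are hypothetical names, not the .NET API):

```python
ARITY = 4

def sift_down(nodes, node, i):
    # Move the hole at index i down, pulling the smallest child up each
    # level, until `node` can be dropped in without violating the heap property.
    size = len(nodes)
    while (first := ARITY * i + 1) < size:
        min_idx = min(range(first, min(first + ARITY, size)), key=nodes.__getitem__)
        if node <= nodes[min_idx]:
            break
        nodes[i] = nodes[min_idx]  # move the smallest child up one level
        i = min_idx
    nodes[i] = node

def delete_min(nodes):
    root = nodes[0]
    last = nodes.pop()  # shrink the heap by one node
    if nodes:
        sift_down(nodes, last, 0)  # re-place the last node from the root down
    return root

heap = [1, 3, 2, 8, 5, 4, 9, 7, 6]  # a valid quaternary min-heap
out = [delete_min(heap) for _ in range(len(heap))]
print(out)  # [1, 2, 3, 4, 5, 6, 7, 8, 9]
```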

In the final section of this post, we'll look at the opposite operation, inserting a new node.

### Enqueuing nodes

When inserting a new node, you could imagine two main approaches:

- Insert the new node at the root, and move the displaced root node down
- Insert the new node at the end, and move it up as required

The typical approach is the latter, and that's what the `Enqueue` method of `PriorityQueue` does, as shown below:

```
public void Enqueue(TElement element, TPriority priority)
{
    // Virtually add the node at the end of the underlying array.
    // Note that the node being enqueued does not need to be physically placed
    // there at this point, as such an assignment would be redundant.
    int currentSize = _size;

    // increase the size of the array if required
    if (_nodes.Length == currentSize)
    {
        Grow(currentSize + 1);
    }

    _size = currentSize + 1;

    // Do a "sift-up" operation to rebalance the tree and
    // move the node to the correct location in the heap
    if (_comparer == null)
    {
        // Use the default comparer to do a "sift-up" (shown shortly)
        MoveUpDefaultComparer((element, priority), currentSize);
    }
    else
    {
        // Use the custom comparer to do a "sift-up"
        MoveUpCustomComparer((element, priority), currentSize);
    }
}
```

The above code shows that:

- The heap ensures the array is sufficiently large to hold the new node
- The node is "virtually" placed at the end
- The `MoveUpDefaultComparer` (or custom comparer version) moves the node to the correct location

The `MoveUpDefaultComparer` method, which performs the operation called `sift-up` on a heap, is shown below:

```
private void MoveUpDefaultComparer((TElement Element, TPriority Priority) node, int nodeIndex)
{
    // Instead of swapping items all the way to the root, we will perform
    // a similar optimization as in the insertion sort.
    (TElement Element, TPriority Priority)[] nodes = _nodes;

    while (nodeIndex > 0)
    {
        // Get the parent node of the current index
        int parentIndex = GetParentIndex(nodeIndex);
        (TElement Element, TPriority Priority) parent = nodes[parentIndex];

        // if the new node's priority is smaller than the parent's...
        if (Comparer<TPriority>.Default.Compare(node.Priority, parent.Priority) < 0)
        {
            // ...then move the parent down to the child slot
            nodes[nodeIndex] = parent;
            // ...and try again
            nodeIndex = parentIndex;
        }
        else
        {
            // ...otherwise, we've found the correct spot.
            break;
        }
    }

    // Finally, put the node in the correct slot
    nodes[nodeIndex] = node;
}
```

As before, the following diagram shows how this plays out in an example:

The sift-up (`Enqueue`) algorithm is somewhat simpler than the sift-down (`Dequeue`) algorithm, because you only need to compare against each parent, not against all the siblings. That leads to a complexity of $O\left(\frac{\log n}{\log d}\right)$, which for the `PriorityQueue`, where $d=4$, simplifies to $O\left(\frac{\log n}{2}\right)$.
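
The sift-up logic is small enough to model in a few lines of Python too (priorities only; `enqueue` here is a hypothetical standalone function, not the .NET method):

```python
ARITY = 4

def enqueue(nodes, priority):
    # "Virtually" place the new node at the end, then sift up:
    # move parents down until the new priority fits.
    i = len(nodes)
    nodes.append(None)  # reserve the new slot
    while i > 0:
        parent = (i - 1) // ARITY
        if priority < nodes[parent]:
            nodes[i] = nodes[parent]  # move the parent down one level
            i = parent
        else:
            break
    nodes[i] = priority

heap = []
for p in [5, 1, 4, 2, 3]:
    enqueue(heap, p)
print(heap[0])  # 1 -- the minimum is always at the root
```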

We've covered the most important operations on an established heap, so the final operation we'll look at in this post is creating the heap out of the initial set of unordered elements.

### Creating the initial heap

The process of converting an unsorted array of elements into a heap is sometimes called `heapify`, and in the .NET `PriorityQueue` the method is similarly called `Heapify()`. It's called as part of the `PriorityQueue` constructors and when you call `EnqueueRange()` if the queue is currently empty. For example:

```
public PriorityQueue(IEnumerable<(TElement Element, TPriority Priority)> items, IComparer<TPriority>? comparer)
{
    _nodes = EnumerableHelpers.ToArray(items, out _size);
    _comparer = InitializeComparer(comparer);

    if (_size > 1)
    {
        Heapify();
    }
}
```

The `Heapify()` method itself performs successive `sift-down` operations to create a valid heap out of the unsorted elements. Note that creating the heap doesn't require moving *all* the elements—it just has to ensure that the parent nodes respect the heap property—so heapify performs a `sift-down` procedure on each of the nodes that have children.

```
private void Heapify()
{
    // Leaves of the tree are in fact 1-element heaps, for which there
    // is no need to correct them. The heap property needs to be restored
    // only for higher nodes, starting from the first node that has children.
    // It is the parent of the very last element in the array.
    (TElement Element, TPriority Priority)[] nodes = _nodes;
    int lastParentWithChildren = GetParentIndex(_size - 1);

    if (_comparer == null)
    {
        for (int index = lastParentWithChildren; index >= 0; --index)
        {
            MoveDownDefaultComparer(nodes[index], index);
        }
    }
    else
    {
        for (int index = lastParentWithChildren; index >= 0; --index)
        {
            MoveDownCustomComparer(nodes[index], index);
        }
    }
}
```

We've already looked at the `MoveDownDefaultComparer()` method as part of the `Dequeue()` function, so the following diagram shows how `Heapify` builds a heap out of the unsorted array of elements:

Working out the complexity of the heapify operation is a little involved, but it ultimately comes out at roughly $O\left(n\right)$: most nodes sit near the bottom of the tree and only need to sift down a short distance, and summing that work across all levels gives a linear bound.
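
To tie it together, here's a Python sketch of the same bottom-up construction, using a standalone sift-down (hypothetical helpers that mirror the structure of the .NET code, not the real implementation):

```python
ARITY = 4

def sift_down(nodes, node, i):
    # Standard d-ary sift-down: pull the smallest child up until `node` fits.
    size = len(nodes)
    while (first := ARITY * i + 1) < size:
        min_idx = min(range(first, min(first + ARITY, size)), key=nodes.__getitem__)
        if node <= nodes[min_idx]:
            break
        nodes[i] = nodes[min_idx]
        i = min_idx
    nodes[i] = node

def heapify(nodes):
    # Leaves are already 1-element heaps; start from the last parent
    # (the parent of the very last element) and work back to the root.
    last_parent = (len(nodes) - 2) // ARITY
    for i in range(last_parent, -1, -1):
        sift_down(nodes, nodes[i], i)

nodes = [9, 7, 5, 3, 1, 8, 6, 4, 2]
heapify(nodes)
print(nodes[0])  # 1

# Every node now satisfies the min-heap property relative to its parent.
assert all(nodes[(i - 1) // ARITY] <= nodes[i] for i in range(1, len(nodes)))
```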

We've now covered `Peek()`, `Enqueue()`, `Dequeue()` and `Heapify()`, so we've looked at most of the important methods on `PriorityQueue`. Most of the other methods available use variations of these implementations, so I'm not going to dig into any more details in this post. In the next post, rather than looking at the implementation of `PriorityQueue`, we'll look at how you can use it in other tasks.

## Summary

In this post I looked at how the `PriorityQueue` type that was introduced in .NET 6 is implemented. I showed how it uses a heap (an array where the location of elements indicates their relationships) to implement its functionality. We looked in detail at the `Peek()`, `Enqueue()`, and `Dequeue()` methods, provided a visual example of each method executing, and discussed the big *O* complexity of the methods. Finally, we looked at how `PriorityQueue` creates a heap from an unsorted collection of elements, implemented as the `Heapify()` method, which is called when the queue is first created or when `EnqueueRange()` is called on an empty queue. In the next post, we'll look at how you can use `PriorityQueue` in other tasks.